Affiliations
Department of Microbiology, Immunology and Pathology, College of Veterinary Medicine and Biomedical Sciences, Colorado State University, Fort Collins, Colorado, United States of America,
Division of Vector-Borne Diseases, Centers for Disease Control and Prevention, Fort Collins, Colorado, United States of America

Abstract

Within hosts, RNA viruses form populations that are genetically and phenotypically complex. Heterogeneity in RNA virus genomes arises due to error-prone replication and is reduced by stochastic and selective mechanisms that are incompletely understood. Defining how natural selection shapes RNA virus populations is critical because it can inform treatment paradigms and enhance control efforts. We allowed West Nile virus (WNV) to replicate in wild-caught American crows, house sparrows and American robins to assess how natural selection shapes RNA virus populations in ecologically relevant hosts that differ in susceptibility to virus-induced mortality. After five sequential passages in each bird species, we examined the phenotype and population diversity of WNV through fitness competition assays and next generation sequencing. We demonstrate that fitness gains occur in a species-specific manner, with the greatest replicative fitness gains in robin-passaged WNV and the least in WNV passaged in crows. Sequencing data revealed that intrahost WNV populations were strongly influenced by purifying selection and the overall complexity of the viral populations was similar among passaged hosts. However, the selective pressures that control WNV populations seem to be bird species-dependent. Specifically, crow-passaged WNV populations contained the most unique mutations (~1.7× more than sparrows, ~3.4× more than robins) and defective genomes (~1.4× greater than sparrows, ~2.7× greater than robins), but the lowest average mutation frequency (about equal to sparrows, ~2.6× lower than robins). Therefore, our data suggest that WNV replication in the most disease-susceptible bird species is positively associated with virus mutational tolerance, likely via complementation, and negatively associated with the strength of selection. These differences in genetic composition most likely have distinct phenotypic consequences for the virus populations. Taken together, these results reveal important insights into how different hosts may contribute to the emergence of RNA viruses.

Author Summary

Viruses are constantly emerging into new areas and pose significant challenges to public health. Chikungunya and West Nile viruses (WNV), both mosquito-borne RNA viruses, are quintessential examples of how increased globalization has facilitated the expansion of viruses into new territories. Rapid evolution of both of these agents has contributed to their rapid spread and health burden. Thus, characterizing how selection shapes zoonotic RNA viruses in their natural hosts is important to understand their emergence. As an ecological generalist able to infect hundreds of bird species, WNV is an excellent tool to study how different animal hosts can differentially drive virus evolution. We examined the genetic composition and fitness of WNV produced during replication in wild-caught American crows, house sparrows and American robins, species that range in mortality following WNV infection (crows the highest, robins the lowest). We demonstrate host-dependent effects on WNV population structure and fitness. Our study provides insights on how different virus-animal interactions can influence the success of a virus in the next host and ultimately the success of virus emergence into new host systems.

This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Data Availability: All sequencing data have been deposited in the NCBI Short Read Archive database and can be accessed using the BioProject alias PRJNA281547 "West Nile virus raw sequence reads from experimentally infected wild-caught birds" (accession no. SRP57419).

Funding: This work was funded by the National Institute of Allergy and Infectious Diseases, National Institutes of Health, under grant number AI067380 (GDE). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

RNA viruses pose some of the most complex, persistent and challenging problems facing public health and medicine. The ongoing outbreaks of avian influenza A(H7N9) virus (Orthomyxoviridae) in China [1], Ebola virus (Filoviridae) in West Africa [2], and chikungunya virus (CHIKV, Togaviridae, Alphavirus) and West Nile virus (WNV, Flaviviridae, Flavivirus) in the Americas [3,4] highlight the health and societal impacts imposed by RNA virus-induced diseases. Several factors contribute to the emergence of these agents and the continued burdens they impose on human health. Among these is their ability to undergo rapid evolution in new and/or changing environments. Well-documented examples of RNA virus evolution leading to increased virus transmission include WNV and CHIKV. In both cases, small, conservative amino acid substitutions (residues with similar physicochemical properties) to the viral envelope proteins resulted in more efficient transmission by mosquito vectors [5,6]. Adaptive changes to RNA virus genomes first arise as minority components within a genetically complex population of related but non-identical virus variants. The genetic diversity present in naturally occurring RNA virus populations has been clearly shown through a large and expanding body of observational and experimental studies to be critical to their biology. For example, several studies have demonstrated that the diversity of an intrahost viral population, rather than the fitness of individual variants, correlates with pathogenesis, disease progression and therapeutic outcome [7–9]. Moreover, RNA viruses have the capacity for rapid evolutionary change because, within infected hosts, all single nucleotide mutations may be generated.

This has been particularly clear in the case of WNV, an arthropod-borne virus (arbovirus) that persists in nature in enzootic cycles between ornithophilic mosquitoes (mainly Culex spp.) and birds. After its initial identification in the New York City area in 1999, WNV spread throughout the continental United States, producing the largest outbreaks of flaviviral encephalitis ever recorded in North America. The explosive spread of the virus was accompanied by the displacement of the introduced genotype by a derived strain that is more efficiently transmitted by local Culex mosquitoes [10]. Studies of intrahost population dynamics of WNV demonstrated that genetic diversity is greater in mosquitoes than in birds [11]. The selective basis for the host-specific patterns of WNV genetic diversity is that the strong purifying selection that predominates in birds is relaxed in mosquitoes [11,12]. In addition, the RNA interference-based antiviral response in mosquitoes creates an environment where negative frequency-dependent selection may drive rare variants to higher population frequency [13]. Moreover, WNV maintains both adaptive plasticity and high fitness by alternating between hosts that impose different selective forces on the virus population [14].

Nonetheless, important gaps remain in our understanding of how error-prone replication interacts with selective and stochastic reductions in viral genetic diversity under natural conditions. This is particularly the case for arboviruses, which tend to cause acute infection in vertebrates, with transmission occurring before the development of a neutralizing antibody response. Therefore, well-described mechanisms of immune selection such as those that occur during chronic hepatitis C and human immunodeficiency virus infections are comparatively weak during acute arbovirus infection of vertebrates. Thus, the ways that ecologically relevant, natural hosts can influence arbovirus genetic diversity remain poorly understood. WNV in particular provides an excellent experimental system to study the influences of natural vertebrate hosts on viral evolution. The virus infects a large number of wild bird species [15] with a wide range of infection outcomes [16]. In addition, several studies have provided evidence that particular WNV variants may arise through adaptation to birds [17,18].

Therefore, we sought to determine whether different wild bird species may have distinct impacts on WNV population structure. Specifically, we allowed WNV to replicate in wild-caught American crows (Corvus brachyrhynchos), house sparrows (Passer domesticus), and American robins (Turdus migratorius), bypassing the mosquito portion of the arbovirus cycle in order to focus on the impact of different vertebrate environments on virus populations during acute infection. Virus was passaged in individuals of each species five times in order to amplify host-specific patterns of selection that may remain cryptic after a single passage. Bird species were selected on the basis of ecological relevance and resistance to WNV-induced mortality. American crows experience high viremia and mortality following inoculation with WNV [19] and can directly transmit virus to roost mates without mosquito involvement [20]; house sparrows experience high viremia and intermediate mortality [21] and are frequently involved in WNV perpetuation [22]; and American robins experience intermediate viremia but very low mortality [23] and can be drivers for human WNV risk [24]. Virus populations were characterized using next generation sequencing (NGS) and through in vivo fitness competition studies in birds and mosquitoes. Our findings demonstrate that relevant vertebrate hosts with varying levels of disease susceptibility differentially shape WNV population structure with direct impacts on fitness during host shifts.

Results

Virus passage and phenotypic assessment

The WNV used in these studies was derived from an infectious clone of the NY99 genotype and is described in detail elsewhere [25]. Clone-derived WNV was passaged five times in wild-caught American crows, house sparrows and American robins. To avoid systematically selecting high- or low-replicating strains and population bottlenecks during passage, and since titers are highly variable in wild-caught birds, the sera from the individuals with the intermediate viral load were passed into the next cohort at a standard dose of 1000 plaque forming units (PFU). Virus titer was variable but did not change significantly or consistently during the course of passage (Fig 1A). Further, five passages in wild birds did not alter viremia production or mortality in crows and sparrows (S1A and S1B Fig). WNV replication and fitness after passage was assessed using young chickens and Culex quinquefasciatus mosquitoes to directly compare the viral populations in hosts not used for passaging and to remove the variability of wild-caught birds (e.g. age and infection history) (Fig 1B and 1C). Passaged virus (p5) was similar to the WNVic (p0) in peak viremia production in chickens (i.e. at 2 and 3 dpi) (Fig 1B).

Fitness assays were used to directly compare passaged viruses to a standard reference WNV in head-to-head competition. These assays can detect subtle fitness differences that are inapparent in comparative studies. Competitive fitness of all wild-bird p5 WNV was significantly enhanced in chickens. Crow-passaged virus had the smallest fitness gains and robin-passaged virus the largest (Fig 1C). Fitness studies conducted in wild birds produced the same results as those in chickens (S1C Fig). Competitive fitness was slightly increased in mosquitoes, but no bird-specific differences were noted (Fig 1C, S1D Fig).

Patterns of intrahost mutational diversity

At each passage virus was examined by NGS to determine whether the consensus sequence changed during passage and to characterize the diversity of intrahost viral populations (S1 Table, S2 Fig). WNV genome coverage was variable across the genome and between samples (S2A Fig), and positively correlated with viral population size (S2C Fig). The lower relative WNV genome coverage from robin sera can in part be explained by smaller intrahost viral population sizes and smaller virus to host RNA ratios. Approximately 68%, 29% and 7% of NGS reads aligned to the WNV genome from crow, sparrow and robin sera, respectively. Comparatively, 20% and 0.5% of the NGS reads aligned to the WNV genome from chicken sera and mosquito bodies, respectively.

Three nucleotide mutations that led to consensus amino acid substitutions were detected through passaging in birds, but none became fixed (i.e. frequency = 1) in the population. In contrast, three consensus amino acid substitutions were detected after a single mosquito passage. All intrahost single nucleotide variants (iSNVs) > 0.02 frequency are listed in S2 Table.

We estimated intrahost variation from NGS data to determine whether WNV population diversity was bird species-dependent. The mean number of unique iSNVs in each virus population was relatively constant between passages, but differences were apparent among bird species (Fig 2A). WNV populations passaged in crows five times (p5) had significantly more unique iSNVs than WNV passaged in sparrows and robins. In addition, the frequency of individual iSNVs increased during passage in a species-dependent manner: The mean iSNV frequency after p5 in robins was significantly higher than after p5 in crows or sparrows (Fig 2B). Despite these differences, the viral populations had similar Normalized Shannon entropies (SN), Hamming distances (i.e. SNVs per coding sequence) and amino acid substitutions per coding sequence after p5 in different species (Fig 2C).

We examined the ratio of viral genome equivalents (GE) to PFUs and intrahost single nucleotide length variants (iLVs, including both insertions and deletions) to assess defective viral genomes in WNV populations during passage. Crow-passaged WNV had the highest GE:PFU ratio (Fig 3A) and the most unique iLVs (Fig 3B). In addition, a greater proportion of the iLVs in crows were found in subsequent passages compared to sparrows and robins (Fig 3C). The number of iLVs per coding sequence was positively correlated with the titer of infectious virus (Fig 3D). We then evaluated the possibility that greater levels of iLV carry-through in crows, which can only occur via complementation (Fig 3C), were due to sampling artifacts. To do this, we used a hypergeometric test implemented in R, which indicated that the probability of selecting 400 common iLVs in two samples of 600 from the total pool of available single-nucleotide iLVs (n = 51,490) was 0. Simulation studies confirmed that it is extremely unlikely that random sampling produced the observed data.
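The sampling check can be illustrated with a short calculation. This is a minimal sketch, not the authors' R code: it assumes the test asks for the upper-tail probability of sharing at least 400 iLVs when two sets of 600 are drawn at random from the pool of 51,490, and the function names `log_choose` and `log_hypergeom_tail` are hypothetical. Working in log space shows why a double-precision calculation would report such a probability as 0.

```python
import math

def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def log_hypergeom_tail(pool, marked, draws, overlap):
    """Natural log of P(X >= overlap) for X ~ Hypergeometric(pool, marked, draws)."""
    upper = min(marked, draws)
    log_terms = [
        log_choose(marked, k) + log_choose(pool - marked, draws - k)
        - log_choose(pool, draws)
        for k in range(overlap, upper + 1)
        if draws - k <= pool - marked
    ]
    # Log-sum-exp keeps the sum stable even when every term underflows.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))

# Assumed reading of the test: 600 iLVs in one sample are "marked",
# and 600 are drawn for the second sample from the pool of 51,490.
log_p = log_hypergeom_tail(pool=51490, marked=600, draws=600, overlap=400)
log10_p = log_p / math.log(10)
print(f"log10 P(overlap >= 400) = {log10_p:.1f}")
print("as a double:", math.exp(log_p))
```

Under these assumptions the expected overlap from random sampling is only about 7 iLVs (600 × 600 / 51,490), which is why an overlap of 400 cannot plausibly be attributed to chance.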

Intrahost selective pressures

Evidence for natural selection was assessed in WNV populations using intrahost neutrality tests. The proportion of mutations in each population that were nonsynonymous (pN) and the ratios of nonsynonymous to synonymous variants per site (dN/dS) were highest in the input p0 WNV population and decreased significantly during passage in each bird species (Table 1). Separate analysis of dN and dS shows that dN did not significantly increase during passage while dS increased significantly at p5 in all bird species, a hallmark of purifying selection. The Fu and Li’s F and Fay and Wu’s H statistics were obtained from reconstructed haplotypes. The F statistic at p1 and p5 was consistently negative, indicating that the haplotypes contained excessive amounts of rare SNVs, again indicative of purifying selection (Table 1). The H statistic measures an excess of high compared to intermediate frequency SNVs. The insignificant H values suggest that the deviations from neutrality were due to natural selection rather than selective sweeps (Table 1).

Analysis of reconstructed haplotypes that arose during passage and high frequency iSNVs (i.e. frequency > 0.02) was conducted to minimize the impact of differences in sequencing coverage and to assess positive selection. A frequency of 0.02 was selected as the cutoff for “high frequency” mutations because it includes the top 5% of a gamma distribution of all VPhaser2-accepted iSNVs. The proportion of iSNVs that were high frequency after p5 was the greatest within robin-passaged WNV populations (16.5%) compared to sparrows (4.9%) and crows (4.8%) (Fig 4A). Reconstructed haplotypes from high frequency iSNVs were then used to assess the selective pressures that lead to haplotype replacement during passage (Fig 4B). The ancestral p0 virus population was composed of a single dominant haplotype that remained dominant after a single passage in all bird species. After p5, the ancestral haplotype remained dominant in crows, but not in sparrows and robins. Furthermore, high frequency iSNVs from crows contributed significantly fewer amino acid substitutions per coding sequence compared to robins after p5 (Fig 4C). Examination of dN/dS, amino acid diversity and high frequency nonsynonymous iSNVs across the WNV genome demonstrated that, in general, selection was the strongest in the structural protein coding regions (Fig 4D and 4E). Specifically, passage in robins imposed significant selective pressures on the envelope (E) protein coding region that heavily targeted ectodomains (ED) I and II. The apparent selection of the nonstructural protein 4B (NS4B) from sparrow passaging is the result of a single high frequency nonsynonymous iSNV (S2 Table). Individual high frequency iSNVs fluctuated in frequency through passaging and all nonsynonymous high frequency iSNVs were unique to their passage lineage (i.e. no “signature mutations” were detected that served as markers for replication in any particular bird species, see S2 Table).

Interhost genetic divergence

The standardized variance in iSNV frequencies (FST) was then estimated from the coding sequence to determine the degree of genetic divergence among replicates within a passage and between passages (Fig 5). Viral populations from robins were more divergent compared to those from crows and sparrows. FST from WNV passaged once in young chickens was similar to wild-caught birds, but WNV passaged once in mosquitoes was much more divergent. These results are supported by analysis of haplotypes (S3 Fig). The p0 haplotype was still dominant in chicken p1 populations with a small minority of haplotypes containing single iSNVs, similar to wild birds (Fig 4B). In mosquitoes the ancestral haplotype became a minority after a single passage.
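As a rough illustration of the statistic, the sketch below computes FST at a single site using Wright's classical definition: the variance in allele frequency among populations divided by p̄(1 − p̄). The estimator actually used in the study is not specified in this section, so the formula, the per-site averaging, the function names and the example frequencies are all assumptions.

```python
from statistics import fmean, pvariance

def site_fst(freqs):
    """Wright's FST at one site: var(p) / (mean_p * (1 - mean_p))."""
    mean_p = fmean(freqs)
    if mean_p in (0.0, 1.0):
        return 0.0  # monomorphic across all populations: no divergence to measure
    return pvariance(freqs, mu=mean_p) / (mean_p * (1.0 - mean_p))

def mean_fst(sites):
    """Average the per-site values over the variable sites (an assumed summary)."""
    return fmean([site_fst(freqs) for freqs in sites])

# Hypothetical iSNV frequencies at two sites in three replicate populations.
example_sites = [[0.02, 0.03, 0.02], [0.00, 0.40, 0.05]]
print(mean_fst(example_sites))
```

Values near 0 indicate replicate populations with nearly identical iSNV frequencies; values approaching 1 indicate populations that have diverged toward different variants.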

Discussion

Virus passage and phenotypic assessment

We examined WNV genetic diversity during the course of passage in birds that experience varying mortality due to WNV infection to assess how different hosts influence virus population structure and fitness. Passage in each host was accomplished in three concurrent biological replicates in order to control for the impact of individual wild-caught birds that may vary in several ways that could impact virus replication. Titers during passage were highly variable between individuals. However, mean titers did not significantly change during the course of passage, indicating that replication competence was retained and that overt increases in competitive fitness were not selected through our passage strategy.

Wild-bird passaged virus was similar to unpassaged WNV in viremia production. Only when more sensitive in vivo competitive fitness assays (i.e. comparative replication of the passaged and reference WNV in the same host) were conducted were changes apparent. Note that our definition of fitness here is restricted to the specific competition environment (within the bird or mosquito) and does not consider the larger ecological fitness required for maintenance in a complex arbovirus transmission cycle. Passage in all birds resulted in significant competitive fitness gains during replication in chickens. Interestingly, the fitness gains were smallest after WNV was passaged in the host that experiences the most mortality (crows), and largest in the most disease-resistant avian host (robins). Fitness gains were far less clear when virus competition was measured in mosquitoes. A limitation to our mosquito studies is that competition was conducted via intrathoracic inoculation, which bypasses the midgut, a major physiological barrier in mosquitoes. Intrathoracic inoculation was used because the volume of blood available and the virus titers would have likely made oral infection highly inefficient. Importantly, our results on WNV replication and fitness are supported by previous observations [14] indicating that high fitness is maintained through purifying selection in vertebrates, and that no tradeoff occurs when the virus is re-introduced into mosquitoes. Moreover, replicative fitness increases occur during passage in ecologically relevant wild birds, and these gains occur in a species-specific manner.

Patterns of intrahost mutational diversity and selective pressures

To investigate the viral genetic and population determinants of the observed fitness gains, we characterized WNV at each passage using NGS. Our data suggest that although the overall complexity of the virus population was similar among different bird species, its composition and the selective pressures that produced it appear to be bird species-dependent. Interestingly, WNV replication in the most disease-susceptible bird species seems to be positively associated with the number of unique iSNVs (i.e. mutational tolerance) and negatively associated with iSNV frequency (i.e. strength of selection). This observation requires further investigation using additional resistant and susceptible birds, but may provide important insights into which bird species are most likely to drive virus evolution toward fitness gains. Our data thus far suggest that more disease-resistant birds such as robins would be most likely to fill this role as long as they produce sufficiently high titers to infect mosquitoes.

In this study we used various neutrality tests to determine whether intrahost WNV populations from each bird species were evolving non-randomly through purifying selection. While these tests all measure slightly different aspects of genetic diversity, all clearly demonstrate purifying selection in birds. This result confirms previous studies of WNV passaged in young chickens [11], and indicates that our approaches to sequencing and analysis, although they differ significantly from those reported previously, produce results consistent with other methods.

Our studies also provide some evidence for positive selection during bird infection. We found that WNV passage in robins resulted in more amino acid substitutions that reach high frequency compared to crows. In addition, the ancestral haplotype tended to be displaced by novel mutants that arose during passage in sparrows and robins. These data suggest that positive selection within hosts is stronger in less susceptible bird species [26].

Examination of patterns of variation across the WNV genome provides additional evidence for differences in host selective environment. We found, consistent with previous reports on dengue virus populations [27], the highest variant frequencies in ectodomains I and II of the E coding sequence of WNV passaged in robins. The mechanisms that lead to the emergence of these variants are not currently clear. Although the E protein contains most neutralizing epitopes, the earliest neutralizing antibody responses observed in birds generally occur at around 5 to 7 days post infection [23,28]. Other mechanisms that could impact selection on the E protein include resistance to the early antiviral states induced by type I interferon [29,30] and alternate methods for virus entry and uncoating of the viral RNA [31]; though these mechanisms need further investigation, especially in birds. Our results suggest that in relatively resistant hosts, novel variants may rise to high frequency within the context of purifying selection. The notion that positive selection occurs in robins is further supported by our data showing that virus diverged most during replication in them. It is, however, balanced by a lack of evidence of a selective sweep, i.e. a rapid reduction in genetic diversity as a novel variant becomes very prominent in the population. Clearly further studies are needed to confirm whether and how positive selection contributes to WNV population structure in birds.

Defective genomes

Compared to other RNA viruses, arboviruses have low long-term rates of amino acid substitution [32]. This is at least partially due to the fact that most mutations are deleterious because of evolutionary constraints on arbovirus genomes [33]. We provide evidence that accumulation of deleterious mutations, or defective viral genomes, is unequal between hosts; WNV populations replicating in wild-caught crows accumulate the most defective genomes, and WNV replicating in robins accumulate the least. Defective genomes are often found during laboratory and natural virus infections [17,34] and can persist through multiple rounds of transmission [35,36]. Using both bioassays (i.e. GE:PFU) and sequencing data (i.e. iLVs per coding sequence), we found that the accumulation of WNV defective genomes during infection was positively correlated with viral load. This apparent density-dependent selection of deleterious mutations likely occurs via functional complementation, which becomes more efficient as effective multiplicity of infection (MOI, i.e. intrahost viral load) increases [37,38]. In addition, high MOI environments tend to tolerate neutral mutations that can become deleterious in a new environment [39]. Taken together, these studies provide a framework to understand how WNV replication in high-viremic crows leads to a broader network of potentially deleterious mutations and limited selection for adaptive amino acid substitutions, especially when compared to WNV replication in robins. The rather modest fitness gains experienced by crow-passaged WNV support this observation.

Conclusions

The results presented here shed light on the selective forces that shape WNV populations in nature. We demonstrate that selective pressures that control WNV populations seem to occur in a species-specific manner (Fig 6). All three bird species evaluated have been suggested to be significant drivers of WNV outbreaks, with robins receiving particular attention due to findings indicating that this species is more frequently fed upon by mosquito vectors [24]. During intrahost WNV replication, our studies suggest that disease-susceptibility is positively associated with mutational tolerance and negatively associated with the strength of selection. This means that robins also may better maintain high fitness in WNV populations than do birds that are more susceptible to disease. While it is tempting to speculate that robins are significant generators of WNV genetic diversity, we also confirm herein that mosquitoes are much more efficient in generating mutational diversity in the WNV system. Moreover, these data suggest that intrahost virus evolutionary dynamics are associated with host resistance to disease in several ways and provide an important insight towards the genetic and ecological factors that influence RNA virus emergence.

Fig 6. Host mortality and intrahost WNV population sizes are associated with WNV population structure and competitive fitness. The WNV populations from all bird species contain ~1 mutation per genome. However, in the crow environment, WNV populations are more tolerant of unique and deleterious mutations (e.g. insertions and deletions), but few mutations rise to high frequency. In the most disease-resistant bird species, robins, the WNV populations are under stronger selection pressures. Robin-associated WNV populations are less tolerant of unique and deleterious mutations, and more mutations reach high frequency. The selective environment of more disease-resistant birds was also positively associated with competitive fitness in young chickens, but not in mosquitoes. Population size: each “virus” represents a log10 of GE/ml. Mutant spectra: “X” represents deleterious mutations, “diamonds” represent neutral or advantageous mutations, and diamonds of the same color represent the same mutation.

Materials and Methods

Ethics statement

Wild birds were collected under US Fish and Wildlife Service (#MB91672A-0) and Colorado Parks and Wildlife (#13TRb2106) permits and with permissions from landowners. No endangered or protected species were caught or harmed during the study. Experiments involving animals were conducted in accordance with protocols approved by the Colorado State University (CSU) Institutional Animal Care and Use Committee (#12-3694A) and the recommendations set forth in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health.

Serial passage of WNV in wild-caught birds

A WNV infectious clone (WNVic) was previously constructed from an American crow kidney isolate collected during the 2000 outbreak in New York City [25,40]. The WNVic contains a naturally selected proline at amino acid site 249 in nonstructural protein 3 (NS3) allowing it to replicate to high titers in wild birds [18,41]. Wild birds were collected in Northern Colorado from 2013 to 2014 using mist nets (house sparrows and American robins) and cannon nets (American crows). All birds were bled prior to inoculation and serum was tested by plaque reduction neutralization test to confirm that all birds used for subsequent studies were WNV seronegative. The virus strain used to initiate the passage series was derived from a WNVic as previously described [25]. Virus was harvested from the supernatant of BHK cells transfected with linearized plasmid, stored at -80°C and used without further passage. Viruses were administered to birds by subcutaneous inoculation to the breast region with 1,000 WNV PFU/100 μl, a dose similar to mosquito transmission [42], in inoculation medium (endotoxin and cation-free phosphate buffered saline with 1% FBS). Birds were bled from the jugular vein at the time of peak viremia on 3 days post-infection (dpi). Serum was titered by standard plaque assay on African green monkey kidney cells (Vero, ATCC CCL-81) and stored at -80°C until used for subsequent passage or sequencing as described below. The first passage series utilized seven birds for each wild-caught species and the three birds with the median viral titers were used to start three independent replicate lineages, each including three naïve birds (i.e. replicates ‘a’, ‘b’, and ‘c’). From each group of three birds, the serum with the median viral titer was used to continue passaging to another cohort until five serial passages were completed. The WNVic-derived virus was also passaged once in three young chickens for 3 dpi and two individual Cx. quinquefasciatus mosquitoes for 14 dpi to compare viral populations from commonly used laboratory vertebrate host and invertebrate vector models, respectively. See S1 Text for information about housing and care of wild-caught birds, chickens and mosquitoes.
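The selection rule used to choose sera for passage can be written compactly. The sketch below is only an illustration: the function name, bird labels and titers are hypothetical, and it assumes that "median" means the middle-ranked sera when a cohort is sorted by titer.

```python
def middle_ranked(titers, k=1):
    """Return the labels of the k birds whose titers sit in the middle of the ranking."""
    ranked = sorted(titers, key=titers.get)
    start = (len(ranked) - k) // 2
    return ranked[start:start + k]

# Hypothetical day-3 serum titers (log10 PFU/ml) for a first-passage cohort of seven birds.
first_cohort = {"b1": 4.1, "b2": 7.9, "b3": 5.6, "b4": 6.2,
                "b5": 3.0, "b6": 6.8, "b7": 5.9}
founders = middle_ranked(first_cohort, k=3)  # seeds replicate lineages a, b and c

# Later passages: one median serum from each group of three birds.
next_donor = middle_ranked({"a1": 5.2, "a2": 6.4, "a3": 4.7}, k=1)
```

Choosing from the middle of the ranking, rather than the highest titer, is what keeps the passage series from systematically favoring high- or low-replicating variants.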

Phenotypic assessment

The infection phenotype of each WNV lineage after five passages (p5) in wild-caught birds was compared to the unpassaged (p0) WNV in the same bird species used for virus passage, young chickens (two days old), and Cx. quinquefasciatus mosquitoes (4–7 days post emergence). Viremia and survival were measured in birds inoculated with 1,000 PFU of p5 or p0 WNV (n = 4–5 birds/virus) for up to 6 dpi. As defined here, competitive fitness compares the replication of a competitor virus (i.e. serially passaged p5 WNV) and a standard WNV reference (WNV-REF) during infection of the same host. Competitive fitness is quantified by the proportion of competitor to WNV-REF genotypes using sequence chromatograms (i.e. quantitative sequencing) [43]. WNV-REF was generated from an infectious clone as described above and in S1 Text and is indistinguishable from the parental virus in replication in cells and relevant organisms [44]. Competitive fitness assays of birds and mosquitoes co-inoculated with equally mixed WNV-REF and p5 competitor virus were conducted as described in S1 Text.

Sequencing and data analysis

Virus libraries were prepared for RNA sequencing on the Illumina HiSeq 2000 platform (Beckman Coulter Genomics, Danvers, MA) using the NuGEN Ovation RNA-Seq System V2 and Ultralow Library kit (San Carlos, CA) (see S1 Text for more details). Fastq files containing read data were demultiplexed using CASAVA and custom scripts that impose high stringency (0 mismatches) in the barcode region of each read. The sequence of the input WNV strain was determined from three independent biological sequencing replicates of the input virus using the Trinity assembler [45]. Paired-end reads of 100 nt were then aligned to this “input” sequence using MOSAIK [46]. Duplicate reads were removed using the MarkDuplicates tool within Picard to limit the influence of PCR artifacts and multiply sequenced clusters on variant calling with Vphaser2 [47]. Variants with significant strand bias were removed to reduce the potential for false positives [48]. Variants called using Vphaser2 were used for subsequent data analysis unless otherwise specified. Analysis was limited to the protein coding sequences, and iSNVs and iLVs (the latter including both insertions and deletions) were analyzed separately.

Hamming distances from the p0 “input” virus were calculated for each population by dividing the total number of polymorphisms by the average coding sequence coverage. Mean viral population complexity was calculated by the SN at each site using the following equation [49]:
\[ S_N = -\frac{\sum_i p_i \ln p_i}{\ln N} \]
where p is the frequency of the iSNV at site i and N is the coverage at that site. At a single nucleotide position, an SN score of 0 indicates that a single nucleotide was present (i.e. no polymorphism) while a score of 1 represents maximum complexity (i.e. equal numbers of alternate nucleotides). The SN values at all protein coding sequence nucleotide loci were averaged to estimate the viral population complexity.
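The per-site complexity calculation can be sketched as follows. This is a minimal illustration that assumes SN takes the normalized Shannon entropy form, -Σ p ln p / ln N, with the allele frequencies at a site summing to 1 and N equal to the coverage at that site. The function names and example values are hypothetical and are not taken from the scripts used in this study.

```python
import math

def site_entropy(allele_freqs, coverage):
    """Normalized Shannon entropy at one site (assumed form: -sum(p ln p) / ln N)."""
    if coverage < 2:
        return 0.0  # ln(1) = 0, so the score cannot be normalized
    # p * ln(1/p) equals -p * ln(p); zero-frequency alleles contribute nothing
    h = sum(p * math.log(1 / p) for p in allele_freqs if p > 0)
    return h / math.log(coverage)

def mean_complexity(sites):
    """Average the per-site scores over all coding-sequence positions.

    sites: list of (allele_freqs, coverage) tuples, one per nucleotide position.
    """
    return sum(site_entropy(freqs, n) for freqs, n in sites) / len(sites)

# Hypothetical example: one monomorphic site and one site carrying a 2% iSNV,
# both at 1000x coverage.
example_sites = [([1.0], 1000), ([0.98, 0.02], 1000)]
print(mean_complexity(example_sites))
```

Monomorphic sites contribute a score of 0, so the genome-wide mean is driven by the number and frequency of polymorphic positions.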

High frequency iSNVs were subjected to an additional analysis to reduce the possibility that conclusions drawn from the complete dataset were dependent on extremely rare variants. To establish a threshold for “high frequency” iSNVs, the frequencies of all of the Vphaser2 accepted variants detected in this study (n = 6052) were log10 transformed, increased by 3.75 (to make all of the values positive) and fit to a gamma distribution, where α = μ²/s² and β = E[μ]/s², using R (the data did not fit a beta distribution). An iSNV frequency >0.02 was determined to be in the upper 5% of the gamma distribution and was used to define high frequency SNVs detected through WNV passage in birds (n = 341 individual SNVs). The sequencing reads from p0, p1 and p5 were aligned to the WNV genome using mpileup from the VarScan2 software package [50] and haplotypes were reconstructed using QuasiRecomb 1.2 [51] with the flags ‘-r 97–10395’ (to reconstruct haplotypes from the entire coding sequence with respect to reference genome numbering), ‘-K 1–10’ (to use a bigger interval of generators) and ‘-noRecomb’ (to disable the recombination process, which was not expected in the viral population, and to reduce the runtime). To increase haplotype specificity, the flag ‘-conservative’ was employed and analysis was restricted to haplotypes containing high frequency SNVs (i.e. >0.02).
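The transformation and method-of-moments step described above can be sketched as follows. This is an illustration only: the frequencies are made up, the function names are hypothetical, and the 95th percentile itself would still need a gamma quantile function (for example qgamma in R), which is not reproduced here.

```python
import math
import statistics

SHIFT = 3.75  # offset stated in the text; makes the log10 frequencies positive

def to_shifted_log(freq):
    """log10-transform a variant frequency and add the offset."""
    return math.log10(freq) + SHIFT

def from_shifted_log(x):
    """Map a value on the shifted log10 scale back to a frequency."""
    return 10 ** (x - SHIFT)

def gamma_moments(values):
    """Method-of-moments estimates: shape = mean^2 / variance, rate = mean / variance."""
    mu = statistics.mean(values)
    s2 = statistics.variance(values)  # sample variance
    return mu * mu / s2, mu / s2

# Hypothetical iSNV frequencies, for illustration only.
freqs = [0.0005, 0.001, 0.002, 0.004, 0.01, 0.03]
shape, rate = gamma_moments([to_shifted_log(f) for f in freqs])
print(shape, rate)
```

A threshold found on the transformed scale is converted back with from_shifted_log; for instance, a frequency of 0.02 corresponds to log10(0.02) + 3.75 on that scale.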

pN and dN/dS were used to test for intrahost selection [33]. DnaSP (version 5) [52] was used to determine the number of nonsynonymous and synonymous sites to calculate dN/dS using the Nei-Gojobori method [53] with the following modifications for NGS data. Nd and Sd (i.e. the numbers of detected nonsynonymous and synonymous mutations, respectively) were calculated for each viral population as the sums of the individual nonsynonymous and synonymous Vphaser2 accepted iSNV frequencies, and the passage consensus sequence was used to determine the number of nonsynonymous and synonymous sites. The numbers of nonsynonymous (7843.67) and synonymous (2455.33) sites in the ancestral p0 consensus sequence were used to determine that pN prior to selection is ~0.76. In addition, the 50 most frequent haplotypes reconstructed from p1 and p5 from each bird species were analyzed using the Fu and Li’s F [54] and Fay and Wu’s H [55] statistical tests of neutrality in DnaSP with a window length of 100, a step size of 25 and the p0 consensus sequence as an outgroup to infer the ancestral nucleotide state.
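The arithmetic behind these quantities can be sketched as follows. The site counts come from the text; the Jukes-Cantor correction is the standard final step of the Nei-Gojobori method and is assumed here rather than confirmed by the text. The function names and example frequencies are hypothetical.

```python
import math

N_SITES = 7843.67  # nonsynonymous sites in the p0 consensus (from the text)
S_SITES = 2455.33  # synonymous sites in the p0 consensus (from the text)

def expected_pn(n_sites=N_SITES, s_sites=S_SITES):
    """Fraction of coding sites that are nonsynonymous (pN expected before selection)."""
    return n_sites / (n_sites + s_sites)

def jukes_cantor(p):
    """Jukes-Cantor correction for multiple substitutions (assumed step)."""
    return -0.75 * math.log(1 - 4 * p / 3)

def dn_ds(nonsyn_freqs, syn_freqs, n_sites=N_SITES, s_sites=S_SITES):
    """dN/dS with Nd and Sd taken as summed iSNV frequencies, as described in the text."""
    nd = sum(nonsyn_freqs)  # Nd: summed nonsynonymous iSNV frequencies
    sd = sum(syn_freqs)     # Sd: summed synonymous iSNV frequencies
    return jukes_cantor(nd / n_sites) / jukes_cantor(sd / s_sites)

print(round(expected_pn(), 2))          # matches the ~0.76 reported in the text
print(dn_ds([0.01, 0.02], [0.03]))      # hypothetical frequencies
```

With equal summed frequencies of nonsynonymous and synonymous variants, the ratio falls well below 1 simply because nonsynonymous sites are about three times more numerous, which is why the per-site normalization matters.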

FST was used to estimate the extent of interhost genetic divergence using a scale between 0 and 1, and the extent of FST change between populations represents the degree of genetic divergence. Specifically, in-house Fortran scripts were used to calculate FST using equations 1, 2 and 4 by Fumagalli et al. [56]. Intrahost SNV frequencies determined by mpileup and readcounts from the VarScan2 software package [50] were used to estimate the per site heterozygosity in biological replicates compared to the total population (e.g. all biological replicates within passage) at a single passage (i.e. intra-passage) and the per site heterozygosity between passage replicates (i.e. inter-passage).
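The idea of comparing per-site heterozygosity within replicates to that of the pooled population can be sketched as follows. This uses the textbook definition FST = (HT - HS) / HT with biallelic heterozygosity 2p(1 - p), summed over sites. It is an assumption made for illustration and is not a transcription of the equations from [56] or of the in-house scripts.

```python
def heterozygosity(p):
    """Expected heterozygosity at a biallelic site with variant frequency p."""
    return 2 * p * (1 - p)

def fst(replicate_freqs):
    """Generic FST across replicates.

    replicate_freqs: one list of per-site iSNV frequencies per replicate,
    all lists covering the same sites in the same order.
    """
    n_rep = len(replicate_freqs)
    n_sites = len(replicate_freqs[0])
    h_s = 0.0  # mean within-replicate heterozygosity, summed over sites
    h_t = 0.0  # heterozygosity of the pooled population, summed over sites
    for site in range(n_sites):
        ps = [rep[site] for rep in replicate_freqs]
        h_s += sum(heterozygosity(p) for p in ps) / n_rep
        h_t += heterozygosity(sum(ps) / n_rep)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t

# Hypothetical frequencies at two sites in three replicates.
print(fst([[0.10, 0.00], [0.12, 0.05], [0.08, 0.00]]))
```

Identical replicates give a value of 0, and replicates fixed for different alleles give a value of 1, matching the 0 to 1 scale described above.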

To estimate the probability of resampling for the iLV data, we used the phyper command in R (www.R-project.org). We calculated that a total of 51,490 single nucleotide iLVs were possible by multiplying the length of the coding sequence (10,299 nt) by the 5 different kinds of iLVs that could occur at each site (one deletion and four different nt insertions). We then used phyper to obtain the probability of a sampling overlap of 400 iLVs out of 600 sampled (reflecting a reasonable approximation of our observed data for crows) given that 51,490 iLVs are possible. Simulation studies were conducted in R by randomly sampling 600 individuals, with replacement, from a set of 51,490 and comparing the sets. t-tests, Kruskal-Wallis tests, and correlation statistics were obtained using R and GraphPad Prism (La Jolla, CA).
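The resampling calculation can be sketched as follows. This is a standard-library stand-in for the hypergeometric tail that phyper provides in R, not the analysis code used in this study. The function names are hypothetical, and the choice of the upper tail (an overlap of at least 400) is an assumption about how the probability was framed. Because the probability is far too small for a floating-point number, the sketch returns its log10.

```python
import math
import random

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def log10_overlap_tail(total, sample_a, sample_b, min_overlap):
    """log10 of P(overlap >= min_overlap) under a hypergeometric model."""
    denom = log_comb(total, sample_b)
    upper = min(sample_a, sample_b)
    logs = [
        log_comb(sample_a, k) + log_comb(total - sample_a, sample_b - k) - denom
        for k in range(min_overlap, upper + 1)
    ]
    peak = max(logs)  # log-sum-exp keeps the sum numerically stable
    return (peak + math.log(sum(math.exp(x - peak) for x in logs))) / math.log(10)

def simulated_overlap(total, size, rng):
    """Overlap between two independent draws of `size` items made with replacement."""
    a = {rng.randrange(total) for _ in range(size)}
    b = {rng.randrange(total) for _ in range(size)}
    return len(a & b)

TOTAL_ILVS = 51_490  # number of possible iLVs as stated in the text
print(log10_overlap_tail(TOTAL_ILVS, 600, 600, 400))
print(simulated_overlap(TOTAL_ILVS, 600, random.Random(1)))
```

Under random sampling the expected overlap between two sets of 600 is about 600 × 600 / 51,490, or roughly 7 iLVs, so an overlap of 400 is extremely unlikely to arise by chance.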

Haplotypes were reconstructed from the high frequency iSNVs (i.e. > 0.02) from input WNV and after one passage in young chickens and Culex quinquefasciatus mosquitoes and are represented by the number of single nucleotide variants (SNVs) per haplotype (Hamming distance from the p0 haplotype).

Acknowledgments

The authors acknowledge the students and staff from the Animal Disease Laboratory at CSU for providing space and technical advice for wild-caught bird infections. We also thank B. Dodd, A. Prasad, S. Garcia Luna, S. Sieke, E. Doster and E. Chiu of the CSU Arthropod-Borne and Infectious Diseases Laboratory for technical support, and S. Lozano, B. Andre and A. Hess for analysis assistance. Lastly, we thank Colorado Parks and Wildlife and several private landowners for allowing bird trapping on their property.