Abstract

Little is known about the rate at which genetic variation is generated within intrahost populations of dengue virus (DENV) and what implications this diversity has for dengue pathogenesis, disease severity, and host immunity. Previous studies of intrahost DENV variation have used a low frequency of sampling and/or experimental methods that do not fully account for errors generated through amplification and sequencing of viral RNAs. We investigated the extent and pattern of genetic diversity in sequence data from domain III (DIII) of the envelope (E) gene in serial plasma samples (n = 49) taken from 17 patients infected with DENV type 1 (DENV-1), totaling some 8,458 clones. Statistically rigorous approaches were employed to account for artifactual variants resulting from amplification and sequencing, which we suggest have played a major role in previous studies of intrahost genetic variation. Accordingly, nucleotide sequence diversities of viral populations were very low, with conservative estimates of the average levels of genetic diversity ranging from 0 to 0.0013. Despite such sequence conservation, we observed clear evidence for mixed infection, with multiple phylogenetically distinct lineages present within the same host, while the presence of stop codon mutations in some samples suggests the action of complementation. In contrast to some previous studies, we observed no relationship between the extent and pattern of DENV-1 genetic diversity and disease severity, immune status, or level of viremia.
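As a rough illustration of what an average nucleotide diversity value of this kind measures, the sketch below computes the mean pairwise proportion of differing sites (p-distance) across a set of aligned clone sequences. This is an assumption made for illustration only: the estimator used in the study and its correction for amplification and sequencing errors are not described here, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_diversity(seqs):
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        # Fewer than two sequences: no pairwise comparison is possible.
        return 0.0
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy data (hypothetical): four 10-nt clones, one carrying a single substitution.
clones = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
diversity = mean_pairwise_diversity(clones)
```

A sample in which every clone matches the consensus gives a diversity of 0, the lower bound of the range reported above. No error correction is applied in this sketch, so artifactual variants would inflate the value.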

Minimum spanning networks of intrahost DENV-1 sequence data (VP data set). Each network was inferred by compiling sequences from multiple days. The number in the upper left corner of each panel corresponds to the patient number; percentages indicate the probability of parsimony used to construct the network. Haplotypes with the highest ancestral probability are displayed as circles. Circle sizes are proportional to the number of sequences that exhibit each variant, and the pie chart in each circle indicates the percentage of each variant at different time points. Connecting lines indicate a single mutation shared among haplotypes. (A and B) Minimum spanning network in which multiple viral lineages were observed across time points (patients 82 and 162). (C) Minimum spanning network in which one mutation was shared between haplotypes (patient 309). (D) Minimum spanning network with star-like topology (patient 59). (E) Minimum spanning network with reduced parsimony probability (patient 336).

Minimum spanning networks of intrahost DENV sequence data (VP data set) in which one (B, E, G, J, and K) and/or two (C, I, and J) mutations were shared between haplotypes. All sequences were identical to the consensus in panels A, D, F, H, and L. Refer to Fig. 1 for more information.

Maximum-likelihood (ML) phylogenetic tree for all (n = 89) consensus sequences derived from clones in the VP data set in relation to 1,390 equivalent background DENV-1 sequences collected from GenBank. Red lines represent clones from sample G2542, and red arrows signify mixed infection. Clades are indicated as numbers. Horizontal branches are drawn to a scale of nucleotide substitutions per site, and the tree is midpoint rooted; nodes are ordered increasingly and presented as a polar tree.