Significance

Largely overlooked, the viruses of protists have started to attract more attention. Several viruses of the family Totiviridae are currently implicated in the increased pathogenicity of parasitic protozoa such as Leishmania to vertebrate hosts. We conducted a broad survey of RNA viruses within trypanosomatids, one of the iconic groups of protists. This survey revealed several previously unidentified viral taxa, including one designated “Leishbunyaviridae” and a highly divergent virus termed “Leptomonas pyrrhocoris ostravirus 1.” Our studies provide important information on the origins as well as the diversity and distribution of viruses within a group of protists related to the human parasite Leishmania.

Abstract

Knowledge of viral diversity is expanding greatly, but many lineages remain underexplored. We surveyed RNA viruses in 52 cultured monoxenous relatives of the human parasite Leishmania (Crithidia and Leptomonas), as well as plant-infecting Phytomonas. Leptomonas pyrrhocoris was a hotbed for viral discovery, carrying a virus (Leptomonas pyrrhocoris ostravirus 1) with a highly divergent RNA-dependent RNA polymerase missed by conventional BLAST searches, an emergent clade of tombus-like viruses, and an example of viral endogenization. A deep-branching clade of trypanosomatid narnaviruses was found, notable as Leptomonas seymouri bearing Narna-like virus 1 (LepseyNLV1) has been reported in cultures recovered from patients with visceral leishmaniasis. A deep-branching trypanosomatid viral lineage showing strong affinities to bunyaviruses was termed “Leishbunyavirus” (LBV) and judged sufficiently distinct to warrant assignment within a proposed family termed “Leishbunyaviridae.” Numerous relatives of trypanosomatid viruses were found in insect metatranscriptomic surveys, which likely arise from trypanosomatid microbiota. Despite extensive sampling we found no relatives of the totivirus Leishmaniavirus (LRV1/2), implying that it was acquired at about the same time that Leishmania became able to parasitize vertebrates. As viruses were found in over a quarter of isolates tested, many more are likely to be found in the >600 unsurveyed trypanosomatid species. Viral loss was occasionally observed in culture, providing potentially isogenic virus-free lines enabling studies probing the biological role of trypanosomatid viruses. These data provide important insights into the emergence of viruses within an important trypanosomatid clade relevant to human disease.

The ability of viruses to infect virtually any cellular life form on Earth contributes to their immense diversity. While many eukaryotic groups have been probed for viral presence, the full diversity of viruses remains to be explored (1). Especially promising is the investigation of RNA viruses in simple eukaryotes such as fungi, green algae, diatoms, slime molds, oomycetes, dinoflagellates, apicomplexans, kinetoplastids, diplomonads, and trichomonads (2–4). While originally considered to be little more than evolutionary curiosities, these viruses have started to attract more attention as their important biological roles are now emerging. For example, Cryphonectria hypovirus 1 plays a key role in limiting pathogenicity to its fungal hosts, with applications toward biological control (5), and several viruses of the family Totiviridae have been implicated in the increased pathogenicity of parasitic protozoa to vertebrate hosts (6, 7).

Most studies reporting unicellular eukaryotic viruses arose from fortuitous discovery of virus-like particles (VLPs) or abundant discrete RNA segments rather than from systematic searches often termed “virus hunting.” Here we present a broad survey of RNA viruses within trypanosomatids, one of the iconic groups of protists. Members of the family Trypanosomatidae exhibit strikingly unusual molecular and biochemical traits (8–12). Several species cause widespread severe illnesses, such as sleeping sickness, Chagas disease, and kala-azar in humans (13). Monoxenous (with one host) parasites of invertebrates (primarily insects) were ancestors of these dixenous (with two hosts) pathogens and still represent the majority of trypanosomatid lineages (14, 15). Phylogenetic analysis of the Trypanosomatidae has shown convincingly that the transition from a monoxenous to a dixenous state occurred at least three times, giving rise to the genera Trypanosoma and Leishmania (both parasites of vertebrates), as well as plant-dwelling Phytomonas (16).

VLPs were reported from a number of trypanosomatid species including Endotrypanum schaudinni, Leishmania hertigi [now classified as Paraleishmania hertigi (17)], Phytomonas spp., Crithidia pragensis, Leptomonas seymouri, Angomonas desouzai, and others (18–23). The molecular era in the research of trypanosomatid viruses began with the pioneering studies of those found in South American Leishmania spp. including Leishmania RNA virus 1 (LRV1) from Leishmania guyanensis and Leishmania braziliensis (24, 25), and an unrelated RNA virus in Phytomonas (21). The biological significance of these lay fallow until the finding that LRV1 was associated with increased disease pathology, parasite numbers, and immune response in animal models (6, 26–29). Subsequent studies provided evidence linking LRV1 to the severity of human leishmaniasis, including acute pathology and drug-treatment failures (30–33), although data relating viral presence to chronic mucocutaneous leishmaniasis are mixed (32, 34–36).

Recently, molecular descriptions have been made for the viruses from several additional trypanosomatid species. Among them were a bunyavirus-like virus of Leptomonas moramango (37) as well as narnavirus-like viruses of Leptomonas seymouri (38) and the dixenous plant pathogen Phytomonas serpens (39). Provocatively, Leptomonas seymouri has been recovered from cultures from visceral leishmaniasis patients infected with Leishmania donovani, and many such Leptomonas seymouri strains bear NLV1 (40). Thus, there appears to be considerable unexplored viral diversity in trypanosomatids, the study of which may contribute to our understanding of the biology of trypanosomatids and their insect and/or plant hosts as well as the origins of viruses in Leishmania.

Results and Discussion

Screening of Trypanosomatid Isolates.

We surveyed 52 isolates including 44 belonging to the genera Crithidia and Leptomonas (subfamily Leishmaniinae), as well as eight belonging to Phytomonas spp. These originated from diverse insect or plant hosts and geographic regions (Table S1). Total RNA from these isolates was digested with S1 nuclease, removing most cellular RNAs, after which the remaining dsRNA arising from dsRNA viruses or replicative intermediates of ssRNA viruses could then be sensitively detected by gel electrophoresis (see Figs. 1A, 3A, and 4A) (41). From this analysis, 11 Leishmaniinae and three Phytomonas spp. exhibited dsRNA bands, while the remainder appeared to lack them (Table 1). Most RNA segments were sequenced, and the sequences of those encoding viral RNA-dependent RNA polymerase (RDRP) were used to assign affiliations to the known viral families.

Tombus-like virus from Leptomonas pyrrhocoris. (A) Agarose gel electrophoresis of S1-digested total RNAs from strains H10 (lane 1), F19 (lane 2), and F165 (lane 3). LeppyrTLV1 segments are labeled “RNA-T1” and “RNA-T2” on the right and marked by green dots; LeppyrOV1 segments are marked by red dots. The left lane shows a 1-kb DNA ladder. (B) Genome structure of LeppyrTLV1. ORFs for different predicted proteins are shown in different colors. A 127-nt stem-loop is found within the predicted N-terminal region of ORF2. (C) Sequence of the ORF1/2 overlap region including a putative slippery sequence (yellow). The RDRP domain is predicted to start from the ACC coding for threonine as previously reported for the UUUUUA slippery sequence (43). (D) Maximum likelihood phylogenetic tree based on RDRP amino acid sequences. Host taxa are shown by symbols defined in the key for hosts. Numbers at the branches indicate Bayesian posterior probability and maximum likelihood bootstrap supports, respectively; those having a Bayesian posterior probability value of 1.0 and maximum likelihood bootstrap support of 100% are marked with black circles. (The scale bar indicates the number of substitutions per site.) The tree was rooted with the sequences of Nodaviridae. Abbreviations and GenBank accession numbers are given in Tables S2–S4.

RNA Viruses of Leptomonas pyrrhocoris.

Three of 18 isolates of Leptomonas pyrrhocoris (H10, F165, and F19) originating from various locations worldwide (42) exhibited viral dsRNA bands (Table S1). All three bore two common RNAs of 3.5 and 2.2 kb, termed “RNA-T1” and “RNA-T2” (marked by green dots in Fig. 1A), and two (H10 and F19) contained six additional bands termed “RNAs O1–O6” (marked by red dots in Fig. 1A). Sequence analysis of all RNA segments from H10 and F165 suggested the presence of two viruses. The first was distantly related to Tombusviridae. It comprised RNAs T1 and T2 and was named “Leptomonas pyrrhocoris tombus-like virus 1” (hereafter, “LeppyrTLV1”). The second virus comprising RNAs O1–O6 could not be associated with any of known viral groups and was named “Leptomonas pyrrhocoris ostravirus 1” after the city of Ostrava, where it was discovered (hereafter “LeppyrOV1”). PCR tests confirmed the presence/absence of assignments made by S1 nuclease analysis (Table 1).

LeppyrTLV1.

The sequences of segments T1 and T2 in the strains H10 and F165 were highly similar (96.7 and 97.05% nucleotide identity, respectively). RNA-T1 contained two overlapping ORFs with predicted proteins of 850 and 515 aa (Fig. 1 B and C). For ORF1, a BLAST search in the National Center for Biotechnology Information (NCBI) nonredundant protein database did not yield any hits. ORF2 showed clear homology to viral RDRP (cd01699 in the NCBI Conserved Domain Database, CDD) with closest relationships to positive-strand RNA viruses of the Tombusviridae/Nodaviridae group (1). The two ORFs showed an overlap of 880 nt (Fig. 1C). A putative slippery sequence, UUUUUUA, was found 6 nt into the overlap, followed by a 127-nt hairpin 6 nt further. Both elements are typical of the −1 ribosomal frameshift of various viruses (43–45). These data suggest that the RDRP of this virus arises through the synthesis of an N-terminal frameshifted protein. While typical Tombusviridae encode RDRPs translated as a C-terminal extension of an upstream ORF by stop-codon read-through (46), several examples of −1 ribosomal frameshifting have been reported recently (1, 47, 48). RNA-T2 encoded a single ORF (ORF3) with a predicted protein of 455 aa (Fig. 1B), for which no homologs were identified in BLAST database searches.
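To illustrate the kind of motif described above, the following minimal Python sketch scans an RNA sequence for the canonical −1 frameshift heptamer of the form X XXY YYZ (three identical bases, then AAA or UUU, then A, C, or U). It is not the procedure used in this study: the function name `find_slippery_sites`, the regular expression, and the toy input are assumptions made for illustration, and the scan ignores both the reading frame and any downstream stem-loop.

```python
import re

# Canonical -1 frameshift heptamer X XXY YYZ (an assumption of this sketch):
#   X = any base repeated three times, Y = A or U repeated three times,
#   Z = A, C, or U. A lookahead is used so overlapping candidates are reported.
SLIPPERY = re.compile(r"(?=(([ACGU])\2\2(AAA|UUU)[ACU]))")


def find_slippery_sites(rna):
    """Return (0-based position, heptamer) for every candidate slippery site."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1)) for m in SLIPPERY.finditer(rna)]


# Toy sequence (hypothetical, not taken from LeppyrTLV1).
print(find_slippery_sites("GCGCUUUUUUAGCGC"))  # [(4, 'UUUUUUA')]
```

The heptamer UUUUUUA reported for RNA-T1 fits this pattern. A real analysis would additionally require the heptamer to sit in the correct frame relative to the upstream ORF and would evaluate the downstream secondary structure separately.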

Neither RNA T1 nor T2 exhibited conserved terminal sequences, which are also absent in both Tombusviridae and Nodaviridae (49, 50). Typically tombusviruses are monopartite, and the members of the related family Nodaviridae have two segments (49, 51). However, recent studies have shown remarkable variation within both groups (1).

Phylogenetic reconstruction using RDRP sequences placed LeppyrTLV1 within a clade distantly related to Tombusviridae, which usually infect plants (Fig. 1D). This clade includes viruses from invertebrates including parasitic nematodes, terrestrial myriapods, bivalves, cephalopods, freshwater crustaceans, and gastropods (Fig. 1D and Table S2) (1). Pyrrhocoris apterus, the firebug host of Leptomonas pyrrhocoris, is known to feed on the corpses of invertebrates (52), suggesting this as a possible route of acquisition.

Endogenous viral element related to LeppyrTLV1.

BLAST searches against the genome assembly of Leptomonas pyrrhocoris H10 (53) revealed that the ORF H10_02_0010 at the rightmost end of chromosome 2 is homologous to the LeppyrTLV1 RDRP (Fig. S1A). Similar to the RNA-T1 of LeppyrTLV1, an overlapping ORF, H10_02_0020, was found immediately upstream. The overlap contained a potential slippery sequence, GGGAAAU, although we did not detect a stem-loop element thereafter (Fig. S1). The ORF H10_02_0010 and the LeppyrTLV1 RDRP shared 38% overall amino acid identity, including conservation of key RDRP motifs (Fig. S1D). Whole-transcriptome data for Leptomonas pyrrhocoris (53) confirmed transcription of both ORFs. No homology was detected between the ORF1 of LeppyrTLV1 and the predicted chromosomal protein H10_02_0020. We considered the two ORFs of chromosome 2 to be an endogenous viral element (EVE) related to LeppyrTLV1 and named it “LeppyrTLV-EVE1.”

PCR tests with primers specific to LeppyrTLV-EVE1 RDRP revealed its presence in four additional European isolates (P59, LP, PP1, and PP2), all of whose sequences were identical (Tables S3 and S4). In contrast, this EVE1 region differed by 180 nt substitutions (and 84 indels) from the corresponding part of the LeppyrTLV1 RDRP, while the TLV1 RDRP sequences of strains H10 and F165 differed by only seven nucleotide substitutions. The similarity between EVE1 and TLV1 suggests that a TLV1-like RNA was captured via reverse transcription and integration into the Leptomonas pyrrhocoris genome. EVEs occur frequently in evolution and are thought to be mediated primarily by reverse transcriptases encoded in host retroposons (54–56). Indeed, a number of telomere-associated transposable element (TATE) and spliced leader-associated (SLAC) retroelements have been identified in the Leptomonas pyrrhocoris genome (53), including one located immediately upstream of the LeppyrTLV-EVE1 (Fig. S1A). The high level of sequence divergence with LeppyrTLV1 points to a relatively ancient origin of EVE1, perhaps predating the dispersal of Leptomonas pyrrhocoris across Europe (42).
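Substitution and indel counts of the kind quoted above can in principle be derived from a pairwise alignment. The sketch below is a hedged illustration only: the function `compare_aligned`, the gap character, and the toy sequences are hypothetical, the alignment is assumed to have been produced elsewhere, and gaps are counted per alignment column rather than per indel event, which may differ from the counting used in this study.

```python
def compare_aligned(seq_a, seq_b, gap="-"):
    """Count substitutions and gapped columns in two pre-aligned sequences.

    Percent identity is computed over ungapped columns only.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = substitutions = gap_columns = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information
        if a == gap or b == gap:
            gap_columns += 1
        elif a == b:
            matches += 1
        else:
            substitutions += 1
    compared = matches + substitutions
    identity = 100.0 * matches / compared if compared else 0.0
    return {
        "substitutions": substitutions,
        "gap_columns": gap_columns,
        "identity_pct": identity,
    }


# Toy alignment (hypothetical sequences, not EVE1 or TLV1).
print(compare_aligned("ACGT-ACGT", "ACCTTAC-T"))
```

For the toy pair this reports one substitution and two gapped columns; applying the same idea to real data would require an alignment of the EVE1 region against the corresponding part of the TLV1 RDRP.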

LeppyrOV1.

The six RNAs O1–O6 of strains H10 and F19 (Fig. 1A) were initially viewed as “satellite” RNAs of LeppyrTLV1. However, several observations suggested that they comprise a separate virus. First, unlike TLV1 RNAs T1 and T2, the termini of RNAs O1–O6 share common sequences: AAAGAAAAAA at the 5′ and ATGAGTTT at the 3′ ends (defined in the presumptive protein-coding strand orientation) (Fig. 2A). Conserved terminal sequences are known to participate in the replication of viruses and often are defining features of viral families (57). Second, in all strains the ratio of RNAs T1 to T2 was relatively constant; the same was true for RNAs O1–O6. However, the overall ratio of both RNA groups was substantially different.
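Shared terminal sequences such as those described above can be checked with a simple common-prefix and common-suffix comparison. The following sketch is illustrative and is not the method used in this study: the function `shared_termini` and the three toy segments are hypothetical, with only the terminal motifs AAAGAAAAAA and ATGAGTTT taken from the text, and the segments are assumed to be full length and in the same orientation.

```python
from os.path import commonprefix  # character-wise longest common prefix


def shared_termini(segments):
    """Return the longest 5' and 3' sequences shared by all segments."""
    segs = [s.upper() for s in segments]
    five_prime = commonprefix(segs)
    # Reverse every segment so the common suffix becomes a common prefix.
    three_prime = commonprefix([s[::-1] for s in segs])[::-1]
    return five_prime, three_prime


# Toy segments: real termini from the text, invented interiors.
toy = [
    "AAAGAAAAAA" + "CGGATC" + "ATGAGTTT",
    "AAAGAAAAAA" + "GCCTAG" + "ATGAGTTT",
    "AAAGAAAAAA" + "TGCAGA" + "ATGAGTTT",
]
print(shared_termini(toy))  # ('AAAGAAAAAA', 'ATGAGTTT')
```

Segments with incompletely sequenced ends would shorten or abolish the shared termini found this way, so missing ends need to be handled before drawing conclusions.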

Leptomonas pyrrhocoris ostravirus 1, a unique virus from Leptomonas pyrrhocoris. (A) Genome structure of LeppyrOV1, showing shared terminal sequences and a single ORF per segment (squiggle marks an incompletely sequenced end). The location of an RDRP domain predicted on RNA-O3 by CDD search, PHYRE2, and HHpred software is shown. (B) Multiple alignments of the LeppyrOV1 putative RDRP with those of Picorna-, Flavi-, and Caliciviridae. Identical residues are shown in red; similar residues are shown in blue. Amino acid motifs, typically found in viral RDRPs, are highlighted in yellow.

Segments O1–O6 each contained a single ORF, and conventional BLAST searches did not yield any homologs for the corresponding hypothetical proteins. However, search algorithms focused on both structural and sequence homology revealed a putative RDRP motif in the predicted 1,315-aa protein encoded by segment O3 (Fig. 2A), albeit with modest statistical support (NCBI CDD, amino acids 767–870, e = 0.89; PHYRE2, amino acids 684–874, confidence = 56%; HHpred, amino acids 693–874, confidence = 89.7%) (58, 59). Within this region we identified conserved viral RDRP motifs responsible for catalytic activity and ribonucleotide selectivity (Fig. 2B) (60–62). Analysis of the base frequencies of codon third positions of the viral ORFs showed significant differences between TLV1 and OV1 and even greater differences between these viruses and the nuclear genome of Leptomonas pyrrhocoris (Table S5).
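The codon third-position comparison mentioned above rests on a simple tally, sketched below in Python. This is a minimal illustration under stated assumptions: the function `third_position_freqs` and the toy ORF are hypothetical, the ORF is assumed to start in frame at its first base, and no statistical test is included because the text does not specify which one was used.

```python
from collections import Counter


def third_position_freqs(orf):
    """Return the A/C/G/T frequencies at codon third positions of an ORF."""
    orf = orf.upper().replace("U", "T")  # accept RNA-style input
    usable = len(orf) - len(orf) % 3     # ignore a trailing partial codon
    thirds = orf[2:usable:3]             # every third base, starting at index 2
    counts = Counter(b for b in thirds if b in "ACGT")  # skip ambiguous bases
    total = sum(counts.values())
    if total == 0:
        return {b: 0.0 for b in "ACGT"}
    return {b: counts[b] / total for b in "ACGT"}


# Toy ORF (hypothetical): codons ATG, GCC, TAA have third bases G, C, A.
print(third_position_freqs("ATGGCCTAA"))
```

Frequencies computed this way for each viral ORF and for host nuclear genes could then be compared with a test of the analyst's choosing.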

Thus, we conclude that RNAs O1–O6 comprise a previously undescribed virus, Leptomonas pyrrhocoris ostravirus 1 (LeppyrOV1). As yet, we have not found a trypanosomatid strain containing this virus alone, which would firmly establish its independence from LeppyrTLV1. Further studies are required to address the functional relationships between the six segments of this virus and the significance of its co-occurrence with LeppyrTLV1.

A Bunyavirus-Like Genus, “Leishbunyavirus.”

Six isolates showed the presence of dsRNAs related to previously described viruses of Leptomonas moramango (37). LepmorLBV1a and b showed features characteristic of many other bunyaviruses, including a trisegmented genome, terminal “panhandle” repeats, and sequence relatedness of the predicted RDRP and nucleocapsid proteins, and were thus assigned as species within the genus Leishbunyavirus (LBV) (37). We confirmed the presence of LepmorLBV1a and 1b in Leptomonas moramango, as well as LBV1s in the dixenous phytopathogenic Phytomonas sp. TCC231 (PTCCLBV1) and four species of Crithidia: Crithidia otongatchiensis (CotoLBV1), Crithidia abscondita (CabsLBV1), Crithidia sp. G15 (CG15LBV1), and Crithidia sp. ZM (CZMLBV1) (Fig. 3A and Table 1). PCR tests with primers complementary to the conserved regions of LBV1 RDRPs showed the presence of these viruses in Crithidia sp. C4 and Crithidia pragensis as well (Table 1). These LBV1-positive strains showed three dsRNAs, except PTCCLBV1 which exhibited only two (Table 1). We sequenced all segments of CotoLBV1 and CabsLBV1 and the largest segment (completely or partially) of the others (Table 1 and Tables S3 and S4).

Leishbunyaviruses. (A) Viral dsRNA from Crithidia otongatchiensis (lane 1), Crithidia abscondita (lane 2), Crithidia sp. G15 (lane 3), and Crithidia sp. ZM (lane 4). The left lane shows a 1-kb DNA ladder. (B) Genome structure of LBVs. The sizes of segments and their various features (except for terminal complementary sequences) are shown in proportion. EN, endonuclease domain. Orange, teal, and yellow labels in the M segment stand for the signal peptide, glycosylation site(s), and transmembrane domain, respectively. (C) Negative-stain transmission electron micrographs of the virus particle isolated from Crithidia otongatchiensis. (Scale bar, 100 nm.) (D) Maximum likelihood phylogenetic tree based on RDRP amino acid sequences. Numbers at the branches indicate Bayesian posterior probability and maximum likelihood bootstrap supports, respectively; those having a Bayesian posterior probability value of 1.0 and maximum likelihood bootstrap support of 100% are marked with black circles. (The scale bar indicates the number of substitutions per site.) The tree was rooted at the midpoint. Abbreviations and GenBank accession numbers are given in Tables S2–S4. The pictograms describing viral hosts are as in Fig. 1.

Sequence features and coding potential of LBV1s.

Prototypic bunyaviruses bear three RNA segments, termed “large” (L, 7–12 kb), “medium” (M, 3.2–4.9 kb), and “small” (S, 1–3 kb), encoding RDRP, envelope glycoproteins, and nucleocapsid, respectively (57, 63). The corresponding segments in LBV1s were considerably shorter: 6–6.3 kb (L), 1.0–1.9 kb (M), and 0.7–1.0 kb (S) (Table 1). Within each completely sequenced LBV1 segment we identified a single large ORF (Fig. 3B), in contrast to many bunyaviruses, which can encode multiple ORFs on the M and S segments (57). Bunyaviral RNA segments are typically flanked by panhandle inverted repeats mediating key steps of virus replication, transcription, and translation (64). Although the methods used here did not invariably yield full-length sequences, we were often able to identify panhandles in all fragments. In the L and M segments, we identified the sequence ACACAAAG at the 5′ end (as defined by the viral sense orientation) and the complementary sequence CTTTGTGT at the 3′ end. These terminal eight nucleotides are typically found in all viruses belonging to the family Phenuiviridae (Table S6).
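The complementarity of the two terminal octamers can be verified directly: CTTTGTGT is the reverse complement of ACACAAAG. The short Python sketch below shows that check and a simple way to measure how far such a panhandle extends. The helper names `reverse_complement` and `panhandle_length`, the `max_len` limit, and the toy segment interior are hypothetical, and only perfect Watson-Crick pairing is considered.

```python
# Complement table for DNA-style letters; U is treated like T.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence (DNA letters)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def panhandle_length(segment, max_len=20):
    """Count terminal bases that pair perfectly between the 5' and 3' ends."""
    seg = segment.upper().replace("U", "T")
    limit = min(max_len, len(seg) // 2)
    n = 0
    # Base n from the 5' end must pair with base n from the 3' end.
    while n < limit and seg[n] == reverse_complement(seg[-1 - n]):
        n += 1
    return n


print(reverse_complement("ACACAAAG"))  # CTTTGTGT

# Toy segment: termini from the text, invented interior.
toy_segment = "ACACAAAG" + "GATTACAGGA" + "CTTTGTGT"
print(panhandle_length(toy_segment))  # 8
```

Real panhandles may tolerate mismatches or G-U pairs, which this strict comparison would report as the end of the paired region.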

Various database searches (BLAST/CDD, PHYRE2, and HHpred) with the predicted proteins from the four sequenced M segments returned no hits. However, the predicted M proteins displayed a signal peptide as well as varying numbers of transmembrane domains: two in CabsLBV1, one in CotoLBV1, and zero or one in LepmorLBV1 (CCTOP, TMpred, and TMHMM algorithms). N-glycosylation sites were predicted in CabsLBV1 and CotoLBV1 (Fig. 3B). These analyses suggest that LBV1s, much like other bunyaviruses, are able to exploit cellular secretory systems for glycoprotein synthesis and virion assembly (66, 67). Indeed, purified LBV1 virions visualized by negatively stained transmission electron microscopy displayed the typical envelope with surface projections or spikes spread evenly along its surface (Fig. 3C).

The predicted S segment proteins did not yield compelling BLASTP hits. However, PHYRE structural homology searches showed the similarity of those proteins from CabsLBV1 and LepmorLBV1b to the nucleocapsid proteins of Toscana- and Punta Toro viruses (57.2–89.3% confidence). Alignment of the predicted leishbunyaviral S proteins with the nucleocapsid proteins of other bunyaviruses revealed several universally conserved amino acid motifs (Fig. S2B). Hence, we concluded that LBV1 S segments encode nucleocapsid proteins.

Phylogenetic analysis of LBV1s suggests classification as a family Leishbunyaviridae within the Bunyavirales.

RDRP-based phylogenetic trees showed that the LBV1s formed a well-supported clade separate from other major Bunyavirales groups (Fig. 3D). The closest family was the Phenuiviridae, consistent with the similarities noted earlier in the terminal panhandle elements (Table S6). Many Phenuiviridae have been reported from insects and other arthropods, and viruses within the genus Phlebovirus are transmitted by the same sand fly species as Leishmania (68). However, our data show that LBV1s are far more ancient (Fig. 3D). As the divergence of the LBV1-containing clade from other bunyaviral families is comparable to or greater than other bunyavirus interfamilial divergences, we propose that this clade be recognized as a family, termed Leishbunyaviridae.

Identification of LBVs within metatranscriptomic viral surveys.

Interestingly, BLAST searches with trypanosomatid LBV1 RDRPs identified several hits in the sequences from metatranscriptomic virus-hunting surveys. These included Huangshi Humpbacked Fly virus (HHFV), Wuhan Spider virus (WSV) (69), Hubei bunya-like virus 5 (HBLV5) from a mix of dipterans, Hubei bunya-like virus 6 (HBLV6) from a horse leech (1), and two from honey bees—Apis bunyavirus 1 (ABV1) and Duke bunyavirus (DuBV) (70). On the reconstructed phylogenetic tree all these viruses from metatranscriptomes intermingled with the trypanosomatid LBVs, with high statistical support (Fig. 3D).

Recently, it was proposed that bunyaviruses originated within insects (71, 72), and one explanation for the interdigitation observed here could be multiple transitions of these viruses between arthropods and trypanosomatids. An alternate model is that the insect metatranscriptomic leishbunyaviruses arose not from the insects themselves but from their associated microbiota (73), given that trypanosomatids are well-known parasites of arthropods (14, 74). Thus, we searched the LBV-containing metatranscriptomic sequence read archives (SRAs) for trypanosomatid signatures, a challenging task given the relatively low number of viral reads in these pooled datasets. Nonetheless, BLASTN searches of assembled contigs revealed several abundant trypanosomatid transcripts, such as 18S rRNA or paraflagellar rod proteins (Tables S3 and S4), in the HHFV-, WSV-, HBLV5-, and ABV1-containing SRAs (data for DuBV were not available). Indeed, phylogenetic analysis of these putative transcripts revealed affinities to various trypanosomatids. Based on these data, ABV1 could speculatively be associated with Lotmaria passim (subfamily Leishmaniinae), HHFV and HBLV5 with subfamily Strigomonadinae, and HBLV6 and WSV with the genera Trypanosoma and Herpetomonas, respectively (Tables S2–S4). While the co-occurrence of reads for leishbunyaviruses and trypanosomatids in the metatranscriptomic read sets is not definitive proof that these flagellates actually contained viruses, we consider this a plausible explanation.

These findings provide support for the model postulating a trypanosomatid microbiota origin of LBVs emerging from the metatranscriptome datasets. If borne out, this suggests that, instead of multiple origins from insects, trypanosomatid LBV1s may have originated less frequently and perhaps only once. Consistent with the latter, a significant, albeit imperfect, level of phylogenetic congruency can be seen between trypanosomatid LBVs and nuclear genome phylogenies (Fig. S3). However, the possibility of multiple acquisitions of LBVs by trypanosomatids from insects or other trypanosomatids cannot be formally excluded, given that trypanosomatid LBV1s bear hallmarks of infectious bunyaviruses and that mixed trypanosomatid infections have been reported (75–78). Currently we favor a model with a single transition of an ancestral insect virus to a trypanosomatid, but further investigations will be required to rigorously establish this hypothesis.

Narnaviridae.

In Leptomonas seymouri and two isolates of Phytomonas serpens we documented the presence of dsRNA (fragments of 2.9 + 1.5 kb and 3.8 kb, respectively) (Fig. 4 A and B and Table 1), in agreement with previous findings (38–40) that these species bear Leptomonas seymouri narna-like virus 1 (LepseyNLV1) and Phytomonas serpens narnavirus 1 (PserNV1).

Narnaviruses of trypanosomatids. (A, Upper) Viral dsRNA in two subcultures of Phytomonas serpens (isolate 9T). Lane 1 is from the University of California, Riverside and lane 2 is from the Institute of Parasitology in Budweis, Czech Republic. The lane marked “M” is a 1-kb DNA ladder. (Lower) Total RNA was used as a loading control. (B, Upper) Viral dsRNA in four subcultures of Leptomonas seymouri ATCC30220. Lanes 1 and 4 are the original ATCC culture; lanes 2 and 3 are substrains 2003WT and 294-1993VB (Rutgers University), respectively; lane 5 is a culture from the Zoological Institute of the Russian Academy of Sciences. Leishmania guyanensis (Lgy, strain M4147) bearing the 5.3-kb LRV1 served as a positive control. The lane marked “M” is a 1-kb DNA ladder. (Lower) Total RNA was used as a loading control for virus-negative substrains. (C) Genome structure of LepseyNLV1 and PserNV1. ORFs for different proteins are shown in different colors. The stem-loop structures and terminal complementary sequences are indicated. Squiggles mark incompletely sequenced ends. (D) Maximum likelihood phylogenetic tree of Narnaviridae based on RDRP amino acid sequences. LepseyNLV1 and PserNV1 are indicated by a trypanosomatid symbol. Numbers at the branches indicate Bayesian posterior probability and maximum likelihood bootstrap supports, respectively; those having a Bayesian posterior probability value of 1.0 and maximum likelihood bootstrap support value of 100% are marked with black circles. (The scale bar indicates the number of substitutions per site.) The tree was rooted with the sequences of Leviviridae. Abbreviations and GenBank accession numbers are given in Tables S2–S4. The pictograms are as in Fig. 1.

Sequence features of trypanosomatid narnaviruses.

The genome of PserNV1 was monosegmented (Fig. 4A, lane 2), and its RNA contained a single ORF for RDRP. In contrast, LepseyNLV1 displayed a bipartite organization (Fig. 4B, lanes 1, 4, and 5) with RNA1 encoding RDRP and RNA2 comprising two overlapping ORFs with no homologs identified in database searches. The region of overlap displayed several structural features associated with +1 ribosomal frameshifting, including a hairpin preceded by a slippery sequence (Fig. S4C), suggesting these two ORFs may be expressed as a fusion protein.

PserNV1 termini were determined by ligating an adapter followed by sequencing across the adapter–virus junction. They revealed features common for Narnaviridae (79): short terminal complementary sequences 5′-ACGC ... GCGT-3′ and putative subterminal hairpin structures (Fig. 4C and Fig. S4A). Intriguingly, the very 5′ end of the viral RNA showed similarity to the spliced leader (SL) of Phytomonas serpens (GenBank X87137). The SL is a 39-nt capped sequence added to the 5′ end of every trypanosomatid mRNA by trans-splicing (80, 81). However, the PserNV1 SL-related sequence lacked the first five nucleotides and had three internal mismatches (Fig. S4B), rendering it unlikely to be functional based on current knowledge of SL function (82). Thus, in the past the PserNV1 may have “snatched” the host’s SL, substituting it for the original terminus. In LepseyNLV1 we did not determine the terminal sequences explicitly; however, typical narnaviral subterminal hairpins were predicted in the RNA2 assembly (Fig. 4C).

Phylogeny and evolutionary origins.

RDRP-based phylogenetic reconstruction showed LepseyNLV1 and PserNV1 to be the closest relatives, forming a well-supported clade along with the prototypical narnaviruses Saccharomyces cerevisiae 20S and 23S viruses (Fig. 4D) as well as the oomycete-infecting Phytophthora infestans RNA virus 4 (PiRV4) (83). Interestingly, we identified a metatranscriptomic virus from the fly Teleopsis dalmanni (84), whose transcriptome assembly also contained two contigs (GBBP01074304 and GBBP01074305) corresponding to trypanosomatid 18S rRNA genes. We were not able to closely associate these contigs with known trypanosomatid sequences, suggesting that this virus may belong to a yet-uncharacterized lineage. Thus, similar to leishbunyaviruses, the insect metatranscriptomic narnavirus may have arisen from its trypanosomatid microbiota.

As inferred earlier, Ourmiavirus and ourmia-like viruses (family Ourmiaviridae) clustered preferentially with Narnavirus, while Mitovirus (another genus of Narnaviridae) was sister to the clade comprising those three groups (85, 86). Previous studies suggested that narnaviruses were ancestral parasites of fungi, which later switched to other organisms (87, 88). Yeasts are a normal component of the insect intestine, where they could encounter trypanosomatids (89, 90).

While narnaviruses are typically monosegmented, Ourmiaviridae typically contain several segments (87, 88). LepseyNLV1, with its two segments, exhibits an independently evolved genome organization that is intermediate between those of Narnaviridae and Ourmiaviridae. While definitive evidence that the two segments represent a single virus is lacking, we consider such an association likely, given that both segments are maintained or lost in parallel as described below.

Viral Stability.

In the course of our studies we observed that upon in vitro cultivation some viruses could be occasionally lost. PserNV1 was originally found in the 9T strain from the Czech Republic; however, the same strain maintained elsewhere lacked it (Fig. 4A, lanes 1 and 2). Similarly, while LepseyNLV1 occurred in the ATCC30220 isolate, it was absent in the same strain and in a transfectant derivative obtained from another source (Fig. 4B, lanes 1–3). However, it persisted during continuous cultivation (∼300 passages) in the Zoological Institute of the Russian Academy of Sciences (Fig. 4B, lanes 4 and 5). For leishbunyaviruses, we noticed a gradual decrease of viral dsRNA levels in Crithidia otongatchiensis over 6 mo of continuous cultivation and their disappearance from Crithidia pragensis and Leptomonas moramango after 2 wk of passaging (although low levels could be detected by RT-qPCR). However, no changes in dsRNA abundance were seen for CabsLBV1, CZMLBV1, CG15LBV1, or PTCCLBV1.

Several nonexclusive mechanisms might explain these observations. Some viruses may be intrinsically unstable or lost because the selective pressures on their trypanosomatid hosts may differ in vitro and in vivo. Alternatively, the culture may be heterogeneous in terms of viral presence, and virus-free cells may outcompete their infected counterparts. Our data collectively suggest that caution is warranted when interpreting viral absence in cultured parasites. Serendipitously, virus-free derivatives may serve as isogenic tools for probing potential roles for viruses in parasite biology, as for Leishmania guyanensis LRV1 (6, 91). Indeed, the coincidental loss of LepseyNLV1 RNA1 and RNA2 provides some support for a functional association (Fig. 4B).

Conclusions

Here, we conducted a survey of RNA viruses in two groups of Trypanosomatidae: insect-restricted (monoxenous) relatives of Leishmania (Crithidia and Leptomonas, subfamily Leishmaniinae) and plant-infecting Phytomonas. This greatly expanded the known diversity of RNA viruses in these flagellates, showing that trypanosomatids can be infected by various unrelated viruses: Totiviridae, Narnaviridae, Bunyavirales, tombus-like viruses, and a previously unknown virus that was termed “Ostravirus” and is currently defined by LeppyrOV1, whose RDRP was so divergent that it escaped generic BLAST searches. We also documented EVE formation in trypanosomatids (LeppyrTLV-EVE1), presumably enabled by the activity of the endogenous retroposons.

One interesting question is whether the trypanosomatid viruses can be shed and infect other parasites. Current data suggest that LRV1, like the great majority of other Totiviridae, is not shed (92). Narnaviruses, by virtue of lacking either a capsid or an envelope, are transmitted only vertically or during mating (4). However, the presence of an extra segment in LepseyNLV1 (Fig. 4B) might be associated with transmission, as in related ourmiaviruses (85). Similarly, the two Leptomonas pyrrhocoris viruses OV1 and TLV1 have sufficient coding capacity for transmission. Last, for several trypanosomatid LBV1s we visualized the presence of enveloped virions bearing surface proteins (Fig. 3C), the hallmarks of infectious bunyaviruses. These fascinating questions will be addressed in future studies.

Phylogenetic relationships of relevant trypanosomatid taxa permit a broader view of the origins and evolution of their viruses (Fig. 5). First, Leptomonas pyrrhocoris appears to be a hotbed for viral discovery, with two previously unreported viruses (LeppyrTLV1 and LeppyrOV1) and the presence of an EVE. Second, narnaviruses, LBV1s, and LRV1/2s appear to be distributed over the trypanosomatid phylogenetic tree in a patchy manner, with many seemingly virus-free lineages interspersed with ones bearing diverse viruses (Fig. 5). This poses a number of challenges. If one postulates the presence of virus in the common ancestor of a particular group (marked by arrows in Fig. 5), viral loss must have occurred independently in a great many subsequent taxa. Alternatively, if one assumes the common ancestor to be virus-free, independent viral acquisitions must have occurred. The chances of this are speculative at best, perhaps being more likely for the viruses showing increased likelihood for infectivity (LBV1s and, conceivably, OV1, TLV1, and NLV1). Superimposed upon or alternative to this is the possibility of viral exchange via infectious shedding during coinfections, as mixed trypanosomatid infections are quite frequent (see above). Importantly, these latter two processes would be expected to further blur signs of virus–parasite coevolution. Thus, it is remarkable that for LRV1/2 (93) and, to some extent, for LBVs (Fig. S3), the phylogenetic trees for the parasites and their viruses show significant congruency. This suggests that there must be some constraints on horizontal viral transmission, if present, especially among kingdoms.

Fig. 5. Overview of trypanosomatid relationships and viruses. The evolutionary tree shows the maximum likelihood phylogenetic tree of trypanosomatids reconstructed using 18S rRNA and gGAPDH genes, over which the absence or presence of viruses is marked (see graphical legend). Arrows denote hypothetical acquisition of viruses under the assumption of a single origin in the common ancestor. Maximal bootstrap supports are marked by filled circles, and bootstrap supports over 70% are denoted by open circles. The scale bar indicates the number of substitutions per site.

Notably, in our survey we did not find any LRV-related Totiviridae (Fig. 5 and Table 1), although numerous Leishmaniinae were tested. This suggests that these viruses were acquired when vertebrates became incorporated into the life cycle of Leishmaniinae. Given the elevated pathogenicity of LRV1-bearing Leishmania to the vertebrate host, viral acquisition could be viewed as beneficial for the parasite, if one equates pathogenicity with increased evolutionary fitness. However, most Leishmania except Viannia and a handful of Leishmania major or Leishmania aethiopica isolates lack LRV1/2 (94). This implies that if LRV1/2 presence was indeed beneficial early in evolution, it became less important in modern lineages and/or was replaced by other mechanisms contributing to virulence, such as type I IFN induction (29).

Variation in the RNAi pathway may contribute to the observed patchiness in viral distribution, as this pathway acts as an antiviral defense mechanism in many species (95). In agreement with this, the RNAi pathway (believed to be ancestral to all eukaryotes) is absent in Phytomonas spp., Leptomonas seymouri, and LRV2-bearing Leishmania aethiopica and Leishmania major (23, 96, 97). The RNAi pathway may be especially important for narnaviruses, which are presumably defenseless because of the lack of capsids. However, LRV1-containing Leishmania guyanensis and Leishmania braziliensis have a highly active RNAi pathway (97), and accordingly LRV1 has mastered the ability to coexist in the face of RNAi attack, although under some circumstances RNAi can lead to its elimination (27). In addition, numerous Crithidia and Leptomonas spp. retain the RNAi pathway (96). It is thus possible that RNAi plays only a weak role in the evolutionary distribution of trypanosomatid viruses.

Several studies have established a role for trypanosomatid viruses in the vertebrate host (6, 27–29, 40, 98). Our studies now suggest that potential role(s) for trypanosomatid viruses in parasite biology within their insect hosts should be considered. While LRV1 and other Totiviridae have been implicated in vertebrate pathogenicity (6, 7, 35), there are no direct data concerning the influence of this virus on the relationships between Leishmania and sand flies. Given that Toll-like receptors were first discovered in insects (99), and TLR3 specifically was implicated in LRV1 pathogenicity (6), this possibility clearly merits attention. Alternatively, viruses may invade and persist as mere parasitic elements rather than providing any advantage to their trypanosomatid hosts. Resolution of these questions may benefit from the serendipitous identification of virus-free isolates of Phytomonas serpens and Leptomonas seymouri and their use in studies assessing potential functional roles.

Previously unidentified viruses were found in considerable numbers in the species/isolates tested (Table 1). The actual diversity of trypanosomatids is not known, but at least 600 species have already been described (100). In addition, the example of Leptomonas pyrrhocoris, with its multiple isolates showing variation in viral presence and composition, illustrates another level of diversity. Indeed, as noted in Fig. 5, there are several trypanosomatid lineages for which VLPs have been reported but not studied by modern molecular methods. Furthermore, in various invertebrate metatranscriptomes we found several viruses possibly originating from their trypanosomatid microbiota. Such metatranscriptomes may also provide important new information about the diversity of trypanosomatids themselves. Taken together, these findings suggest that a great number of viruses remain to be found in this important group of parasites.

Methods

Isolation of Viral RNA and Primary Screening.

Total RNA was isolated from trypanosomatid cultures using the TRI reagent (MRC Inc.) as described previously (101). For primary screening, 50 µg of total RNA from each sample were treated with RNase-free DNase I (New England Biolabs) and nuclease S1 from Aspergillus oryzae (Sigma-Aldrich) (41). The resulting dsRNA was resolved on 0.8% agarose gel and stained with ethidium bromide. For preparative isolation 400 µg of total RNA from virus-positive cultures were digested with DNase I, followed by ssRNA precipitation by LTS solution (2 M LiCl, 150 mM NaCl, 15 mM Tris HCl, pH 8.0) at 4 °C overnight as described previously (102). The ssRNA fraction was removed by centrifugation for 30 min at 20,000 × g at 4 °C, and dsRNA was precipitated by EtOH and visualized as above. Individual dsRNA bands were gel-purified using Zymoclean Gel RNA Recovery Kit (Zymo Research).

Viral dsRNA Amplification, Cloning, and Sequencing.

Gel-extracted dsRNA was polyadenylated at both 3′ ends using Escherichia coli Poly(A) Polymerase (New England Biolabs) and then purified on a PCR clean-up column (Thermo Fisher Scientific) according to the manufacturer’s protocol. Next, polyadenylated dsRNA was reverse-transcribed using the Transcriptor First Strand cDNA Synthesis Kit (Roche) and an anchored-oligo (dT) primer QD2-T20 5′-ggcaattaaccctcactatagaattcgttcgatctttttttttttttttttttt-3′ (modified from ref. 103). To prevent renaturation of the cRNA strands, DMSO was added to a final concentration of 7.5%, and the residual RNA was hydrolyzed with 0.1 M NaOH. The cDNA was then reannealed at 65 °C for 90 min followed by gradual cooling to 4 °C (102). The cDNA was purified on PCR clean-up columns and amplified using Phusion High-Fidelity DNA Polymerase (Thermo Fisher Scientific) with primer QD2, 5′-tcactatagaattcgttcgatc-3′, that anneals to the fragment introduced by QD2-T20. The first PCR step that included end repair (72 °C for 5 min) was followed by the manufacturer’s recommended cycling conditions: 98 °C for 10 s, 55 °C for 30 s, and 72 °C for 40 s/kb. Obtained PCR products were cloned into the pTZ57R vector (Thermo Fisher Scientific) and sequenced by primer walking. For the analysis described above, we were unable to obtain enough dsRNA from Crithidia sp. C4 and Crithidia pragensis. In these cases, the partial RDRP gene (∼900 bp) was amplified using degenerate primers designed to amplify known LBV1s (LeiBunyaF: 5′-ttykcvacnttcaagaaragcac-3′ and LeiBunyaR: 5′-ccagartcatcwgadgadaccat-3′) (ref. 37 and this work), and the products were cloned into the pTZ57R vector and sequenced.
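
The relationship between the two adapter primers can be checked with a short script. The following is a minimal sketch, not part of the original protocol: it confirms that QD2 lies inside the 5′ portion of QD2-T20 and computes the extension time at 40 s/kb. The poly(T) tail is assumed to be 20 residues, as the name "T20" implies; the helper names and the 2,000-bp amplicon length are hypothetical and serve only as illustration.

```python
# Primer sequences are taken from the Methods text. The poly(T) tail is
# written as 20 T's, an assumption based on the primer name "QD2-T20".
QD2_T20 = "ggcaattaaccctcactatagaattcgttcgatc" + "t" * 20
QD2 = "tcactatagaattcgttcgatc"


def anchor_offset(adapter: str, primer: str) -> int:
    """Return the 0-based position of `primer` within `adapter`, or -1 if absent."""
    return adapter.find(primer)


def extension_seconds(amplicon_bp: int, seconds_per_kb: int = 40) -> float:
    """Extension time in seconds at the stated rate (40 s/kb by default)."""
    return amplicon_bp * seconds_per_kb / 1000


offset = anchor_offset(QD2_T20, QD2)
# Everything downstream of the QD2 site should be the oligo(dT) stretch.
tail = QD2_T20[offset + len(QD2):]

print(f"QD2 starts at position {offset} of QD2-T20")
print(f"Tail after the QD2 site: {len(tail)} x 't'")
# Hypothetical 2,000-bp product, chosen only to show the arithmetic.
print(f"Extension for 2,000 bp: {extension_seconds(2000)} s")
```

Because QD2 is the same sense as the adapter sequence carried by QD2-T20, it primes on the complement of that adapter, which is present at both ends of the cDNA after polyadenylation of both 3′ ends. This is why a single primer suffices for amplification.
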
To assess the presence of the Leptomonas pyrrhocoris RNA virus, total cDNA of all Leptomonas pyrrhocoris isolates (both positive and negative as judged by gel-based assay) was amplified with primers LpTLV1F 5′-ttactcctataacggggca-3′ and LpTLV1R 5′-taaaggagcgaattctgct-3′ specific to the RDRP region (∼300 bp) of this virus and directly sequenced. Similarly, the occurrence of integrated virus in these isolates was checked by amplification using primers LpIVF 5′-cctatgcggatgcactcaa-3′ and LpIVR 5′-cttgtgcattttctatccaag-3′. PCR primers M200 5′-atggctccvvtcaargtwggmat-3′ and M201 5′-takccccactcrttrtcrtacca-3′ for the glycosomal GAPDH (gGAPDH) gene were used as an internal positive control (104). Additional methods for cultivation of trypanosomatids, phylogenetic, genomic, and transcriptomic analyses, and negative-stain transmission electron microscopy can be found in the Supporting Information.
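
Several of the primers above contain IUPAC ambiguity codes, so each one represents a pool of distinct oligonucleotides. As an illustrative sketch (not part of the original analysis), the code below computes the degeneracy of each degenerate primer, enumerates the pool, and builds a regular expression for scanning sequences. The function names are hypothetical; the code table is the standard IUPAC nucleotide notation.

```python
import re
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (lowercase, as in the Methods).
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}

# Degenerate primers as given in the Methods text.
PRIMERS = {
    "LeiBunyaF": "ttykcvacnttcaagaaragcac",
    "LeiBunyaR": "ccagartcatcwgadgadaccat",
    "M200": "atggctccvvtcaargtwggmat",
    "M201": "takccccactcrttrtcrtacca",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.lower():
        count *= len(IUPAC[base])
    return count


def expand(primer: str):
    """Yield every concrete sequence in the degenerate primer pool."""
    pools = [IUPAC[base] for base in primer.lower()]
    return ("".join(combo) for combo in product(*pools))


def to_regex(primer: str) -> re.Pattern:
    """Compile a regex matching any member of the degenerate pool."""
    parts = []
    for base in primer.lower():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return re.compile("".join(parts))


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, degeneracy {degeneracy(seq)}")
```

With these definitions the pools contain 96 (LeiBunyaF), 36 (LeiBunyaR), 72 (M200), and 16 (M201) sequences. The regular expression from `to_regex` could be used to locate candidate binding sites in an assembled contig, bearing in mind that exact pattern matching ignores the mismatches a real PCR would tolerate.
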

Acknowledgments

We thank members of our groups and D. Wang and S. Krishnamurthy (Washington University School of Medicine in St. Louis) for discussion and D. Maslov, L. Simpson, and V. Bellofatto for providing Phytomonas and Leptomonas strains. This work was supported by NIH Grants R01-AI029646 and R56-AI099364 (to S.M.B.), the Intramural Research Program of the National Library of Medicine at the NIH (I.B.R.), Grant Agency of Czech Republic Awards 17-10656S (to V.Y. and J.V.) and 16-18699S (to J.L. and V.Y.), Moravskoslezský kraj Research Initiative DT01-021358 (to V.Y. and A.Y.K.), the Russian Foundation for Basic Research Project no. 15-29-02734 and the State Assignment for the Zoological Institute AAAA-A17-117030310322-3 (to A.O.F.), and the European Cooperation in Science and Technology (COST) action CM1307 (to J.L.). Work in the V.Y. laboratory is financially supported by the Ministry of Education, Youth and Sports of the Czech Republic in the National Feasibility Program I, project LO1208 TEWEP. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Reviewers: S.A.G., University of Kentucky; M.L.N., Harvard Medical School; and L.S., University of California, Los Angeles.

The authors declare no conflict of interest.

Data deposition: The deposition and accession numbers for the sequences reported in this work are summarized in Table S3. Metatranscriptomic contigs assembled from single-read archive depositions, which cannot be deposited in GenBank because they have not been experimentally verified, are listed in Table S4.
