Abstract

Giardia duodenalis, originally regarded as a commensal organism, is the etiologic agent of giardiasis, a gastrointestinal disease of humans and animals. Giardiasis causes major public and veterinary health concerns worldwide. Transmission is either direct, through the faecal-oral route, or indirect, through ingestion of contaminated water or food. Genetic characterization of G. duodenalis isolates has revealed the existence of seven groups (assemblages A to G) which differ in their host distribution. Assemblages A and B are found in humans and in many other mammals, but the role of animals in the epidemiology of human infection is still unclear, despite the fact that the zoonotic potential of Giardia was recognised by the WHO some 30 years ago. Here, we performed an extensive genetic characterization of 978 human and 1440 animal isolates, which together comprise 3886 sequences from 4 genetic loci. The data were assembled into a molecular epidemiological database developed by a European network of public and veterinary health Institutions. Genotyping was performed at different levels of resolution (single and multiple loci on the same dataset). The zoonotic potential of both assemblages A and B is evident when studied at the level of assemblages, sub-assemblages, and even at each single locus. However, when genotypes are defined using a multi-locus sequence typing scheme, only 2 multi-locus genotypes (MLG) of assemblage A and none of assemblage B appear to have a zoonotic potential. Surprisingly, mixtures of genotypes in individual isolates were repeatedly observed. Possible explanations are the uptake of genetically different Giardia cysts by a host, or subsequent infection of an already infected host, likely without overt symptoms, with a different Giardia species, which may cause disease. Other explanations for mixed genotypes, particularly for assemblage B, are substantial allelic sequence heterogeneity and/or genetic recombination. 
Although the zoonotic potential of G. duodenalis is evident, evidence on the contribution and frequency is (still) lacking. This newly developed molecular database has the potential to tackle intricate epidemiological questions concerning protozoan diseases.

Author Summary

Giardia duodenalis is a parasite causing a gastrointestinal disease in humans, pets, livestock, and wildlife. The role of animals in human disease is unclear, because Giardia from humans and animals is morphologically indistinguishable. An international consortium of both veterinary and public health institutions built a web-based database, where molecular and epidemiological data are combined. After extensive genetic characterization, the zoonotic potential of Giardia became evident, but data on its frequency and role in epidemiology are (still) lacking. Surprisingly, mixtures of Giardia genotypes in individual hosts were frequently observed, and these have important implications for the etiology of giardiasis. Possible explanations are the uptake of mixtures of Giardia genotypes by one host, or subsequent infection of an already infected host, likely without overt symptoms, with a different Giardia species, which may cause disease. As demonstrated here for G. duodenalis, collaborative databases that integrate human and veterinary health data have the potential to tackle intricate epidemiological questions concerning parasitic diseases.

Funding: This study was partially funded by the European MED-VET-NET project contract FOOD-CT-2004-506122, and by the Dutch Food and Consumer Product Safety Authority (VWA). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Giardia is a genus of intestinal flagellates that infect a wide range of vertebrate hosts. The genus consists of six species, which are distinguished on the basis of the morphology and ultra-structure of their trophozoites [1]. Giardia duodenalis (syn. G. intestinalis, G. lamblia) is the only species found in humans, although it exhibits a wide host range being found in many other mammals. G. duodenalis is the etiological agent of giardiasis, a gastrointestinal infection in humans ranging from asymptomatic to severe diarrhea as well as chronic disease [2]. Giardiasis represents a major public health concern in both developing and developed countries [3],[4]. The economic losses, both direct and indirect, caused by this widespread parasitic infection are considerable. Children are at most risk from the clinical consequences of G. duodenalis infection, particularly those in developing countries and living in disadvantaged community settings [5]. In population- and general practitioner-based studies in The Netherlands, G. duodenalis was identified as the most important gastrointestinal parasitic pathogen [6],[7]. Paradoxically, the diagnosis of giardiasis is not routinely carried out, due to lack of awareness and the similarity of symptoms with other gastro-enteritis diseases. G. duodenalis is also of significant clinical and economic importance in livestock and pet animals [8]–[10].

Giardia has a simple life cycle comprising rapidly multiplying, non-invasive trophozoites on the mucosal surface of the small intestine, and the production of environmentally resistant cysts that are passed with the host faeces. Infectious cysts are transmitted by the faecal-oral route, either by direct contact or by ingestion of contaminated food or water [11]. Illness from this parasite arises through infection in two broad settings: outbreaks and (sporadic) endemic transmissions. Outbreaks are most frequently waterborne and caused by contamination of drinking water, although other transmission routes have been implicated as well [1],[12],[13]. One complicating factor is that the number of asymptomatic carriers, and their role in the spread of the infections, are not clear [6],[7],[12],[14].

G. duodenalis can be considered as a species complex, whose members show little variation in their morphology, yet can be assigned to seven distinct assemblages (A to G) based on genetic analysis [15]. Assemblages A and B are responsible for human infection, and are also found in a wide range of mammals. The remaining assemblages show more restricted host ranges: C and D are found in canids, E in livestock, F in cats, and G in rodents [16]. Genetic characterization has been extensively used to assess the role of animals in the epidemiology of human infection and to develop tools for tracing sources of infection. However, the zoonotic potential of G. duodenalis is still under debate, particularly the role of domestic animals. Transmission may occur from animals to humans or from humans to animals. Alternatively, humans and animals may be infected with host-adapted genotypes only. For example, transmission of Giardia from beavers to humans via drinking water was postulated [17],[18]. In endemic areas where humans and animals live closely together, transmission from humans to animals or vice versa may occur [19],[20]. Also, the existence of host-adapted Giardia genotypes has been reported [21],[22]. Until now, the majority of molecular epidemiological studies have been based on the analysis of a single marker from a limited number of isolates. Furthermore, the genetic variability and the usefulness of the different loci in identifying genotypes have not been systematically evaluated. Finally, it remains unclear to what extent allelic sequence heterozygosity (ASH) and genetic exchanges contribute to the genetic variation found in Giardia [23]. In this study the zoonotic potential of G. duodenalis is investigated at different levels of resolution (single and multiple loci on the same dataset). A G. duodenalis genotype is defined here as having zoonotic potential when it has been isolated from both human and animal sources; this definition does not take into account other epidemiological parameters (such as time and geographical origin).

A European network of public and veterinary health Institutions from 9 European countries that focuses on zoonotic protozoan parasites (the ZOOnotic Protozoa NETwork, ZOOPNET) has been established (Sprong et al. submitted) as part of MedVetNet, a European network of excellence working for the prevention and control of zoonoses and food borne diseases. The aims of ZOOPNET were (i) to harmonize the methodology for the detection and control of Giardia and Cryptosporidium, (ii) to investigate the molecular epidemiology of these infections, and (iii) to study the role of animal sources in human disease. A molecular epidemiological database was built in the course of the project, currently containing information on 2476 Giardia isolates, which encompass 3886 sequences, and on 1024 Cryptosporidium isolates, for a total of 1664 sequences. The ZOOPNET-database differs from a representative (e.g. GenBank) or a genomic (e.g. GiardiaDB) database [24], as it aims to collect epidemiological data linked to a few molecular markers from as many field isolates as possible. A field isolate can best be described as a DNA sample isolated from a human, animal or environmental source. This implies that an isolate may contain more than one G. duodenalis species or genotype. A part of the database is already publicly available (https://hypocrates.rivm.nl/bnwww/MedVetNet/). Currently, a more user-friendly web-based database is being developed, which will not only contain all the molecular epidemiological data used in this study, but will also allow public and veterinary health researchers to BLAST their sequences against the database, to perform basic phylogenetic analysis and to submit their own data to the database.

In the present study, the genetic diversity and geographic distribution of G. duodenalis of human and animal origin, and the potential for zoonotic transmission, were assessed by different molecular genotyping methods.

Methods

Origin of the isolates

Giardia isolates of human and animal origin were collected by Public and Veterinary Health Institutions from the European countries represented in the network, as well as from external research groups on a voluntary basis. Epidemiologic and molecular data were submitted using an Excel-based file, and form the basis of the information present in the database (sequences and data used for this study are available on request). Furthermore, Giardia sequences were retrieved from the GenBank database. A selection of these sequences was made using the same strategy as previously described [25]. For example, sequences that were too short to cover regions of variation within any given assemblage were used only for analysis at the level of that assemblage, but not at the level of sub-assemblage. In addition, when multiple, identical sequences from any given isolate were deposited in GenBank, only the longest available sequence was retrieved. Although GenBank sequences constitute ~45% of the database, only limited epidemiological data (mainly country and source of isolation) are available for those isolates. All molecular epidemiological data were stored and analysed in Bionumerics (Version 5.10; Applied Math, Belgium). The contents of the database (February 2009) are described in the supporting information (Text S1).

Sequence analysis

All of the G. duodenalis sequences were derived from genomic DNA. Most of the sequences were obtained by direct sequencing of PCR products (which were occasionally cloned first) amplified from faecal samples. The sequences of reference isolates originated from laboratory strains, which were grown previously in culture or passaged through suckling mice. Each isolate was characterized using one to four of the most commonly employed genetic markers, which correspond to portions of the small subunit ribosomal DNA (SSU-rDNA), beta-giardin (BG), glutamate dehydrogenase (GDH), and triose phosphate isomerase (TPI) genes [25]. All sequences were sorted by gene, assemblage, and sub-assemblage, and were aligned along each gene using previously defined references (Text S1). All of these markers, with the exception of the SSU-rDNA, have a high, though variable, degree of genetic polymorphism [25], and were used to define sub-assemblages and subtypes. Sequences that were too short, or that contained ambiguous nucleotides preventing their assignment to a specific assemblage, were excluded from further analysis [25]. Subtyping at the GDH locus was complicated by the use of different primers that amplify different portions of the gene, with only a partial overlap. In order to minimize these transitivity dilemmas, cluster analysis for each locus was performed using the Unweighted Pair Group Method with Arithmetic mean (UPGMA) and “most identical matches” as first and secondary criterion, respectively. The secondary criterion is applied when the first criterion yields two equivalent solutions.

The four markers used in this study are unlinked in the G. duodenalis genome, at least in the genome of assemblage A [22], which is a prerequisite for a multi-locus sequence typing scheme. The following G. duodenalis isolates were used as references for multi-locus sequence typing: for assemblage A, sub-assemblage AI, the axenic strains WB, Portland 1 and Ad-1 [26],[27]; for assemblage A, sub-assemblage AII, the axenic strains Bris-162, Bris-136 and KC8 [28]; for assemblage A, sub-assemblage AIII the isolate ISSGdA614 [22]; for assemblage B, sub-assemblage BIII, the strains BAH12 and Ld18 [26],[29]; and for assemblage B, sub-assemblage BIV, the strains Ad28 and Nij5 [26],[29].

Results

The zoonotic potential of G. duodenalis can be inferred by comparing the genotypes of human and animal isolates. Here, genotyping was performed at different levels of resolution. First, for each marker all sequences were assigned to specific G. duodenalis assemblages (A to G) by comparison with previously defined sequences of reference strains [15]. Second, sequences of assemblages A and B were assigned to sub-assemblages AI, AII, AIII, BIII, and BIV using single-nucleotide polymorphisms (SNPs) from reference strains which were identified previously [22]. Third, since considerable sequence heterogeneity was also found within each sub-assemblage, subtypes (AS001, AS002, AS003, etc) were assigned to groups of sequences, on the basis of their similarity [22],[30],[31]. Fourth, genotyping data from different loci were combined to perform a multi-locus analysis.

Typing at the assemblage level

The sequences from each of the four markers obtained from 2476 Giardia isolates were assigned to G. duodenalis assemblages A to G by comparison with previously defined reference strains (Text S1). The distribution of the assemblages within each source (corresponds to host or host group) was determined (Table 1). In humans (n = 1658), assemblage A (43%), B (56%) and to a much lesser extent C (0.1%), D (0.2%), E (0.2%), and F (0.2%) were found [32],[33]. All of these assemblages were also found in animals. Thus, at this very low level of resolution assemblages A to F can be considered zoonotic. The relative host range of a specific assemblage is calculated as the distribution of the sources within each assemblage (Table 2). The presented calculation does not take the absolute numbers of the sources in a population (e.g. number of cats compared to the number of humans in Europe) and the prevalence of giardiasis of each source into account. Still, assemblages C and D were mainly found in dogs (Table 2), assemblage E in livestock, F in cats and G in rodents (beavers and rats). These results are in agreement with previous findings [11],[16]. Remarkably, the host distribution of assemblage B is predominantly human and to a much lesser extent wildlife and dog (Table 2).
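The two ways of reading the same counts (assemblages within each source, as in Table 1, versus sources within each assemblage, as in Table 2) can be illustrated with a minimal sketch. The function name `percentages` and the counts below are hypothetical and are not taken from the database; they only show how the two normalizations differ.

```python
def percentages(counts, by="source"):
    """Convert isolate counts to percentages.

    counts: dict mapping (source, assemblage) -> number of isolates.
    by="source":     share of each assemblage within a source (Table 1 style).
    by="assemblage": share of each source within an assemblage (Table 2 style).
    """
    if by not in ("source", "assemblage"):
        raise ValueError("by must be 'source' or 'assemblage'")
    idx = 0 if by == "source" else 1
    totals = {}
    for key, n in counts.items():
        totals[key[idx]] = totals.get(key[idx], 0) + n
    return {key: 100.0 * n / totals[key[idx]] for key, n in counts.items()}


# Hypothetical counts, for illustration only.
counts = {
    ("human", "A"): 40,
    ("human", "B"): 60,
    ("dog", "A"): 10,
    ("dog", "C"): 30,
}

within_source = percentages(counts, by="source")
within_assemblage = percentages(counts, by="assemblage")
print(within_source[("human", "A")])      # share of A among human isolates
print(within_assemblage[("human", "A")])  # share of humans among A isolates
```

As noted in the text, neither normalization accounts for the size of each host population or for the prevalence of giardiasis in each source.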

Table 2. Relative distributions of sources in percentage within each assemblage.

doi:10.1371/journal.pntd.0000558.t002

The host distribution of assemblage A is less restricted than that of assemblage B: companion animals (29%), livestock (27%) and wildlife (22%) each account for a share of assemblage A comparable to that of humans (19%). This result suggests that humans are the major source of assemblage B, but that domestic animals play a major role in the host range of assemblage A.

For those isolates which were characterized at two or more loci (n = 908), the assignment to a specific assemblage obtained at one locus was inconsistent with that obtained at another locus in 13% of them (Table 3). Similar results have been reported in previous studies, using the same markers as those in the present study [20],[25],[34]. This finding was particularly frequent in isolates from dogs (~34%) where, depending on the markers used, isolates are typed as either host-adapted assemblages C and D, or as assemblage A and B (Table 4). Also in ~12% of the human isolates (n = 392) mixing of assemblages was observed between A and B. As sexual recombination between different assemblages has not been unequivocally demonstrated [23],[35], these cases are more likely to represent mixed infections.
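The comparison of assemblage assignments across loci can be sketched as follows. This is an illustrative sketch only: the isolate names, the per-locus calls and the helper names `assemblage_call` and `fraction_mixed` are hypothetical, and the actual analysis was carried out in the database software.

```python
def assemblage_call(locus_calls):
    """Summarize the per-locus assemblage calls of one isolate.

    locus_calls: dict mapping locus name -> assemblage letter.
    Returns the letter if all loci agree, otherwise a label such as "A+B".
    """
    found = sorted(set(locus_calls.values()))
    return found[0] if len(found) == 1 else "+".join(found)


def fraction_mixed(isolates):
    """Fraction of isolates typed at two or more loci whose loci disagree."""
    multi = [calls for calls in isolates.values() if len(calls) >= 2]
    mixed = [calls for calls in multi if len(set(calls.values())) > 1]
    return len(mixed) / len(multi) if multi else 0.0


# Hypothetical isolates, for illustration only.
isolates = {
    "iso1": {"BG": "A", "TPI": "A"},
    "iso2": {"BG": "A", "GDH": "B"},
    "iso3": {"SSU-rDNA": "C", "BG": "B", "TPI": "C"},
    "iso4": {"BG": "B"},  # single locus: excluded from the comparison
}

for name, calls in isolates.items():
    print(name, assemblage_call(calls))
print(round(fraction_mixed(isolates), 2))
```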

Table 4. Combination of mixed assemblages found in individual isolates.

doi:10.1371/journal.pntd.0000558.t004

Typing at the sub-assemblage level

Sub-groups within assemblages A and B were originally defined by isoenzyme analysis of laboratory-adapted strains, and classified into AI and AII, BIII and BIV [28]. Importantly, other subgroups were observed in a more recent study also based on isoenzyme analysis, and some appear to be host specific [15]. DNA sequence analysis of a smaller number of these isolates confirmed the existence of these subgroups in assemblage A and B at different loci [15]. More recently, a third sub-assemblage within assemblage A (referred to as AIII) was identified, and appears to be specifically associated with wild hoofed animals [22],[36],[37]. The SSU-rDNA locus showed too little variability among assemblage A and B isolates to perform analysis at the sub-assemblage level, whereas sufficient genetic variation was observed at the other three loci [22]. In companion animals and in livestock infected with assemblage A, approximately three quarters of the sequences corresponded to sub-assemblage AI, and the remaining quarter to sub-assemblage AII (Table 5). The opposite was found in human isolates: approximately one quarter of the sequences was identified as sub-assemblage AI and three quarters as sub-assemblage AII. The AIII sub-assemblage was mostly found in wildlife, a few cows and in a single cat isolate, but never in humans. In human isolates with assemblage B, sub-assemblages BIII and BIV were found with a very similar frequency (Table 6). In some wild animals (beaver, muskrat), sub-assemblage BIV was predominantly found. Monkeys and marine animals [22],[38],[39],[40],[41], which together represent the majority of the category “others”, were infected with both sub-assemblage BIII and BIV. Thus, at this level of resolution, G. duodenalis sub-assemblages AI, AII, BIII and BIV are potentially zoonotic, whereas sub-assemblage AIII is found exclusively in animals.

Table 6. Distribution of sub-assemblages BIII and BIV in different sources.

doi:10.1371/journal.pntd.0000558.t006

The geographic distribution of sub-assemblages AI and AII in humans and companion animals/livestock was compared. In companion animals/livestock infected with assemblage A, the majority was sub-assemblage AI, and the minority was sub-assemblage AII (Table 7). This distribution was found globally, suggesting that sub-assemblage AI has a preference for companion animals/livestock. Except for Asia and Australia, the opposite was found in humans: the majority was sub-assemblage AII, and the minority was sub-assemblage AI. These data show that the three G. duodenalis sub-assemblages A predominantly/preferentially cycle within defined hosts (AI in livestock, AII in humans, AIII in wildlife), and that these cycles do not interact significantly. The geographic distribution of sub-assemblages BIII and BIV in humans showed marked differences between continents. In Africa, infection with G. duodenalis assemblage B, sub-assemblage BIII is more prevalent (81%) than infection with sub-assemblage BIV (19%), whereas the opposite is found in North-America where 86% of infections are associated with sub-assemblage BIV, and only 14% with sub-assemblage BIII (Table 8). A more balanced distribution is found in Europe and Australia.

The finding of a mixture of assemblages in a significant fraction of individual isolates prompted us to investigate whether this occurred at the level of sub-assemblages. In isolates analysed at two or more loci, sub-assemblage results obtained at the different loci were compared. Mixtures were found between AI and AII, and between AI and AIII. No mixtures were detected between AII and AIII. Within assemblage A, 5.4% of mixtures were observed between sub-assemblages AI and AII. Remarkably, mixtures between BIII and BIV characterized 30.3% of the isolates. Analysis of human isolates showed that an infection with AI alone occurs as often as an infection with a mixture of AI and AII (Table 9). A similar situation occurred with sub-assemblage BIII and BIV: an infection with BIV occurs as often as an infection with a mixture of BIII and BIV.

Single versus multi-locus typing at the isolate level

Sequence heterogeneity was also observed within each sub-assemblage, and those genetic variants are referred to here as subtypes. In order to determine the zoonotic potential at this level, subtypes were assigned to groups of sequences, on the basis of similarity [22],[30],[31]. Thus, two sequences that differ by a single nucleotide define two distinct subtypes. For example, at the SSU-rDNA locus, 15 subtypes were found among assemblage A isolates (Table 10). Of these, 3 and 7 subtypes were exclusively found in humans or in animals, respectively, whereas 5 subtypes contained both human and animal isolates. Notably, these 5 subtypes correspond to 92% of the isolates (humans and animals). Genetic variability at each of the other three loci defined several subtypes (between 3 and 18) in both assemblages A and B, and, as some subtypes comprise both human and animal isolates, it is possible to infer a zoonotic potential. Subtypes were also determined for assemblages C to F. The subtypes of assemblage C, D and E found in a few human isolates did not match any of the subtypes found in animals. However, several subtypes of assemblage F found in humans at the BG locus were identical to subtypes found in cats [32].
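The rule that a single nucleotide difference is enough to define a new subtype can be sketched as below. This is a simplified illustration that assumes all sequences have already been trimmed to the same aligned region; it is not the UPGMA-based procedure described in the Methods. The function name `assign_subtypes`, the isolate names and the sequences are hypothetical; only the `AS001`-style label format follows the text.

```python
def assign_subtypes(sequences, prefix="AS"):
    """Give every distinct aligned sequence its own subtype label.

    sequences: dict mapping isolate name -> aligned nucleotide sequence.
    Labels are numbered in order of first appearance (AS001, AS002, ...).
    """
    label_of_sequence = {}
    assignment = {}
    for isolate, seq in sequences.items():
        seq = seq.upper()
        if seq not in label_of_sequence:
            label_of_sequence[seq] = f"{prefix}{len(label_of_sequence) + 1:03d}"
        assignment[isolate] = label_of_sequence[seq]
    return assignment


# Hypothetical aligned fragments, for illustration only.
sequences = {
    "human_1": "ACGTACGGTA",
    "human_2": "ACGTACGGTA",  # identical to human_1: same subtype
    "cattle_1": "ACGTACGGTC",  # one nucleotide differs: new subtype
}

subtypes = assign_subtypes(sequences)
print(subtypes)
```

Because every point difference creates a label, sequencing artifacts would also appear as rare subtypes under this rule, which is one reason why subtypes seen in only one or two isolates need cautious interpretation.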

Table 10. Potential zoonotic subtypes using one, two or three markers.

doi:10.1371/journal.pntd.0000558.t010

In order to increase the accuracy of genotyping of isolates at this level, subtypes from two or three loci were combined to define multi-locus genotypes (MLGs). A total of 41 sequences, which could not be unequivocally assigned at the level of assemblage, were excluded from the analysis. Combining SSU-rDNA and BG was possible for 33 isolates of assemblage A, and defined 11 MLGs (Table 10). With this combination only one MLG of assemblage A was potentially zoonotic. The combination of SSU-rDNA and BG for assemblage B also generated a single potentially zoonotic MLG out of 20 MLGs. This MLG was found in 3 out of 46 isolates of assemblage B. The same approach was used for all possible combinations of the 4 markers (Table 10). When using two markers, the number of potentially zoonotic subtypes and the percentage of corresponding isolates decreased significantly. Still, potential zoonotic subtypes of both assemblage A and B were found when using two markers. When subtypes from three loci are combined, two MLGs of assemblage A are potentially zoonotic, and none of assemblage B. These cases have been described before. In Italy, an isolate from a cat (ISSGdA107) has an MLG belonging to sub-assemblage AII [22]. Human isolates from Belgium, Germany, The Netherlands, Italy, France, Nicaragua, and Australia have the same MLG. The other case is based on two axenic strains that have an MLG belonging to sub-assemblage AI. These two isolates, Portland and Ad-1, were originally isolated from human patients in the USA and Australia, respectively [15]. Remarkably, the animal (mostly cattle) isolates having this MLG are from Canada, Italy and Sweden.
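Why combining loci reduces the number of potentially zoonotic genotypes can be illustrated with a small sketch. All isolate names, subtype labels and helper names (`multilocus_genotypes`, `shared_by_humans_and_animals`) are hypothetical; the sketch assumes that an MLG is simply the tuple of subtypes at the chosen loci, and that a genotype counts as potentially zoonotic when it occurs in at least one human and one animal isolate.

```python
from collections import defaultdict

LOCI = ("BG", "GDH", "TPI")


def multilocus_genotypes(subtypes, loci=LOCI):
    """Map each isolate typed at every locus to its MLG (a tuple of subtypes)."""
    return {
        isolate: tuple(calls[locus] for locus in loci)
        for isolate, calls in subtypes.items()
        if all(locus in calls for locus in loci)
    }


def shared_by_humans_and_animals(genotype_of, host_of):
    """Return genotypes found in at least one human and one animal isolate."""
    hosts = defaultdict(set)
    for isolate, genotype in genotype_of.items():
        hosts[genotype].add(host_of[isolate])
    return {g for g, kinds in hosts.items() if kinds == {"human", "animal"}}


# Hypothetical isolates and subtype labels, for illustration only.
subtypes = {
    "h1": {"BG": "b1", "GDH": "g1", "TPI": "t1"},
    "h2": {"BG": "b2", "GDH": "g2", "TPI": "t2"},
    "a1": {"BG": "b1", "GDH": "g3", "TPI": "t1"},
    "a2": {"BG": "b2", "GDH": "g2", "TPI": "t2"},
    "a3": {"BG": "b1"},  # typed at one locus only, so it has no MLG
}
host_of = {"h1": "human", "h2": "human",
           "a1": "animal", "a2": "animal", "a3": "animal"}

bg_only = {iso: calls["BG"] for iso, calls in subtypes.items() if "BG" in calls}
mlgs = multilocus_genotypes(subtypes)

print(sorted(shared_by_humans_and_animals(bg_only, host_of)))  # ['b1', 'b2']
print(sorted(shared_by_humans_and_animals(mlgs, host_of)))     # [('b2', 'g2', 't2')]
```

In this toy example both BG subtypes look zoonotic when BG is considered alone, but only one combination of subtypes is shared between hosts once all three loci are required to match.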

There are several technical explanations for the relatively low number of zoonotic MLGs as defined using three loci. Most importantly, the number of isolates typed at this level is still relatively small compared to the number of subtypes defined. Furthermore, most MLGs are from human isolates, particularly for assemblage B. Indeed, for many animal isolates of assemblage A or B, only one or two markers were sequenced, and, in some cases, the mixture of zoonotic and non-zoonotic assemblages prevents an unambiguous identification of the MLGs. An alternative, but less accurate, approach for the identification of potential zoonotic MLGs is to combine the zoonotic information of subtypes of individual markers (Table 10, row 1–4). Isolates with 3 markers (BG, GDH and TPI) were considered as potentially zoonotic when all three markers were found to be zoonotic individually. For assemblage A, 36% (n = 101) of isolates with 3 markers was found to be zoonotic. For assemblage B, 4% (n = 56) was potentially zoonotic (Table 11).

Sequences containing ambiguous nucleotides

The presence of heterogeneous sequencing profiles (characterized by two overlapping nucleotide peaks at specific positions) has been reported in several papers from different research groups [19],[22],[32],[42]. Besides the quality of the sequencing reaction itself, two explanations can be given for the presence of those mixed profiles: allelic sequence heterozygosity (ASH) and mixed infections. Giardia has two diploid nuclei, which may accumulate specific mutations independently, and this generates ASH [23]. The fact that G. duodenalis isolates display a very low level of ASH, initially based on the analysis of few isolates and genetic loci [35],[43], has been confirmed by the analysis of the complete WB genome, a strain belonging to assemblage A, sub-assemblage AI [43]. Although limited by the small number of loci, and by the difficulty in distinguishing ASH from mixed infections, the data presented in Table 12 clearly show that heterogeneous sequencing profiles occur much more often in isolates of assemblages B, C, and D than in those from assemblages A, E and F. The number of heterogeneous positions also varied among the loci analysed, and the positions involved often coincide with polymorphic sites among different subtypes.
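Overlapping peaks in a chromatogram are commonly recorded in the sequence as IUPAC ambiguity codes (for example R for A or G, Y for C or T). Assuming the sequences are encoded in that way, which is an assumption about the data format and not something stated in the text, heterogeneous positions can be located with a short sketch such as the following; the function name and the example sequence are hypothetical.

```python
# IUPAC nucleotide codes that stand for more than one base.
IUPAC_AMBIGUOUS = set("RYSWKMBDHVN")


def ambiguous_positions(seq):
    """Return (1-based position, code) for every ambiguous nucleotide."""
    return [
        (i + 1, base)
        for i, base in enumerate(seq.upper())
        if base in IUPAC_AMBIGUOUS
    ]


# Hypothetical sequence fragment, for illustration only.
fragment = "ACGRTTYA"
print(ambiguous_positions(fragment))  # [(4, 'R'), (7, 'Y')]
```

Such a scan identifies heterogeneous positions but cannot by itself distinguish ASH from a mixed infection.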

The occurrence of ASH complicates the assignment of isolates to specific subtypes, especially for assemblage B. Therefore, the occurrence of zoonotic subtypes within assemblage B was tested after the exclusion of ambiguous nucleotides. The BG, GDH, and TPI sequences from a total of 117 assemblage B isolates (100 from humans, and 17 from animals) were merged and clustered. No zoonotic subtypes were detected. When all isolates (n = 199) typed with 2 markers (BG-GDH, BG-TPI, or GDH-TPI) were included in the analysis, 7% were compatible with zoonotic potential. Interestingly, these isolates were from zoo animals and a rabbit.

Genetic heterogeneity

A measure of the genetic diversity of a locus can be estimated by the number of subtypes corrected for the number of isolates. This was achieved by dividing the number of isolates without ambiguous nucleotides (Table 13) by the number of subtypes, so that a lower ratio indicates a higher diversity. The lowest genetic variability was found at the SSU-rDNA locus. Although 15 subtypes were identified at SSU-rDNA for both assemblage A and B, the sequence variation did not allow a distinction to be made between sub-assemblages. Most of the sequence variation found at SSU-rDNA was caused by a minority of the isolates. The genetic variability of the other 3 markers varied only a little from each other. Remarkably, the genetic variability at each marker in assemblage A subtypes was ~2-fold lower than that found in assemblage B subtypes. The genetic distance within assemblage A was higher than within assemblage B (Table 13).
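The isolates-per-subtype ratio is simple arithmetic, sketched below with the multi-locus figures given in the phylogenetic analysis (9 MLGs from 84 isolates for assemblage A, 31 genotypes from 65 isolates for assemblage B). The function name `isolates_per_subtype` is illustrative.

```python
def isolates_per_subtype(n_isolates, n_subtypes):
    """Number of isolates divided by number of subtypes.

    A lower value means more subtypes per isolate, i.e. higher diversity.
    """
    if n_subtypes <= 0:
        raise ValueError("n_subtypes must be positive")
    return n_isolates / n_subtypes


ratio_a = isolates_per_subtype(84, 9)   # assemblage A multi-locus genotypes
ratio_b = isolates_per_subtype(65, 31)  # assemblage B multi-locus genotypes
print(round(ratio_a, 1))  # 9.3
print(round(ratio_b, 1))  # 2.1
```

These values match the isolates/subtypes ratios quoted in the legends of the two phylogenetic trees.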

Phylogenetic analysis of assemblages A and B

The multi-locus analysis of field isolates may not represent G. duodenalis genotypes, as the isolates could consist of a mixture of several G. duodenalis (sub)species. To identify multi-locus genotypes among isolates of assemblage A, the sequences of the BG, GDH, and TPI loci from isolates with matching assignment were merged, a multiple alignment was generated and trees were constructed using complete linkage. To increase the accuracy of the analysis, only multi-locus genotypes found in more than one isolate were selected. In total 9 MLGs were identified from 84 isolates for assemblage A (Figure 1). To evaluate the robustness of the inferred relationships within assemblage A, trees were also generated from each marker. The clustering generated from the individual markers was congruent with the clustering of the multi-locus profile (Table 14). These analyses confirmed the existence of three monophyletic sub-assemblages at each marker. However, the sequence variation at each locus was too low to discriminate between the different subtypes within sub-assemblage AI and AII. For example, subtype AI-1 cannot be distinguished from AI-3 with GDH, and AI-1 is identical to AI-2 when using BG and TPI. Two genotypes, AI-III and AII-II, contained both human and animal isolates, which is in agreement with the MLGs identified previously (Table 10).

Phylogenetic trees of 84 isolates with 3 markers were inferred using Unweighted Pair Group Method with Arithmetic mean, corrected by complete linkage, which uses the lowest similarities found between two clusters. Individual and merged BG, GDH and TPI nucleotide sequences were used. Bootstrap values were calculated by the analysis of 1000 replicates. Only bootstrap values >60 are shown. The phylogenetic analysis of assemblage A shows that the three sub-assemblages clustered together with high bootstrap support (i.e., they are monophyletic). The genetic diversity of the multi-locus genotypes (isolates/subtypes: 9.3) is relatively low (see Table 13), and the maximum genetic distance is 4.0%.

A similar analysis was performed for assemblage B isolates. In total 31 genotypes were identified from 65 isolates (Figure 2). The clustering generated from individual markers was able to discriminate sub-assemblage BIII from BIV, but with low bootstrap values, especially for BG. However, multi-locus genotyping of assemblage B was inconsistent with genotyping at the sub-assemblage level: significant mixing (~30%) of BIII and BIV was observed. In contrast to assemblage A, clustering from individual loci of assemblage B was incongruent with clustering of multiple loci (Table 14). These results are consistent with the multi-locus subtyping of isolates: In assemblage A, mixing is less frequently observed than in assemblage B (Table 9). Removal of the “mixed MLGs” from the genotyping analysis did not alter the outcome of the analysis significantly: The bootstrap values as well as the congruency remained low (not shown). Compared to assemblage A, the MLG diversity (number of genotypes) of assemblage B is 4 times higher, but their genetic distance is two times lower, both at the level of individual markers and at the level of MLG (Figures 1 and 2).

Phylogenetic trees of 65 isolates of assemblage B with 3 markers were constructed using Unweighted Pair Group Method with Arithmetic mean, corrected by complete linkage, which uses the lowest similarities found between two clusters. Individual and merged BG, GDH and TPI nucleotide sequences were used. Bootstrap values were calculated by the analysis of 1000 replicates. Only bootstrap values >60 are shown. The phylogenetic analysis of the merged sequences shows significant mixtures of the BIII and BIV sub-assemblages. The genetic diversity of the multi-locus genotypes (isolates/subtypes: 2.1) is relatively high (see Table 13), and the maximum genetic distance is 1.7%.

doi:10.1371/journal.pntd.0000558.g002

Discussion

In the present study, the zoonotic potential, genetic diversity, and geographic distribution of G. duodenalis genotypes from the ZOOPNET-database were assessed. Accurate molecular typing is imperative for unraveling the intricate epidemiology of giardiasis. Molecular markers should be able to discriminate between morphologically identical isolates that may differ in important properties, such as virulence and host-specificity. The genes used in this study have housekeeping functions and are presumably not directly linked to virulence and host-specificity. The discriminatory properties of the commonly used diagnostic markers of G. duodenalis have not been investigated systematically. Here, different molecular typing methods were used to address the discriminatory properties of SSU-rDNA, BG, GDH and TPI. Typing at the level of assemblages is relatively straightforward, and can be achieved with all four markers. Importantly, assemblages C, D, E and F are found in rare human cases (0.8% of human cases). These findings demonstrate that G. duodenalis assemblages C to F can indeed infect humans. Since human infection with these assemblages occurs infrequently, it seems that the host-range of G. duodenalis may be determined by more factors than the host-parasite interaction alone. It is also unclear whether human infections with assemblages C to F result in disease. Typing at the level of sub-assemblages was only possible for BG, GDH and TPI, but not for SSU-rDNA, because the SSU-rDNA locus showed too little intra-assemblage variability in both assemblage A and B (see also [22]).

Assemblage A

Significant differences were found between the sub-assemblages AI, AII and AIII. Although sub-assemblages AI and AII are found in both humans and animals, sub-assemblage AI is preferentially found in livestock and pets whereas sub-assemblage AII is predominantly found in humans. Sub-assemblage AIII is almost exclusively found in wild hoofed animals, and is most likely a host-adapted genotype. Several potentially zoonotic subtypes, which correspond to the majority of the isolates, were identified at the level of individual markers (Table 10). However, combining the subtype information of the available markers of individual isolates (MLG) resulted in only two potentially zoonotic genotypes within assemblage A. Thus, the most important conclusion is that analysis of single markers is inaccurate for molecular epidemiological studies. This finding is consistent with the phylogenetic analysis of assemblage A: the genetic variation found in individual markers is too low to allow discrimination of different genotypes (Figure 1). Conversely, many subtypes for assemblage A were identified for each marker (Table 10: 15 for SSU-rDNA, 80 for BG, 40 for GDH, and 42 for TPI). Subtyping is based on similarity, and a single point mutation has been considered sufficient to describe a new subtype. For all markers it was found that only a minority of subtypes corresponded to the majority of isolates and that the majority of subtypes were found in only one or two isolates. Whether all these subtypes correspond to new genotypes or whether some of them will turn out to be (sequence) artifacts is unclear. The significance of all these subtypes will become clearer when more molecular epidemiological data are added to the database. Of the six MLGs defined within assemblage A, two are potentially zoonotic. Genotype AI-3 consisted mostly of animal isolates and of a few human (axenic) isolates, whereas AII-2 consisted predominantly of human isolates and a single cat isolate.
These findings are in agreement with the preferential distribution of AI and AII found at the level of sub-assemblages. Since the number of MLG isolates is relatively small, especially for pet isolates typed with three (consistent) markers, more genotypes with zoonotic potential may exist. The assumption is that genetically identical G. duodenalis found in both humans and animals are zoonotic. Remarkably, the isolates having zoonotic potential were not epidemiologically linked (i.e., they did not originate from the same location or the same study). These findings highlight the global distribution of these G. duodenalis genotypes, but provide little evidence for zoonotic transmission.
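The difference between single-marker and multi-locus assessment of zoonotic potential can be sketched in a few lines. The example below is illustrative only: the isolate records, the subtype labels and the helper names are hypothetical, and "potentially zoonotic" is taken to mean, as assumed in the text, a genetically identical type found in both humans and animals.

```python
from collections import defaultdict

MARKERS = ("BG", "GDH", "TPI")

# Hypothetical isolate records; subtype labels are placeholders.
isolates = [
    {"id": "h1", "host": "human", "BG": "A1", "GDH": "A1", "TPI": "A1"},
    {"id": "c1", "host": "animal", "BG": "A1", "GDH": "A1", "TPI": "A1"},
    {"id": "h2", "host": "human", "BG": "A2", "GDH": "A2", "TPI": "A2"},
    {"id": "d1", "host": "animal", "BG": "A2", "GDH": "A1", "TPI": "A5"},
    {"id": "h3", "host": "human", "BG": "A3", "GDH": "A2", "TPI": "A2"},
]


def hosts_by_key(records, key_fn):
    """Map each type (single-marker subtype or MLG) to the host groups it occurs in."""
    hosts = defaultdict(set)
    for record in records:
        hosts[key_fn(record)].add(record["host"])
    return hosts


def shared_types(hosts):
    """Types observed in both humans and animals (potentially zoonotic)."""
    return sorted(key for key, found in hosts.items() if {"human", "animal"} <= found)


# Single-marker view: one subtype per marker at a time.
zoonotic_per_marker = {
    marker: shared_types(hosts_by_key(isolates, lambda r, m=marker: r[m]))
    for marker in MARKERS
}

# Multi-locus view: the MLG is the combination of subtypes at all markers.
zoonotic_mlgs = shared_types(
    hosts_by_key(isolates, lambda r: tuple(r[m] for m in MARKERS))
)

print(zoonotic_per_marker)
print(zoonotic_mlgs)
```

In this toy dataset, BG alone flags two subtypes as shared between humans and animals, whereas only one combination of BG, GDH and TPI subtypes is shared, which mirrors the reduction seen when moving from single markers to MLGs.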

Assemblage B

The host distribution of assemblage B is predominantly human and, to a much lesser extent, wildlife and dogs (Table 2). Assemblage B is also found regularly in (captive) non-human primates. These animals generally do not play a significant role in the G. duodenalis life cycle that involves humans. The abundance of assemblage B in (captive) non-human primates may be due to exposure to human sources. Alternatively, assemblage B may be well adapted to infect primates. Genotyping of assemblage B was more problematic than genotyping of assemblage A. The genetic diversity (number of subtypes) and the percentage of sequences with mixed templates (ambiguous nucleotides) were ~2.5 and 4 times higher than for assemblage A, respectively. The mixing of the sub-assemblages BIII and BIV within isolates was ~30%, which is 6 times more than the mixing observed between sub-assemblages AI and AII. Furthermore, the 119 field isolates of assemblage B with 3 markers (BG, GDH, TPI) consisted of 102 humans, 13 primates, 2 zoo animals, one guinea pig and one rabbit. Relevant animal sources, in particular dogs and marine animals [38], are present in the database, but were not typed with the 3 markers of assemblage B. Together, these factors hamper the precise assignment of isolates at the assemblage or subtype level. Typing with two but not with three markers resulted in the identification of a few potentially zoonotic MLGs. Alternative approaches, e.g. removal of ambiguous nucleotides or estimation of potentially zoonotic MLGs by combining the zoonotic information from individual markers, resulted in the identification of potentially zoonotic genotypes, which corresponded to only 4–7% of the isolates of assemblage B. All in all, no clear genotypes could be inferred for assemblage B, and no distinction between zoonotic and host-adapted genotypes could be made within assemblage B.
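The percentage of sequences with mixed templates can be estimated by flagging sequences that contain IUPAC ambiguity codes. The sketch below is a minimal illustration under stated assumptions: the sequences are hypothetical, the function names are invented for this example, and `N` is treated as an uncalled base rather than as evidence of a mixed template, which may differ from the criteria applied in the database.

```python
# IUPAC codes for two or three possible bases. 'N' is deliberately excluded
# (assumption: it marks an uncalled base, not a mixed template).
AMBIGUITY_CODES = set("RYSWKMBDHV")


def has_mixed_template(sequence):
    """True if the sequence contains at least one ambiguous nucleotide."""
    return any(base in AMBIGUITY_CODES for base in sequence.upper())


def percent_mixed(sequences):
    """Percentage of sequences with at least one ambiguous nucleotide."""
    if not sequences:
        return 0.0
    flagged = sum(1 for seq in sequences if has_mixed_template(seq))
    return 100.0 * flagged / len(sequences)


# Hypothetical sequences: two carry ambiguity codes (R and Y), one carries only N.
toy_sequences = ["ACGTACGT", "ACRTACGT", "ACGTACGY", "ACGTNCGT"]

mixed_pct = percent_mixed(toy_sequences)
print(f"sequences with mixed templates: {mixed_pct:.0f}%")
```

Running the same count separately on assemblage A and assemblage B sequences would give the kind of ratio quoted above, although the values in the text come from the database itself and not from this sketch.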

Mixed infections and allelic sequence heterozygosity

Two principal mechanisms can explain the occurrence of ambiguous nucleotides and the inconsistent assignment of single isolates at the level of both assemblage and sub-assemblage: (i) “true” mixed infections; and (ii) allelic sequence heterozygosity (ASH). The presence of more than one G. duodenalis type during a symptomatic infection has important implications for the etiology of giardiasis: it is unclear how humans and animals become infected with two or more G. duodenalis types. Subjects may be infected simultaneously with different Giardia assemblages (or even subtypes) because of environmental mixing, for example in water. Alternatively, subjects may be asymptomatically infected with one Giardia assemblage, but become symptomatic after a second infection with another Giardia assemblage. The latter hypothesis is supported by the finding of asymptomatic subjects [6],[7],[12],[14]. The occurrence of mixed infections has important epidemiological implications. Using only one marker for the assignment of isolates to specific (sub)-assemblages is not always reliable, as different markers can give different results. For example, isolates can be typed as “potentially zoonotic” with one marker, but as “host-adapted” with another. More reliable results are obtained when multiple markers are used for typing. On the other hand, “true” G. duodenalis genotypes are difficult to identify in mixed infections.

Allelic sequence heterozygosity (ASH) is not unusual for diplomonads, which have two diploid nuclei and replicate asexually [1]. Indeed, in asexual eukaryotes, the two allelic gene copies at a locus are expected to become highly divergent as a result of the independent accumulation of mutations in the absence of segregation (Meselson's effect). Therefore, substantial genetic differences are expected to accumulate among the chromosome homologues in asexual organisms with a ploidy of two or higher [44]. However, the ASH found in the genome of G. duodenalis assemblage A is extremely low [43], and the mechanism(s) responsible remain undetermined. Based on the presence of ambiguous nucleotides in sequences derived from PCR products, it is to be expected that the ASH is higher in assemblages B, C, and D than in assemblages A, E and F (Table 11). Recent studies have shown that G. duodenalis may be able to undergo sexual reproduction, a phenomenon that can influence ASH levels [35],[45]. However, neither the frequency of recombination nor its impact on the etiology and epidemiology of giardiasis is known [11],[23].

Future directions

The ZOOPNET-database is the largest molecular epidemiological database of G. duodenalis to date. Still, the limitations of this unique database are apparent. Currently, the database contains a “limited” set of isolates with a heterogeneous geographic distribution and an incomplete source distribution. Furthermore, each isolate is characterized by a small set of epidemiological data and limited sequence data. Our aim is to expand and improve the ZOOPNET-database: since its content is accessible via the internet, scientists can use these data for their own epidemiological studies. The web-based ZOOPNET-database will remain accessible, and its interface will soon be improved. Both veterinary and public health researchers are welcome to submit their molecular epidemiological data on G. duodenalis and Cryptosporidium to ZOOPNET. The web-based ZOOPNET-database has a flexible content and provides a powerful tool for new (inter)national studies on giardiasis (and cryptosporidiosis).