Abstract

Mycobacterium avium subspecies paratuberculosis (M. ap), the causative agent of Johne's disease, infects many farmed ruminants and wildlife animals and has recently been isolated from humans. To better understand the molecular pathogenesis of these infections, we analyzed the whole-genome sequences of several M. ap and M. avium subspecies avium (M. avium) isolates to gain insights into the genomic diversity associated with variable hosts and environments. Using next-generation sequencing technology, we found that all six M. ap isolates showed a high percentage of similarity (98%) to the reference genome sequence of M. ap K-10, which was isolated from cattle. However, two M. avium isolates (DT 78 and Env 77) showed significant sequence diversity (only 87 and 40% similarity, respectively) compared to the reference strain M. avium 104, a reflection of the wide environmental niches of this group of mycobacteria. Within the M. ap isolates, genomic rearrangements (insertions/deletions) were not detected, and only unique single nucleotide polymorphisms (SNPs) were observed. While most of the SNPs (~100) in M. ap genomes were non-synonymous, most of the ~6,000 SNPs detected among M. avium genomes were synonymous, suggesting differential selective pressure between M. ap and M. avium isolates. In addition, SNP-based phylogenomics had enough discriminatory power to differentiate between isolates from different hosts, yet suggested a bovine source of infection for the other animals examined in this study. Interestingly, the human isolate (M. ap 4B) was closely related to an M. ap isolate from a dairy facility, suggesting a common source of infection. Overall, the identified phylogenomes further supported the idea of a common ancestor of both M. ap and M. avium isolates. The genome-wide analysis described here could provide a strong foundation for a population genetic structure that could be useful for the analysis of mycobacterial evolution and for the tracking of Johne's disease transmission among animals.

A whole-genome alignment of M. avium DT 78, M. avium 104 and M. ap DT 78. The MAUVE algorithm (Darling et al., ) was used to align the three genomes. White areas indicate low-coverage gaps in the sequence of the M. avium DT 78 genome, in which about seven large indel regions were identified. Regions with the same color indicate high similarity and are connected by bars of the same color. The genomes are drawn to scale based on the reference M. avium 104 genome.

Genome composition of M. avium Env 77. The MegaBLAST algorithm was used to identify the bacteria most closely related to each contig sequence from the M. avium Env 77 isolate. Genomes with <10% homology were excluded from the representation. Members of the M. tuberculosis complex included M. tuberculosis and M. bovis, with sequence divergence <5%. The same criterion was used to define the M. avium and M. ap groups.

Comparative analysis of M. ap and M. avium from animals and environmental sources. The gapped consensus sequence of each strain was used for comparison by MAUVE version 2.3.1. (A) A close-up depiction of a breaking point in the alignment of six M. ap genomes in comparison to the M. ap K-10 reference genome. The white areas indicate regions with low or zero read coverage. In this example, the sequences flanking the breaking point have a high GC content but contain no repetitive sequences. (B) Indels among M. ap and M. avium genomes. Note that genome rearrangements usually surround the origin of replication.

The total number of single nucleotide polymorphisms (SNPs) among M. ap isolates. The numbers of nSNPs (non-synonymous), sSNPs (synonymous) and SNPs in the intergenic regions are color coded as indicated. SNPs were detected using the reference-assembled sequences of each strain. About 60–130 SNPs were detected in M. ap isolates. The percentage of nSNPs is generally higher than that of sSNPs, which indicates a high selective pressure in these strains.
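The nSNP/sSNP distinction used in this caption can be illustrated with a minimal Python sketch. It is not the pipeline used in the study; it assumes an in-frame coding sequence given on the sense strand, a 0-based SNP position, and the standard codon-to-amino-acid assignments (which the bacterial translation table shares). The function names classify_snp and count_snp_classes are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Amino acids of the standard genetic code, listed in TCAG codon order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds, pos, alt):
    """Return 'sSNP' or 'nSNP' for a substitution in a coding sequence.

    cds: in-frame coding sequence on the sense strand (assumption).
    pos: 0-based position of the substituted base within cds.
    alt: the alternative base observed at that position.
    """
    cds, alt = cds.upper(), alt.upper()
    start = (pos // 3) * 3            # first base of the affected codon
    ref_codon = cds[start:start + 3]
    if len(ref_codon) < 3:
        raise ValueError("position falls in an incomplete codon")
    offset = pos - start
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    if alt_codon == ref_codon:
        raise ValueError("alternative base equals the reference base")
    same_aa = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "sSNP" if same_aa else "nSNP"


def count_snp_classes(cds, snps):
    """Tally sSNP and nSNP calls for an iterable of (pos, alt) pairs."""
    return Counter(classify_snp(cds, pos, alt) for pos, alt in snps)


if __name__ == "__main__":
    toy_cds = "ATGGCTGAA"  # toy sequence, not from the study: Met-Ala-Glu
    print(classify_snp(toy_cds, 5, "C"))  # GCT -> GCC, both alanine
    print(classify_snp(toy_cds, 6, "A"))  # GAA -> AAA, Glu -> Lys
```

SNPs in intergenic regions fall outside any coding sequence and would be counted separately before this step.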

Phylogenomic analysis of M. ap and M. avium strains. (A) A dendrogram displaying an unrooted neighbor-joining tree of the concatenated SNPs from all eight mycobacterial isolates under study. (B) A rooted neighbor-joining tree using the M. ah 104 genome as the outgroup. The bootstrap consensus tree inferred from 1,000 replicates is taken to represent the evolutionary history of the taxa analyzed. Branches corresponding to partitions reproduced in fewer than 50% of bootstrap replicates were collapsed. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test is shown next to the branches.
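The two inputs to a bootstrapped neighbor-joining analysis of concatenated SNPs, a pairwise distance matrix and column-resampled replicate alignments, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the software used in the study: it uses the plain proportion of differing sites (p-distance) and stops short of building the tree itself. The isolate names and sequences are toy values, and the function names are hypothetical.

```python
import random


def p_distance(a, b):
    """Proportion of sites at which two aligned SNP strings differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(seqs):
    """Pairwise p-distances for a dict mapping isolate name to SNP string."""
    names = sorted(seqs)
    return {(m, n): p_distance(seqs[m], seqs[n]) for m in names for n in names}


def bootstrap_replicate(seqs, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(seqs.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(s[c] for c in cols) for name, s in seqs.items()}


if __name__ == "__main__":
    # Toy concatenated SNP alignment; names and bases are illustrative only.
    toy = {"iso_A": "ACGTAC", "iso_B": "ACGTAT", "iso_C": "TCGAAT"}
    rng = random.Random(0)  # fixed seed so the run is repeatable
    print(distance_matrix(toy)[("iso_A", "iso_B")])
    # The study reports 1,000 replicates; 3 are drawn here for brevity.
    for _ in range(3):
        rep = bootstrap_replicate(toy, rng)
        print(distance_matrix(rep)[("iso_A", "iso_C")])
```

In a full analysis, a tree would be built from each replicate's distance matrix, and the support for a branch would be the percentage of replicate trees that contain the same grouping of taxa.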