2.
Salmonella is a leading public health concern
• Salmonella is a leading food-borne pathogen both in Canada and around the world
• Globally, there are an estimated 94 million Salmonella infections every year
• Human costs:
  - acute illness
  - loss of life (155,000 deaths)
• Societal costs:
  - health care costs
  - lost productivity
  - legal costs
  - impact to food industry

4.
Challenges in Salmonella typing and epidemiology
• A small number of highly prevalent, globally distributed serovars account for most outbreaks (e.g. Enteritidis, Typhimurium)
• Epidemiologically unrelated isolates fall within the same serovar, which makes outbreaks difficult to investigate
• Additional subtyping resolution within a serovar is needed (e.g. phage typing)
• Increasing use of genotypic methods (i.e. molecular typing)
  - Driven by the need for methods with higher discriminatory power
• A number of different approaches have been applied to molecular typing of Salmonella

5.
Discriminatory power by typing method:
Serotyping (Low) < MLST (Low-Mid) < cgMLST (Mid-High) < wgSNPs (High)
Serotyping:
• Based on reaction of antibodies to surface antigens
• Broad usage and common nomenclature in use since the 1930s
MLST and cgMLST:
• Multi-Locus Sequence Typing: developed by Maiden et al. (1998)
• Indexes genetic variation in 7 core (i.e. "housekeeping") genes
• cgMLST extends this principle to hundreds to thousands of loci
• Provides a portable naming scheme which correlates with historical serotypes
wgSNPs:
• Utilizes individual SNPs and gives very high resolution
• Results are not portable to other public health professionals
Figure: two aligned sequences differing by a single SNP
GATCGATCGATCG
GATCAATCGATCG
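The difference between SNP-based and allele-based typing on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any particular tool: the function names `snp_positions` and `allele_distance`, the locus names, and the allele numbers are all hypothetical. Only the two 13-base sequences come from the slide.

```python
def snp_positions(seq_a: str, seq_b: str) -> list[int]:
    """Return 1-based positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1) if a != b]


def allele_distance(profile_a: dict[str, int], profile_b: dict[str, int]) -> int:
    """Count loci present in both profiles whose allele numbers differ.

    Loci missing from either profile are skipped (an assumption of this sketch).
    """
    shared = profile_a.keys() & profile_b.keys()
    return sum(1 for locus in shared if profile_a[locus] != profile_b[locus])


# The two sequences shown on the slide differ at one position.
print(snp_positions("GATCGATCGATCG", "GATCAATCGATCG"))

# Hypothetical allele profiles: locus names and allele numbers are made up.
isolate_1 = {"locus_1": 5, "locus_2": 2, "locus_3": 3}
isolate_2 = {"locus_1": 5, "locus_2": 7, "locus_3": 3}
print(allele_distance(isolate_1, isolate_2))
```

In an allele-based scheme any change inside a locus, whether one SNP or several, yields a new allele number, which is why profiles are easy to share by name while SNP calls depend on the reference and pipeline used.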

9.
Testing the accuracy of SISTR
• ~45,000 Salmonella genomes were downloaded from the Sequence Read Archive (SRA)
• Raw reads were assembled using FLASH and SPAdes
• Assemblies were loaded into SISTR, and predicted serovars were compared with reported serovars (where available)
• Assemblies were checked for contamination using Kraken
• Assembly quality was assessed using QUAST
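The comparison of predicted and reported serovars amounts to a concordance calculation over the genomes that have a reported serovar. A minimal sketch of that calculation follows. The function name `serovar_concordance`, the case-insensitive matching rule, and the example records are assumptions for illustration and are not taken from SISTR or from the study data.

```python
import math
from typing import Iterable, Optional, Tuple


def serovar_concordance(records: Iterable[Tuple[Optional[str], str]]) -> float:
    """Fraction of genomes whose predicted serovar matches the reported one.

    Each record is (reported_serovar, predicted_serovar). Genomes with no
    reported serovar are excluded, mirroring the "where available" caveat.
    Matching is case-insensitive after trimming whitespace (an assumption).
    """
    compared = 0
    matched = 0
    for reported, predicted in records:
        if not reported:
            continue  # nothing to compare against
        compared += 1
        if reported.strip().lower() == predicted.strip().lower():
            matched += 1
    return matched / compared if compared else math.nan


# Hypothetical records, for illustration only.
example = [
    ("Enteritidis", "enteritidis"),
    ("Typhimurium", "Heidelberg"),
    (None, "Infantis"),
]
print(serovar_concordance(example))
```

Real metadata would likely need more normalization than this (synonyms, antigenic formulae, misspellings) before a fair comparison could be made.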

10.
Recovery rates of 330 cgMLST genes from assembled SRA genomes (N = 45,079)
• Genomes with the complete 330 genes: 41,781
• Genomes with >300 genes: 1,393
• Genomes with <300 genes: 1,905
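The binning behind this chart can be sketched as below. This is an illustration under stated assumptions: the function names `recovery_bin` and `percent` are hypothetical, and placing a genome with exactly 300 genes in the lower bin is a guess, since the slide only labels the bins ">300" and "<300". The only figures taken from the slide are the 330-gene scheme size, the 41,781 complete genomes, and the total of 45,079.

```python
TOTAL_LOCI = 330               # size of the prototype cgMLST scheme (from the slide)
NEAR_COMPLETE_THRESHOLD = 300  # bin boundary shown in the chart legend


def recovery_bin(loci_found: int) -> str:
    """Assign a genome to a recovery bin by the number of cgMLST genes found."""
    if not 0 <= loci_found <= TOTAL_LOCI:
        raise ValueError(f"loci_found must be between 0 and {TOTAL_LOCI}")
    if loci_found == TOTAL_LOCI:
        return "complete"
    if loci_found > NEAR_COMPLETE_THRESHOLD:
        return ">300"
    return "<=300"  # assumption: exactly 300 falls in the lower bin


def percent(part: int, whole: int) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Share of genomes with the complete 330-gene set, using the slide's counts.
print(f"{percent(41781, 45079):.1f}% of genomes had all {TOTAL_LOCI} genes")
```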

20.
Conclusions
• SISTR is a robust and accurate platform for Salmonella in silico typing, with 93.7% concordance between specified and predicted serovars
• The prototype 330-gene cgMLST scheme is readily retrievable from HTS assemblies of varying quality levels
• The current scheme provides coarse-grained separation of Salmonella genetic lineages that will be useful in outbreak analysis