SNP Discovery Using a Pangenome: Has the Single Reference Approach Become Obsolete?

1 School of Agriculture and Food Sciences, University of Queensland, St. Lucia 4072, QLD, Australia

2 School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth 6009, WA, Australia

* Author to whom correspondence should be addressed.

Academic Editor: Chris O’Callaghan

Abstract: Increasing evidence suggests that a single individual is insufficient to capture the genetic diversity within a species because of gene presence/absence variation. To understand the extent of genomic variation in a species, its pangenome must be constructed. The pangenome represents the complete set of genes of a species; it is composed of core genes, which are present in all individuals, and variable genes, which are present in only some individuals. Besides variation at the gene level, single nucleotide polymorphisms (SNPs) are an important form of genetic variation. The advent of next-generation sequencing (NGS), coupled with the heritability of SNPs, makes them ideal markers for genetic analysis of human, animal, and microbial data. SNPs have also been used extensively in crop genetics for association mapping, quantitative trait loci (QTL) analysis, analysis of genetic diversity, and phylogenetic analysis. This review focuses on the use of pangenomes for SNP discovery and highlights the advantages of using a pangenome rather than a single reference for this purpose. It also demonstrates how information not captured in a single reference can provide additional support for linking genotypic data to phenotypic data.