Technical Abstract:
Sequencing of the cow genome is underway at the Baylor College of Medicine (BCM) sequencing center and will be completed in the next few months. The bovine genome sequencing white paper set a goal of identifying 100,000 SNP for use in identifying and mapping quantitative trait loci (QTL) regions. The SNP discovery efforts by the consortia are currently distributed randomly across the assembly and are optimized only to maximize experimental SNP confirmation. In contrast to the human SNP consortium [2], which performed large-scale, high-density coverage, funding for the bovine genome will be limited, so the SNP discovery efforts have to be optimized to maximize their benefit. In this abstract we summarize our SNP discovery optimization scheme, which identifies, among the high-quality SNP in the population, those that may affect the regulation of gene function or alter protein structure. Initially, a genome-wide scan for bovine SNP is performed. SNP validation efforts will then be focused on the QTL regions and restricted to SNP that can affect either gene function or protein structure. All the bovine sequences from whole-genome shotgun (WGS) reads, BAC ends and many EST, along with their Phred quality scores, are available in the trace archive. The preliminary bovine genome assembly is used as an anchor to build an assembly from the individual traces that have a significant match to it. Similar to the neighborhood quality standard (NQS) method, all variations observed at poor-quality bases or in poor-quality neighborhoods are eliminated. To account for apparent SNP that arise from paralogs and related gene families, all SNP occurring within a breed are also eliminated in this analysis.
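The quality filter described in this paragraph can be illustrated with a minimal sketch. The function name `passes_nqs` and the thresholds (Phred 20 at the variant base, Phred 15 at five flanking bases on each side) are assumptions chosen for illustration; the abstract does not state the cutoffs actually used, and the full NQS method also limits mismatches in the flanking bases, which is omitted here.

```python
from typing import Sequence


def passes_nqs(
    quals: Sequence[int],
    pos: int,
    min_base_q: int = 20,   # assumed threshold for the variant base
    min_flank_q: int = 15,  # assumed threshold for each neighboring base
    flank: int = 5,         # assumed number of neighbors on each side
) -> bool:
    """Return True if the base at `pos` and its neighborhood are high quality.

    `quals` holds the Phred quality scores of one trace, one per base.
    """
    # Require a complete flanking window on both sides of the candidate base.
    if pos - flank < 0 or pos + flank >= len(quals):
        return False
    # Reject variations observed at a poor-quality base.
    if quals[pos] < min_base_q:
        return False
    # Reject variations observed in a poor-quality neighborhood.
    neighbors = list(quals[pos - flank:pos]) + list(quals[pos + 1:pos + flank + 1])
    return all(q >= min_flank_q for q in neighbors)
```

A candidate variation would be kept only if the supporting trace satisfies this check at the variant position; any production filter would need thresholds tuned to the bovine trace data.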
The remaining SNP are then categorized, based on the assembly annotation, as rSNP (regulatory SNP in promoter or enhancer elements), cSNP (non-synonymous SNP that changes an amino acid), sSNP (synonymous SNP with no amino acid change), tSNP (SNP that creates a stop codon in the reading frame), iSNP (intron SNP) or gSNP (SNP in the genomic region). The SNP discovery efforts are further focused on previously known QTL regions in order to identify quantitative trait nucleotide (QTN) candidates among the rSNP, tSNP and cSNP. The predicted SNP are then tested experimentally in the Beltsville Agricultural Research Center (BARC) Dairy Cattle Diversity Panel, and the verified SNP can then be used to fine-map QTL regions. So far, we have applied this methodology to identify and confirm SNP in neutrophil genes that were differentially expressed at parturition.
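The SNP categories defined in this paragraph can be expressed as a small classifier. This is a sketch under stated assumptions, not the pipeline used in the study: the function `classify_snp`, its parameters and the region labels ("regulatory", "intron", "coding") are hypothetical, and the standard genetic code is assumed for translating codons.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(region, ref_codon=None, codon_pos=None, alt_base=None):
    """Assign one of the abstract's SNP categories.

    region    -- hypothetical annotation label for the SNP position
    ref_codon -- reference codon (coding SNP only), e.g. "GAG"
    codon_pos -- 0-based position of the SNP within the codon
    alt_base  -- alternative allele at that position
    """
    if region == "regulatory":
        return "rSNP"
    if region == "intron":
        return "iSNP"
    if region != "coding":
        return "gSNP"  # any other genomic region
    # Build the alternative codon and compare the encoded amino acids.
    alt_codon = ref_codon[:codon_pos] + alt_base + ref_codon[codon_pos + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if alt_aa == "*" and ref_aa != "*":
        return "tSNP"  # creates a stop codon in the reading frame
    if alt_aa == ref_aa:
        return "sSNP"  # synonymous
    return "cSNP"      # non-synonymous (a lost stop codon also lands here)


# QTN candidates are drawn from the rSNP, tSNP and cSNP categories.
QTN_CANDIDATE_CLASSES = {"rSNP", "tSNP", "cSNP"}
```

For example, `classify_snp("coding", "GAG", 1, "T")` compares GAG (glutamate) with GTG (valine) and therefore yields a cSNP, which would be retained as a QTN candidate if it falls in a known QTL region.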