Data Usage Policy

The draft sequences of the whole genome and of the BACs spanning the QTL regions of Castanea mollissima are being made available by the project investigators as a public service. As outlined in the Fort Lauderdale principles, the investigators request and expect to retain the right to the first publications and presentations of any global analysis of this data. The QTL sequencing, despite spanning a small percentage of the total genome, is considered a full project in its own right. The first publications and presentations of the QTL data are therefore treated as global in nature and are constrained under the same principles as the whole genome sequence. These restrictions will be lifted upon publication of the data by the investigators or after 12 months, whichever comes first. The data presented here is in a draft state, and the investigators make no guarantees as to its accuracy or completeness.

Questions or concerns may be directed to the project leader, Dr. John Carlson, through our contact form. Authors considering usage of the data in a publication are also requested to notify the project leader.

Background

This project was initiated and led by Dr. John Carlson at Pennsylvania State University and was primarily funded by the Forest Health Initiative.
The Chinese chestnut is a member of the order Fagales, which includes a number of other important hardwood tree species such as oak, walnut and beech. The Vanuxem genotype, provided by The American Chestnut Foundation, was sequenced with a next-generation shotgun approach. This first draft of the genome assembly covers 724.4 Mbp of the estimated 800 Mbp chestnut genome in 41,270 scaffolds. The three blight resistance QTL were sequenced to greater depth through next-generation sequencing of pools of bacterial artificial chromosomes (BACs).
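
As a rough illustration of what these assembly figures imply, the short Python sketch below derives the fraction of the estimated genome represented in the draft and the mean scaffold length. The function and variable names are hypothetical, and the only inputs are the three numbers quoted above. The mean is a simple average and says nothing about the scaffold length distribution (for example N50), which is not reported here.

```python
# Figures quoted in the Background text; all names here are illustrative.
ASSEMBLED_MBP = 724.4          # total length of the draft assembly, in Mbp
ESTIMATED_GENOME_MBP = 800.0   # estimated chestnut genome size, in Mbp
SCAFFOLD_COUNT = 41_270        # number of scaffolds in the draft assembly


def assembly_summary(assembled_mbp, genome_mbp, scaffolds):
    """Return (fraction of genome assembled, mean scaffold length in bp)."""
    fraction = assembled_mbp / genome_mbp
    mean_scaffold_bp = assembled_mbp * 1_000_000 / scaffolds
    return fraction, mean_scaffold_bp


fraction, mean_scaffold_bp = assembly_summary(
    ASSEMBLED_MBP, ESTIMATED_GENOME_MBP, SCAFFOLD_COUNT
)
print(f"Fraction of estimated genome assembled: {fraction:.2%}")
print(f"Mean scaffold length: {mean_scaffold_bp:,.0f} bp")
```

In other words, the draft represents roughly nine tenths of the estimated genome, with scaffolds averaging somewhat under 18 kbp.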

Blight resistance QTL Sequences (v1.0)

Sequence probes were designed from the three blight QTL regions on the consensus genetic map (Kubisiak et al., 2012) and used to probe the BAC libraries. Using the probe results and the physical map (Fang et al., 2012), 190 BACs spanning the QTL regions were selected for sequencing. In total, 503 Mb of 454 single-end reads and 3.8 Gb of MiSeq paired-end reads were generated. The assembly was created with the software Newbler (Margulies et al., 2005) and additional scaffolding was completed with SSPACE (Boetzer et al., 2011). Gene prediction and annotation were completed with GMOD’s software tool MAKER (Holt and Yandell, 2011) using the chestnut transcriptome sequences, Prunus persica protein sequences and Arabidopsis thaliana protein sequences.
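
To put the read totals in perspective, the sketch below adds up the raw sequence yield and divides it by the number of BACs. It assumes 1 Gb = 1,000 Mb, and all names are hypothetical. The result is raw yield per BAC, not sequencing depth: the combined length of the QTL regions and the BAC insert sizes are not stated here, so coverage cannot be derived from these figures alone.

```python
# Read yields and BAC count quoted in the text; names are illustrative.
READS_454_MB = 503.0     # 454 single-end reads, in Mb
READS_MISEQ_MB = 3800.0  # MiSeq paired-end reads: 3.8 Gb, assuming 1 Gb = 1,000 Mb
BAC_COUNT = 190          # BACs selected for sequencing


def raw_yield(reads_454_mb, reads_miseq_mb, bacs):
    """Return (total raw yield in Mb, mean raw yield per BAC in Mb)."""
    total_mb = reads_454_mb + reads_miseq_mb
    return total_mb, total_mb / bacs


total_mb, per_bac_mb = raw_yield(READS_454_MB, READS_MISEQ_MB, BAC_COUNT)
print(f"Total raw yield: {total_mb:,.0f} Mb")
print(f"Mean raw yield per BAC: {per_bac_mb:.1f} Mb")
```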