1a. Objectives (from AD-416):
1. Use state-of-the-art genomic tools to reveal the key genetic changes behind seventy years of soybean improvement.
2. Evaluate key soybean genotypes developed over seventy years using modern agronomic practices.
3. Reveal the changes in patterns of gene expression in soybean with an emphasis on yield and stress.
4. Reveal patterns of methylation and histone modification in the soybean genome.
5. Identify and evaluate transcription factors and small RNAs in the soybean transcriptome.

1b. Approach (from AD-416):
State-of-the-art genomic tools will be used to improve breeding strategies for soybean. Next-generation sequencing technologies will be used to 're-sequence' the genomes of landrace ancestors contributing to soybean germplasm, milestone cultivars representing 70 years of incremental increases in genetic yield potential, and 40 parents used in development of a Nested Association Mapping (NAM) population. The genomic sequences will be overlaid onto the whole-genome sequence of Williams 82. Chromosomal segments and allele combinations that have been selected for over decades of breeding will be identified. These breeder 'signatures' will tell us what we did that was 'right' and what we changed in the genome to achieve yield improvement. NAM parents and progeny extremes (high yield vs. low yield) will be selected, and changes in their transcriptomes will be evaluated in an attempt to identify metabolic pathways contributing to 'yield'. The same lines will be evaluated for epigenetic changes affecting expression by mapping their methylomes. Data will be analyzed and entered into SoyBase for public distribution. Personnel on the project will coordinate with the Department of Energy Joint Genome Institute in the development of an in-depth gene expression atlas.
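
The idea of a breeder 'signature' can be illustrated with a minimal sketch. It assumes that each line is inbred and carries either the reference allele (0) or the alternate allele (1) at a site, and it flags sites whose alternate-allele frequency differs sharply between ancestral and modern lines. The site names, the genotype values, the `min_shift` threshold, and the function names are all hypothetical; this is one simple way to look for such shifts, not the project's stated method.

```python
def allele_frequency(genotypes):
    """Fraction of lines carrying the alternate allele (coded 1) at one site."""
    return sum(genotypes) / len(genotypes)


def candidate_signatures(ancestral, modern, min_shift=0.5):
    """Return (site, ancestral_freq, modern_freq) for sites whose
    alternate-allele frequency changed by at least min_shift.

    ancestral, modern: dicts mapping a site name to a list of 0/1 calls,
    one call per line. min_shift is an illustrative threshold.
    """
    hits = []
    for site in sorted(ancestral):
        if site not in modern:
            continue  # site not genotyped in the modern panel
        old = allele_frequency(ancestral[site])
        new = allele_frequency(modern[site])
        if abs(new - old) >= min_shift:
            hits.append((site, old, new))
    return hits


# Hypothetical genotype calls: four ancestral and four modern lines.
ancestral = {"site_A": [0, 0, 1, 0], "site_B": [1, 0, 1, 0], "site_C": [1, 1, 1, 0]}
modern = {"site_A": [1, 1, 1, 1], "site_B": [1, 0, 0, 1], "site_C": [0, 0, 0, 0]}

for site, old, new in candidate_signatures(ancestral, modern):
    print(f"{site}: {old:.2f} -> {new:.2f}")
```

In this toy data, site_A rises toward fixation and site_C is lost, while site_B is unchanged and is not reported. A real analysis would also need to account for population structure and genetic drift before attributing a frequency shift to selection.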

3. Progress Report:

Preliminary sequence has been generated for three genotypes to assess DNA quality, sequencing success, and depth of coverage. Analyses of this sequence indicated that the project was ready to proceed. DNA from forty genotypes has now been provided to HudsonAlpha and is in the process of being sequenced. DNA from the remaining genotypes to be sequenced has been extracted and purified or is in the process of being purified. Using the sequences provided for the three test genotypes, we developed a pipeline for analyzing genome sequences generated by high-throughput sequencing technologies. This pipeline includes methodologies for analyzing sequence quality, aligning sequences relative to the Williams 82 genome sequence, and calling Single Nucleotide Polymorphisms (SNPs). Thus far, we have analyzed over 100 million sequence reads, collectively, from the cultivars Iowa3023, Richland and Mandarin. Of these, ~95% could be aligned relative to Williams 82. In a first pass, over 600,000 potential SNPs were called from each genotype; however, the quality of the SNPs still must be evaluated and confirmed. In addition, we are experimenting with different visualization platforms for making comparisons across cultivars. This project will permit scientists to retrospectively determine which regions of the genome, and which combinations of regions, breeders selected for that resulted in improved cultivars.
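
The three pipeline stages described above (quality assessment, alignment to a reference, and SNP calling) can be sketched in miniature. The example below is a toy illustration only: it assumes Phred+33 quality strings, uses a naive ungapped scan in place of a real short-read aligner, and calls a SNP wherever most of the reads covering a position disagree with the reference. The reference string, the reads, the thresholds, and all function names are hypothetical and are not taken from the project's actual pipeline.

```python
from collections import Counter, defaultdict


def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return sum(ord(c) - offset for c in quality) / len(quality)


def align(read, reference, max_mismatches=1):
    """Naive ungapped alignment: return the start position with the fewest
    mismatches, or None if every position exceeds max_mismatches."""
    best_pos, best_mm = None, max_mismatches + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mm = sum(1 for a, b in zip(read, window) if a != b)
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos


def call_snps(reads, reference, min_quality=20, min_depth=2, min_fraction=0.8):
    """Filter reads by quality, align them, and call SNPs from the pileup.

    Returns (snps, n_aligned, n_passed_quality), where each SNP is
    (position, reference_base, alternate_base, depth).
    """
    pileup = defaultdict(Counter)
    passed = aligned = 0
    for seq, qual in reads:
        if mean_phred(qual) < min_quality:
            continue  # stage 1: drop low-quality reads
        passed += 1
        pos = align(seq, reference)  # stage 2: place the read on the reference
        if pos is None:
            continue
        aligned += 1
        for i, base in enumerate(seq):
            pileup[pos + i][base] += 1
    snps = []
    for pos in sorted(pileup):  # stage 3: compare the pileup with the reference
        counts = pileup[pos]
        depth = sum(counts.values())
        base, n = counts.most_common(1)[0]
        if base != reference[pos] and depth >= min_depth and n / depth >= min_fraction:
            snps.append((pos, reference[pos], base, depth))
    return snps, aligned, passed


# Hypothetical 20-base reference and four 10-base reads.
reference = "ACGTTGCAAGGCTTACCGAT"
reads = [
    ("GTTGCTAGGC", "I" * 10),  # covers positions 2-11, carries T at position 7
    ("GCTAGGCTTA", "I" * 10),  # covers positions 5-14, carries T at position 7
    ("ACGTTGCAAG", "!" * 10),  # low quality, filtered out
    ("TTTTTTTTTT", "I" * 10),  # good quality, but does not align
]

snps, aligned, passed = call_snps(reads, reference)
print(f"{aligned} of {passed} quality-passing reads aligned")
for pos, ref_base, alt_base, depth in snps:
    print(f"position {pos}: {ref_base} -> {alt_base} (depth {depth})")
```

The depth and allele-fraction thresholds stand in for the evaluation step mentioned above: a first-pass call is only a candidate until it is supported by enough independent reads.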