Datasets

Preparation of EST data: Sequences were extracted from dbEST and subjected to quality control screening (removal of vector, E. coli, polyA, T, or CT sequence; minimum length = 100 bp; < 3% N). Preparation of transcript (ET) database: All sequences from the appropriate divisions of GenBank (including RefSeq) were extracted. Non-coding sequences were discarded, and cDNAs and coding sequences from genomic entries were saved. Sequences and related information (e.g. PubMed links) are stored in the qcGene database. Assembly: Cleaned EST sequences and non-redundant transcript (ET) sequences were combined and assembled into contigs using the Paracel Transcript Assembler Program. Each contig is assigned a TC (Tentative Consensus) number. TCs are consensus sequences based on two or more ESTs (and possibly an ET) that overlap for at least 40 bases with at least 94% sequence identity. These strict criteria help minimize the creation of chimeric contigs. TCs may comprise ESTs derived from different tissues. The best hits for TCs were assigned by searching the TC set against a non-redundant amino acid database (nraa) using BLAT. The top five hits by score were selected and are displayed for each TC. Caveats: TCs are only as good as the ESTs underlying them; unspliced or chimeric ESTs can produce unspliced or chimeric TCs. There is still redundancy in the TC set because sequences must match end to end and at a certain percent identity to be combined. Directionality of the TCs should not be assumed. Not all TCs contain protein-coding regions.
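The numeric thresholds quoted above (minimum length 100 bp, < 3% N, overlap of at least 40 bases at 94% identity or better) can be illustrated with a minimal Python sketch. This is not the Paracel Transcript Assembler's actual algorithm, which is not described here; the function names `passes_qc` and `overlap_qualifies` are hypothetical, and the overlap check assumes the two overlapping segments have already been aligned without gaps.

```python
def passes_qc(seq, min_length=100, max_n_fraction=0.03):
    """Return True if a cleaned EST meets the length and N-content thresholds.

    Hypothetical helper: vector, E. coli, polyA, T, and CT trimming are
    assumed to have been done before this check.
    """
    seq = seq.upper()
    if len(seq) < min_length:
        return False
    # The N fraction must be strictly below the threshold (< 3% N).
    return seq.count("N") / len(seq) < max_n_fraction


def overlap_qualifies(segment_a, segment_b, min_overlap=40, min_identity=0.94):
    """Return True if two pre-aligned, ungapped overlap segments meet the
    stated criteria: at least 40 bases and at least 94% identity.

    Hypothetical helper: a real assembler computes the alignment itself.
    """
    if len(segment_a) != len(segment_b):
        raise ValueError("overlap segments must have equal length")
    if len(segment_a) < min_overlap:
        return False
    matches = sum(a == b for a, b in zip(segment_a.upper(), segment_b.upper()))
    return matches / len(segment_a) >= min_identity
```

For example, a 50-base overlap with two mismatches (96% identity) would qualify under these assumptions, while the same overlap with four mismatches (92%) would not.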

A set of 1107 legume cross-species orthologous sequences (COS) was amplified from Lens culinaris (CDC Redberry and Eston) and L. ervoides (L01-827a and IG 72815). Sequences were aligned and SNPs were identified. A subset of 110 KASP assays was designed for use in L. culinaris. An Illumina GoldenGate array of 768 SNPs was designed for use in L. ervoides or in interspecies hybrid populations between L. culinaris and L. ervoides.
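As a rough illustration of the "aligned and SNPs identified" step, the Python sketch below scans a multiple alignment column by column and reports positions where more than one unambiguous base is present. It is an assumption-laden sketch, not the pipeline used for this dataset: the function name `find_snps` is hypothetical, the example sequences are invented, and real SNP calling would also weigh read quality and coverage.

```python
def find_snps(aligned):
    """Return (position, {name: base}) for each variable alignment column.

    `aligned` maps a sequence name to its aligned sequence; all sequences
    must have the same length, with '-' marking gaps. Gaps and ambiguous
    bases such as N are ignored when deciding whether a column varies.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    snps = []
    for pos in range(lengths.pop()):
        column = {name: seq[pos].upper() for name, seq in aligned.items()}
        alleles = {base for base in column.values() if base in "ACGT"}
        if len(alleles) > 1:
            snps.append((pos, column))
    return snps


# Invented toy alignment; only the accession names come from the text above.
example = {
    "CDC Redberry": "ACGTAC",
    "Eston":        "ACGTAC",
    "L01-827a":     "ACTTAC",
    "IG 72815":     "ACTTA-",
}
variable_positions = [pos for pos, _ in find_snps(example)]
```

In this toy alignment only the third column (0-based position 2) varies; the final column is not reported because the gap is ignored and the remaining bases agree.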

Mixture of eight cultivars with varying seed phenotypes: Indian Head, Commando, CDC LeMay, CDC Robin, and breeding lines 1899T-50 and 1788-4 (CDC, Univ. Saskatchewan, Saskatoon, Canada). All developmental stages of seeds and very young fertilized pods were harvested from mature plants and divided into the following lots: very young fertilized ovaries, young ovules, enlarging seeds, cotyledons of fully filled seeds, and seed coats of fully filled seeds. A cDNA library was made from a mixture of equal amounts of mRNA extracted from each of these tissues.