A single nucleotide substitution within the coding region of ALD5/YER073W resulted in an altered protein sequence. The start, stop, and reading frame remain the same, but protein residue 411 is now Glutamic Acid rather than Glycine.

Nucleotide changes within the coding region of CEM1/YER061C resulted in an altered protein sequence. The start, stop, and reading frame remain the same, but protein residue 367 is now Alanine rather than Arginine.

A single nucleotide substitution within the coding region of YEN1/YER041W resulted in an altered protein sequence. The start, stop, and reading frame remain the same, but protein residue 59 is now Alanine rather than Proline.

A single nucleotide substitution within the coding region of RAD4/YER162C resulted in an altered protein sequence. The start, stop, and reading frame remain the same, but protein residue 223 is now Valine rather than Glutamic Acid.

Nucleotide substitutions within the coding region of PTP3/YER075C resulted in an altered protein sequence. The start, stop, and reading frame remain the same, but protein residue 717 is now Alanine rather than Proline, residue 738 is now Lysine rather than Glutamine, and residue 857 is now Glutamine rather than Glutamic Acid.

The start of GIM4/YEL003W was moved 52 nt upstream, and an intron was added at relative coordinates 20-107, based on GenBank EF123144, Juneau et al. 2007, and Miura et al. 2006. According to Juneau et al. 2007, the intron is "inefficiently spliced" (splicing rate = 72%). The old coding coordinates were 148227..148598 (372 nt, 123 aa), and the new coding coordinates are 148175..148193,148282..148598 (1..19,108..424; 111 aa).
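The lengths quoted for GIM4/YEL003W can be re-derived from the coordinates alone. The following is a minimal sketch, not part of the annotation pipeline: the function names `cds_length` and `protein_length` are hypothetical, coordinates are treated as 1-based and inclusive, and the final codon of the coding sequence is assumed to be the stop codon.

```python
def cds_length(segments):
    """Total coding length in nt for inclusive (start, end) exon segments."""
    return sum(end - start + 1 for start, end in segments)


def protein_length(segments):
    """Residue count, assuming the last codon of the CDS is the stop codon."""
    nt = cds_length(segments)
    if nt % 3 != 0:
        raise ValueError(f"CDS length {nt} is not a multiple of 3")
    return nt // 3 - 1


# Coordinates taken from the GIM4/YEL003W entry above.
old_gim4 = [(148227, 148598)]
new_gim4 = [(148175, 148193), (148282, 148598)]

print(cds_length(old_gim4), protein_length(old_gim4))  # 372 123
print(cds_length(new_gim4), protein_length(new_gim4))  # 336 111
```

The new spliced coding sequence is 336 nt (19 nt + 317 nt), which corresponds to the 111 aa stated above once the stop codon is excluded.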

The proposal by Kellis et al. was re-examined in light of sequence data from S. kudriavzevii (another sensu stricto species, published by Cliften et al.). The S. kudriavzevii sequence supported the start codon suggested by Kellis et al., so the start site for UTR4/YEL038W was moved 42 nt (14 codons) downstream.

The previously annotated boundaries of CEN5 were adjusted to coincide with the 5' end of CDEI and the 3' end of CDEIII, to more accurately reflect current knowledge regarding centromere structure in Saccharomyces cerevisiae.

The start site of YER030W is being moved 21 bp downstream from 213415 to 213436 because the 5' SAGE data used by Zhang & Dietrich 2005 to study transcription start sites confirmed the initial suggestion by Kellis et al. 2003 that this change be made. The size of the predicted protein is reduced from 160 aa to 153 aa.

This non-coding RNA feature was annotated based on information from Fred Winston; the SRG1 TATA begins at position 322124, the transcription start sites are at positions 322208 and 322209, and the size of the transcript is approximately 550 bases as determined by Northern analysis.

Based on the automated comparison of closely-related Saccharomyces species by Kellis et al., the start site for PDA1/YER178W was moved 69 nt (23 codons) downstream. Evidence supporting this change includes: 1) This is the predicted start methionine in the majority of Saccharomyces species orthologs analyzed by Kellis et al. and/or Cliften et al.; 2) Significant sequence conservation begins abruptly at this predicted start methionine; 3) The predicted protein translated from the conserved methionine contains a predicted mitochondrial targeting signal sequence (using both MitoProt and Predotar), while the predicted protein translated from the currently annotated S. cerevisiae start codon does not.

Based on the automated comparison of closely-related Saccharomyces species by Kellis et al., the start site for FIR1/YER032W was moved 147 nt (49 codons) downstream. Evidence supporting this change includes: 1) This is the predicted start methionine in the majority of Saccharomyces species orthologs analyzed by Kellis et al. and/or Cliften et al.; 2) Significant sequence conservation begins abruptly at this predicted start methionine.

Based on the automated comparison of closely-related Saccharomyces species by Kellis et al., the start site for NPR2/YEL062W was moved 27 nt (9 codons) downstream. Evidence supporting this change includes: 1) This is the predicted start methionine in the majority of Saccharomyces species orthologs analyzed by Kellis et al. and/or Cliften et al.; 2) Significant sequence conservation begins abruptly at this predicted start methionine.

Based on the automated comparison of closely-related Saccharomyces species by Kellis et al., the start site for CIN8/YEL061C was moved 114 nt (38 codons) downstream. Evidence supporting this change includes: 1) This is the predicted start methionine in the majority of Saccharomyces species orthologs analyzed by Kellis et al. and/or Cliften et al.; 2) Significant sequence conservation begins abruptly at this predicted start methionine.

Based on the automated comparison of closely-related Saccharomyces species by Kellis et al., the start site for YER083C was moved 66 nt (22 codons) downstream. Evidence supporting this change includes: 1) This is the predicted start methionine in the majority of Saccharomyces species orthologs analyzed by Kellis et al. and/or Cliften et al.; 2) Significant sequence conservation begins abruptly at this predicted start methionine.

Note that both TEL05L and TEL05R have telomeric repeats (TEL05L-TR and TEL05R-TR), but they are missing from the genome annotation due to sequencing difficulties encountered during the initial genome sequencing efforts in the 1990s.

The YERCsigma4 element was initially mistakenly annotated as a separate sigma LTR, though its coordinates completely overlapped with the full length transposon YERCTy1-2, which contains delta elements, not sigma elements. Thus, YERCsigma4 has been deleted from the genome annotation.

The YERWdelta18 element was initially mistakenly annotated as a separate LTR, though its coordinates completely overlapped with the full length transposon YERCTy1-1. Thus, YERWdelta18 has been deleted from the database.

The intron of YER056C-A was moved 2 nucleotides upstream. The genomic sequence remains unchanged, and the coding sequence is only slightly altered. Relative coordinates change from 1-39..437-763 to 1-37..435-763, and chromosomal coordinates change from 270183-270145..269747-269421 to 270183-270147..269749-269421.

The start site of YEL012W was moved 159 nucleotides upstream, and an intron was added at relative coordinates 6-128. The stop remains unchanged. Relative coordinates change from 1-621 to 1-5..129-780, and chromosomal coordinates change from 131931-132551 to 131772-131776..131900-132551.
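The relative and chromosomal coordinates in entries like this one are related by a simple offset from the first nucleotide of the feature. The sketch below shows one way to check them; the helper names (`to_relative`, `segments_to_relative`, `introns`) are hypothetical, coordinates are assumed to be 1-based and inclusive, and Crick-strand features are assumed to be written from high to low chromosomal coordinate, as in the entries above.

```python
def to_relative(chrom_pos, feature_start, strand="W"):
    """Map a 1-based chromosomal position to a 1-based feature-relative one.

    feature_start is the chromosomal coordinate of the feature's first
    nucleotide (the highest coordinate for a Crick-strand feature).
    """
    if strand == "W":
        return chrom_pos - feature_start + 1
    if strand == "C":
        return feature_start - chrom_pos + 1
    raise ValueError("strand must be 'W' or 'C'")


def segments_to_relative(segments, strand="W"):
    """Convert exon segments, listed 5' to 3', to relative coordinates."""
    feature_start = segments[0][0]
    return [
        (to_relative(a, feature_start, strand), to_relative(b, feature_start, strand))
        for a, b in segments
    ]


def introns(relative_segments):
    """Return the gaps between consecutive exon segments."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(relative_segments, relative_segments[1:])
    ]


# New YEL012W chromosomal coordinates from the entry above (Watson strand).
new_yel012w = [(131772, 131776), (131900, 132551)]
relative = segments_to_relative(new_yel012w, strand="W")
print(relative)           # [(1, 5), (129, 780)]
print(introns(relative))  # [(6, 128)]
```

Passing `strand="C"` handles Crick-strand features, where relative position increases as the chromosomal coordinate decreases.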

YER108C and YER109C were originally annotated as two separate open reading frames, but it has been demonstrated that they correspond to the FLO8 gene, which contains a nonsense mutation in the reference strain S288C - a G to A transition at position 431, changing amino acid 144 from a Trp to a stop. Therefore, they have been fused into one reading frame with an internal stop codon.

YER060W-A/FCY22 was originally incorrectly annotated as being identical to its neighboring ORF YER060W/FCY21, at coordinates 274565-276151 (1587 nucleotides long). This error has been corrected, and the coordinates of YER060W-A/FCY22 are now 276570-278162 (1593 nt). Sequence files have been updated accordingly.

The coordinates of the tag sequences along the genome were determined, and each tag was classified into one of four categories: 1) class 1 - within an existing ORF, 2) class 2 - within 500 bp downstream of an existing ORF, 3) class 4 - opposite an existing ORF, or 4) class 3 - none of the above. The regions between two existing ORFs that contained one or more unique class 3 tags (item 4 above) were examined for potential coding sequences in which the unique tag was located either within the coding sequence or within 500 bp downstream of it. BLASTP analysis was then performed for each potential ORF meeting these criteria against the non-redundant (nr) NCBI dataset, and those with a P value exponent of -6 or less were analyzed further. The BLAST results were analyzed individually for each such ORF. Those potential ORFs that exhibited reasonable homology to other proteins, and whose matches were not based on homology to repetitive sequences alone, were identified and entered into SGD.
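The four tag classes described above can be sketched as a small classifier. This is an illustration under stated assumptions, not the procedure actually used: the `Orf` type, the `classify_tag` function, the single-position representation of a tag, the reading of "opposite" as "overlapping an ORF on the other strand", the order in which the classes are tested, and the toy ORF coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orf:
    name: str
    start: int   # lower chromosomal coordinate, inclusive
    end: int     # higher chromosomal coordinate, inclusive
    strand: str  # "W" (Watson) or "C" (Crick)


def classify_tag(pos, strand, orfs, window=500):
    """Return the tag class (1, 2, 3, or 4) for a tag at pos on strand."""
    same = [o for o in orfs if o.strand == strand]
    other = [o for o in orfs if o.strand != strand]

    # Class 1: inside an ORF on the same strand.
    if any(o.start <= pos <= o.end for o in same):
        return 1

    # Class 2: within `window` bp downstream (3') of a same-strand ORF.
    for o in same:
        if o.strand == "W" and o.end < pos <= o.end + window:
            return 2
        if o.strand == "C" and o.start - window <= pos < o.start:
            return 2

    # Class 4: overlaps an ORF on the opposite strand (assumed meaning).
    if any(o.start <= pos <= o.end for o in other):
        return 4

    # Class 3: none of the above.
    return 3


# Toy ORFs with made-up coordinates, for illustration only.
orfs = [Orf("ORF_A", 1000, 2000, "W"), Orf("ORF_B", 5000, 6000, "C")]
print(classify_tag(1500, "W", orfs))  # 1
print(classify_tag(2300, "W", orfs))  # 2
print(classify_tag(1500, "C", orfs))  # 4
print(classify_tag(3500, "W", orfs))  # 3
```

Only tags that fall into class 3 would go on to the intergenic ORF search and BLASTP step described above.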