A single G nucleotide was inserted within the ORF YLR407W near its 3' end, altering its coding sequence. The start and vast majority of the reading frame remain the same, but the C-terminus has changed and the annotated protein is now one amino acid shorter.

A single G nucleotide was inserted within the ORF YLR402W near its 5' end, necessitating a change in annotation to a downstream start codon. The sequence change makes the longer version of the protein no longer possible in the reference sequence. The stop and reading frame remain the same, but the annotated protein is now 108 amino acids shorter, less than half its original size.

A single nucleotide was inserted very near the 3' end of ORF YLL054C, altering its coding sequence. The start and most of the reading frame remain the same, but the C-terminus has changed and the annotated protein is now 74 amino acids longer.

A GG dinucleotide was inserted within the ORF SPH1/YLR313C near its 3' end, altering its coding sequence. The start and majority of the reading frame remain the same, but the C-terminus has changed and the annotated protein is now 131 amino acids shorter.
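The insertion entries above all follow the same logic: an inserted base shifts the reading frame downstream of the insertion point, so the C-terminal residues change and translation ends at whichever stop codon comes first in the new frame. The following is a minimal sketch of that effect, assuming the standard genetic code and an invented toy sequence (it is not any of the ORFs listed here); the helper names `translate` and `insert_base` are hypothetical.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate from the first base up to (not including) the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def insert_base(dna: str, index: int, base: str) -> str:
    """Return dna with `base` inserted before 0-based position `index`."""
    return dna[:index] + base + dna[index:]


# Invented toy sequence (not a real yeast ORF), including bases past the stop
# so that the shifted frame has somewhere to terminate.
original = "ATGGCTGAATTCTAAGCTGATT"
revised = insert_base(original, 9, "G")

print(translate(original))  # MAEF
print(translate(revised))   # MAEVLS
```

In this toy case the start and the first three residues are unchanged, while the C-terminus differs and the product is two residues longer. Whether a real insertion lengthens or shortens the protein depends only on where the next in-frame stop happens to fall.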

Nucleotide change(s) in the coding region of CDC3/YLR314C resulted in an altered protein sequence. The start, stop, and reading frame remain the same, but protein residue 431 is now Glutamine rather than Leucine.

Nucleotide change(s) in the coding region of EST2/YLR318W resulted in an altered protein sequence. The start, stop, and reading frame remain the same, but protein residue 162 is now Alanine rather than Valine.

Based on the automated comparison of related fungi, Kellis et al. and Brachat et al. suggest that the stop site for YLR401C be moved 176 nt downstream. SGD confirmed the insertion of a single G nucleotide and has updated the systematic sequence accordingly. As a consequence of this sequence change, DUS3/YLR401C was extended on the 3' end, altering the C-terminus and increasing the size of the predicted protein from 609 to 668 amino acids.

Brachat et al. 2003 predicted and confirmed the insertion of two nucleotides in Chromosome XII: a G after the G at 553548, and a C after the C at 553633. As a consequence of these sequence changes, HMX1/YLR205C was extended at the 5' end, increasing the size of the protein from 273 amino acids to 317 amino acids.

Brachat et al. 2003 predicted and confirmed the insertion of a single T nucleotide in YLR389C. As a consequence of this sequence change, YLR389C was extended at the 3' end, increasing the size of the predicted protein from 988 amino acids to 1027 amino acids.

Roemer et al. confirmed the deletion of two C nucleotides in SPH1/YLR313C in an S288C strain background. As a consequence of this sequence change, SPH1/YLR313C was extended at the 3' end and merged with an adjacent ORF, YLR312C-B, that was in the same frame. This change increased the size of Sph1p from 530 amino acids to 661 amino acids. Roemer et al. also confirmed that other laboratory strain backgrounds, such as W303-derived strains, do have the additional two nucleotides and encode the shorter protein.

A single T nucleotide was inserted before the T at chromosomal coordinate 199680 within the ORF ADE16/YLR028C. At the same time this sequence change was made, the stop site for YLR028C was moved downstream 110 nucleotides.

The chromosomal coordinates of the following ARS elements on Chromosome XII were updated based on Liachko et al. 2013 as part of SGD's genome annotation revision R64.2: ARS1212, ARS1213, ARS1215, ARS1218, ARS1220, ARS1226, ARS1234.

YLRCdelta24, a Ty1 LTR on Chromosome XII, was mistakenly annotated on the wrong strand (i.e., on Crick instead of Watson). Both the orientation and the feature name have been corrected, so that the LTR now has the systematic name YLRWdelta24 and is annotated on the Watson strand. The name YLRCdelta24 is being retained as an alias.

Based on N-terminal sequencing by Kitakawa et al., the intron and first exon have been removed from the annotation of MRPL15/YLR312W-A. The newly annotated start codon is located 97 nt downstream of the originally annotated start, within what was previously annotated as intron sequence. Special thanks to Ivo Pedruzzi and the team at Swiss-Prot for bringing this change to our attention.

Based on analyses of homology and synteny in Ashbya gossypii, Brachat et al. proposed an intron and 3' extension for YLR445W. The resulting ORF is in the same frame with the stop codon shifted 244 bp downstream; the protein has a 54-residue extension at the C-terminus.

Based on the alignment of orthologs in related fungi, Cliften et al. and Brachat et al. both proposed an intron and new 5' exon for YLR054C. The resulting ORF is in the same frame, but has a 212-residue extension at the N-terminus.

Based on the alignment of orthologs in related Saccharomyces species, Cliften et al. 2003 proposed an intron and new 5' exon for YLR199C. The resulting ORF is in the same frame, but with the start moved upstream 254 bp. The coding sequence was previously 663 nt long and is now 831 nt long. This change was reviewed and accepted by SGD curators.
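The figures in the YLR199C entry can be cross-checked with simple arithmetic. The sketch below assumes the stop codon is unchanged, that there is a single intron, and that no bases were inserted or deleted in the region; none of these is stated explicitly in the entry, and the function name `implied_intron_length` is hypothetical.

```python
def implied_intron_length(start_shift_nt: int, old_cds_nt: int, new_cds_nt: int) -> int:
    """Intron length implied when the start moves upstream and the stop stays put.

    Assumes a single intron and no inserted or deleted bases in the region.
    """
    added_coding_nt = new_cds_nt - old_cds_nt
    return start_shift_nt - added_coding_nt


added_nt = 831 - 663
print(added_nt, added_nt // 3)                # 168 56
print(implied_intron_length(254, 663, 831))   # 86
```

Under those assumptions, the new 5' exon contributes 168 coding nucleotides (56 codons), and the remaining 86 bp of the 254 bp upstream shift would be intron.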

The start site for AAT2/YLR027C was moved 42 nt (14 codons) downstream based on automated comparison of closely related Saccharomyces species. Evidence supporting this change includes: 1) This is the predicted start methionine in the majority of Saccharomyces species orthologs analyzed by Kellis et al. and/or Cliften et al.; 2) Significant sequence conservation begins abruptly at this predicted start methionine; 3) The first ATG is not conserved in the related species; 4) Sequencing of the entire protein reported the following sequence in the amino terminus: SATLFNNIELL (Cronin VB, et al.).

Based on the automated comparison of closely related Saccharomyces species, Kellis et al. suggest that the start site for TSR2/YLR435W be moved 132 nt (44 codons) downstream. This suggestion was reviewed and accepted by SGD curators. The numbering for both the nucleotides in the DNA coding sequence and the amino acids in the predicted protein has been changed accordingly. Evidence supporting this change includes: 1) This is the predicted start methionine in the majority of Saccharomyces species orthologs analyzed by Kellis et al. and/or Cliften et al.; 2) Significant sequence conservation begins abruptly at this predicted start methionine; 3) Although S. paradoxus, S. mikatae, and S. bayanus have upstream ATGs, there are insertions and deletions that cause frameshifts between the first and second ATGs; 4) Protein sequence comparison with the nr dataset shows no sequence similarity between the first and second ATG. All hits occur after the second ATG.

The automated comparison of closely related Saccharomyces species suggests that the start site for YLR012C be moved 69 nt (23 codons) downstream. This suggestion was reviewed and accepted by SGD curators. The numbering for both the nucleotides in the DNA coding sequence and the amino acids in the predicted protein has been changed accordingly. Evidence supporting this change includes: 1) This is the predicted start methionine in the majority of Saccharomyces species orthologs analyzed by Kellis et al. and/or Cliften et al.; 2) Significant sequence conservation begins abruptly at this predicted start methionine; 3) The first ATG and the translation frame are not conserved in the related species.
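When a start site moves downstream, positions quoted in the old numbering must be shifted by the same amount. The sketch below illustrates that renumbering for the three start-site changes above (42 nt, 132 nt, and 69 nt); the helper name `renumber_after_start_shift` is hypothetical and this is not SGD's own tooling.

```python
from typing import Optional


def renumber_after_start_shift(old_position: int, shift: int) -> Optional[int]:
    """Map a 1-based position in the old numbering to the new numbering.

    `shift` is how far the start moved downstream, in the same units as the
    position (nucleotides or codons). Returns None for positions that now lie
    upstream of the new start.
    """
    new_position = old_position - shift
    return new_position if new_position >= 1 else None


# Downstream start shifts reported above, in nucleotides.
shifts_nt = {"AAT2/YLR027C": 42, "TSR2/YLR435W": 132, "YLR012C": 69}
for name, nt in shifts_nt.items():
    print(name, nt, nt // 3)  # shift in nt, then in codons

# TSR2/YLR435W: old residue 45 is the new start methionine.
print(renumber_after_start_shift(45, 44))    # 1
print(renumber_after_start_shift(133, 132))  # 1 (nucleotide numbering)
print(renumber_after_start_shift(10, 44))    # None
```

Each shift is a whole number of codons, which is why the reading frame is unaffected by these changes.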

The start site of YLR316C was moved 601 nucleotides upstream, and at the same time two separate introns were added at new relative coordinates 110-177 and 230-285. The chromosomal coordinates for the coding region have changed from 765755-765264 to 766357-766249..766180-766129..766072-765265.

The start site of YLR093C was moved 147 nucleotides upstream, and at the same time an intron was added at relative coordinates 17-157. The chromosomal coordinates for the coding region have changed from 327268-326513 to 327416-327401..327259-326514.

The start site of YLR211C was moved 317 nucleotides upstream, and at the same time an intron was added at relative coordinates 19-77. The old chromosomal coordinates of the coding sequence were 564213-563791, and the new chromosomal coordinates of the coding sequence are now 564531-564514..564454-563792.
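The relative intron coordinates quoted in the three entries above can be recomputed from the new chromosomal exon coordinates. The sketch below assumes Crick-strand features (coordinates decrease from 5' to 3', as the C suffix in these names indicates) and 1-based relative positions counted from the new start; the function names `relative_introns` and `cds_length` are hypothetical.

```python
def relative_introns(exons):
    """Return 1-based intron ranges relative to the start codon.

    `exons` is a list of (start, end) chromosomal pairs for a Crick-strand
    feature, listed 5' to 3', so start > end within each exon.
    """
    gene_start = exons[0][0]
    introns = []
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        first = gene_start - (prev_end - 1) + 1    # first intron base
        last = gene_start - (next_start + 1) + 1   # last intron base
        introns.append((first, last))
    return introns


def cds_length(exons):
    """Total coding length in nucleotides across all exons."""
    return sum(start - end + 1 for start, end in exons)


new_exons = {
    "YLR316C": [(766357, 766249), (766180, 766129), (766072, 765265)],
    "YLR093C": [(327416, 327401), (327259, 326514)],
    "YLR211C": [(564531, 564514), (564454, 563792)],
}

for name, exons in new_exons.items():
    print(name, relative_introns(exons), cds_length(exons))
# YLR316C [(110, 177), (230, 285)] 969
# YLR093C [(17, 157)] 762
# YLR211C [(19, 77)] 681
```

The computed intron ranges match those given in the entries, and each spliced coding length is a multiple of three.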

The coordinates of the tag sequences along the genome were determined and each tag was classified into one of these four categories: 1) class 1 - within an existing ORF, 2) class 2 - within 500 bp downstream of an existing ORF, 3) class 4 - opposite an existing ORF, or 4) class 3 - none of the above. The regions between two existing ORFs that contained one or more unique class 3 tags (item 4 above) were examined for potential coding sequences in which the unique tag was located either within the coding sequence or within 500 bp downstream of it. BLASTP analysis was then performed for each potential ORF meeting these criteria against the non-redundant (nr) NCBI dataset, and those with a P value exponent of -6 or less were analyzed further. The BLAST results were analyzed individually for each such ORF. Those potential ORFs that exhibited reasonable homology to other proteins, and whose matches were not based on homology to repetitive sequences alone, were identified and entered into SGD.
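The four tag classes described above can be sketched as a small classifier. Several details are assumptions not stated in the text: that class 1 and class 2 require the tag to be on the same strand as the ORF, that "opposite" means overlapping an ORF on the other strand, and that the classes are checked in the order 1, 2, 4, 3. The `Orf` type and `classify_tag` function are hypothetical names, and the coordinates are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orf:
    start: int   # lower chromosomal coordinate
    end: int     # higher chromosomal coordinate
    strand: str  # "+" (Watson) or "-" (Crick)


def classify_tag(position: int, strand: str, orfs, window: int = 500) -> int:
    """Return the tag class (1, 2, 4, or 3) under the assumptions stated above."""
    in_downstream_window = False
    opposite_orf = False
    for orf in orfs:
        inside = orf.start <= position <= orf.end
        if inside and strand == orf.strand:
            return 1                      # class 1: within an existing ORF
        if inside:
            opposite_orf = True           # candidate for class 4
        elif strand == orf.strand:
            # Downstream means past the 3' end, which depends on the strand.
            if orf.strand == "+":
                distance = position - orf.end
            else:
                distance = orf.start - position
            if 1 <= distance <= window:
                in_downstream_window = True
    if in_downstream_window:
        return 2                          # class 2: within 500 bp downstream
    if opposite_orf:
        return 4                          # class 4: opposite an existing ORF
    return 3                              # class 3: none of the above


# Invented example annotation, not real Chromosome XII features.
orfs = [Orf(1000, 2000, "+"), Orf(5000, 6000, "-")]
print(classify_tag(1500, "+", orfs))  # 1
print(classify_tag(2300, "+", orfs))  # 2
print(classify_tag(1500, "-", orfs))  # 4
print(classify_tag(3500, "+", orfs))  # 3
```

Only class 3 tags would go forward to the search for unannotated coding sequences; the BLASTP filtering step is not modeled here.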

YLR391W was an ORF that was named in the original sequence of Chromosome XII. Since it was completely contained in a larger ORF with a better codon bias, it was deleted from SGD and GenBank in favor of the larger ORF, YLR391W-A.