Random DNA sequencing

Template preparation. Bacterial
colonies containing plasmids from the random shotgun library are plated
onto selective media directly from a library transformation. Use of fresh
colonies is important for achieving high-quality plasmid DNA sequencing
templates.

Automated cycle sequencing. The ABI Catalyst 800 is a sophisticated pipetting and
temperature-control robot developed specifically for DNA sequencing
reactions. Sequencing from the small-insert and lambda libraries will proceed
in parallel. Approximately 35,200 good sequences from the small-insert library
are required to achieve 8x coverage. To achieve 8x coverage with the
lambda clones, ~1000 lambda clones will be prepared and sequenced from
both ends. End sequences of these lambda clones will be used to define
the position of each lambda clone on the genome map and thus provide
confirmation of the structure of the assembly.
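
The coverage figures above follow from simple arithmetic: fold coverage is the total number of sequenced bases divided by the genome size. The sketch below illustrates that relationship only. The mean read length (450 bp) and genome size (1.98 Mbp) are hypothetical values that do not appear in this text; they were chosen solely so that the arithmetic reproduces the quoted figure of 35,200 reads at 8x. The uncovered-fraction estimate assumes idealized, uniformly random read placement (Poisson model).

```python
import math


def fold_coverage(num_reads: int, mean_read_len_bp: float, genome_size_bp: float) -> float:
    """Fold coverage = total sequenced bases / genome size."""
    return num_reads * mean_read_len_bp / genome_size_bp


def reads_needed(target_coverage: float, mean_read_len_bp: float, genome_size_bp: float) -> int:
    """Smallest number of reads that reaches the target fold coverage."""
    return math.ceil(target_coverage * genome_size_bp / mean_read_len_bp)


def expected_fraction_uncovered(coverage: float) -> float:
    """Poisson estimate of the fraction of bases hit by no read (idealized)."""
    return math.exp(-coverage)


if __name__ == "__main__":
    # ASSUMPTIONS (not stated in the text): 450 bp reads, 1.98 Mbp genome.
    READ_LEN_BP = 450
    GENOME_BP = 1_980_000

    print(reads_needed(8, READ_LEN_BP, GENOME_BP))
    print(fold_coverage(35_200, READ_LEN_BP, GENOME_BP))
    print(expected_fraction_uncovered(8))
```

The same function applies to clone coverage from the lambda library if the mean insert size is substituted for the read length.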

Assembly

Assembly of the genome sequence will be performed using TIGR Assembler
(Sutton et al., 1995), which simultaneously clusters and assembles
fragments of the genome using a best-match-first strategy. Potentially
chimeric fragments and fragments representing the boundaries of repetitive
regions are flagged based on partial mismatches at the ends of alignments
and excluded from the contig. TIGR Assembler recognizes potentially repetitive
regions (those present in more than one copy in the genome) based on 10-mer
oligonucleotide frequency. Contig building in repetitive regions is more
stringent than in non-repetitive regions to attempt to distinguish among
closely related copies of the repeat element. TIGR Assembler is designed
to accommodate clone size information coupled with sequencing from both
ends of each template. This information imposes a constraint: sequence
fragments from the two ends of the same template must point toward one
another in the assembly and lie within a certain number of base pairs of
each other (definable for each clone). TIGR Assembler handles repetitive
elements well when they are either shorter than the average clone length
in the library (generally 1.5-2.0 kbp) or less than 97% identical. The
accuracy of individual sequence fragments generated at TIGR is high; over
75% of the sequences generated in the M. jannaschii genome project have
fewer than 1% ambiguous base calls. This permits rigorous assembly criteria
to be used and reduces the impact of closely related repetitive elements.

Assembly across the ribosomal RNA operons or other repeat areas (IS elements)
that are longer than one kilobase may not be possible using TIGR Assembler
if the operons or the repeat structures are >97% similar to each other.
Each rRNA operon and repeat area will therefore be sequenced independently
by primer walking directly on the appropriate lambda DNA template.
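
To make the idea of flagging repeats by 10-mer oligonucleotide frequency concrete, the following is a minimal sketch and not TIGR Assembler's actual implementation. It counts every 10-mer across all reads, treating a 10-mer and its reverse complement as the same word because shotgun reads come from both strands, and then reports what fraction of a read's 10-mers occur more often than a threshold. The function names and the threshold are hypothetical; in practice the threshold would have to be set relative to the expected coverage.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, rc)


def iter_kmers(seq: str, k: int):
    """Yield canonical k-mers, skipping any that contain ambiguous bases."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            yield canonical(kmer)


def kmer_counts(reads, k: int = 10) -> Counter:
    """Count canonical k-mers over a collection of reads."""
    counts = Counter()
    for read in reads:
        counts.update(iter_kmers(read, k))
    return counts


def repeat_fraction(read: str, counts: Counter, threshold: int, k: int = 10) -> float:
    """Fraction of the read's k-mers seen more than `threshold` times overall."""
    kmers = list(iter_kmers(read, k))
    if not kmers:
        return 0.0
    return sum(1 for km in kmers if counts[km] > threshold) / len(kmers)
```

A read with a high repeat fraction would be a candidate for the more stringent contig-building criteria described above.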
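
The clone-end constraint described above (reads from the two ends of one template must point toward each other and fall within a clone-specific distance) can be sketched as a simple consistency check. This is an illustration under stated assumptions, not TIGR Assembler's code: the `Placement` record, its coordinate convention, and the insert-size bounds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """Position of one read in a contig (hypothetical record)."""
    start: int      # leftmost contig coordinate covered by the read
    end: int        # rightmost contig coordinate covered (exclusive)
    forward: bool   # True if the read aligns to the contig's forward strand


def mates_consistent(a: Placement, b: Placement, min_insert: int, max_insert: int) -> bool:
    """Check that two end reads from one clone satisfy the clone constraint."""
    # Order the two reads by position so the check does not depend on argument order.
    left, right = (a, b) if a.start <= b.start else (b, a)

    # The reads must point toward each other: left read forward, right read reverse.
    if not (left.forward and not right.forward):
        return False

    # The outer span of the pair approximates the clone insert length.
    span = right.end - left.start
    return min_insert <= span <= max_insert
```

For a small-insert clone with an assumed size range of 1.5-2.0 kbp, a call such as `mates_consistent(Placement(100, 600, True), Placement(1500, 2000, False), 1500, 2000)` would accept the pair, while a pair placed several kilobases apart or pointing away from each other would be rejected.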