Abstract

Tetrahymena thermophila is the best studied of the ciliates, a diversified and successful lineage of eukaryotic protists. Mirroring the way in which many metazoans partition their germ line and soma into distinct cell types, ciliates separate germ line and soma into two distinct nuclei in a single cell. The diploid, transcriptionally silent micronucleus undergoes meiosis and fertilization during sexual reproduction and determines the genotype of the progeny; in contrast, the expressed macronucleus contains many copies of hundreds of small chromosomes, determines the cell's phenotype, and is inherited only through vegetative reproduction. Here we demonstrate the power of HAPPY physical mapping to aid the complete assembly of T. thermophila macronuclear chromosomes from shotgun sequence scaffolds. The finished genome, one of only two ciliate genomes shotgun sequenced, will shed valuable additional light upon the biology of this extraordinary, diverse, and, from a genomics standpoint, as yet largely unexplored evolutionary branch of eukaryotes.

The genome of Tetrahymena thermophila and strategy for the assembly of the macronuclear genome. (a) The micronuclear genome consists of large chromosomes (only part of one chromosome is shown). Copies of these chromosomes are cut at chromosome breakage sequences (CBSs; blue), and internally eliminated sequences (IESs; red) are spliced out, to produce the smaller macronuclear chromosomes (b), which are capped with telomeres (hatched). (c) Shotgun sequencing of the macronuclear chromosomes yields many sequence contigs, some of which may consist of, or be prematurely truncated by the incorporation of, IESs from contaminating micronuclear DNA (red vertical lines). (d) Contigs can be linked to form scaffolds using read-pair information (heavy black lines). Sequences close to nontelomeric scaffold ends (filled triangles) plus some from within larger scaffolds (open triangles) are chosen for use as markers. (e) HAPPY mapping reveals physical links between markers, allowing the scaffolds to be grouped into superscaffolds (hoops indicate HAPPY links) and also confirming the correct assembly within some larger scaffolds.
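The grouping step in panel (e) can be illustrated with a minimal sketch, assuming that each marker is assigned to exactly one scaffold and that HAPPY links are available as marker pairs. The function name `group_scaffolds`, the marker and scaffold identifiers, and the union-find approach are all hypothetical choices made for illustration; they are not taken from the mapping software used in this work.

```python
from collections import defaultdict


def group_scaffolds(marker_to_scaffold, happy_links):
    """Group scaffolds into superscaffolds from marker-to-marker links.

    marker_to_scaffold: dict mapping marker name -> scaffold name.
    happy_links: iterable of (marker, marker) pairs judged to be linked.
    Returns a sorted list of sorted scaffold-name lists.
    """
    # Each scaffold starts as its own group (union-find parent pointers).
    parent = {s: s for s in marker_to_scaffold.values()}

    def find(s):
        # Follow parent pointers to the group root, halving paths on the way.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for m1, m2 in happy_links:
        a = find(marker_to_scaffold[m1])
        b = find(marker_to_scaffold[m2])
        if a != b:
            # A link between markers on different scaffolds merges their groups.
            parent[b] = a

    groups = defaultdict(set)
    for s in parent:
        groups[find(s)].add(s)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical example data: five markers on four scaffolds.
marker_to_scaffold = {
    "m1": "scfA", "m2": "scfA", "m3": "scfB", "m4": "scfC", "m5": "scfD",
}
# m1-m2 is an intrascaffold link; the other two join different scaffolds.
happy_links = [("m2", "m3"), ("m1", "m2"), ("m3", "m4")]

superscaffolds = group_scaffolds(marker_to_scaffold, happy_links)
print(superscaffolds)
```

In this toy input, the intrascaffold link (m1 to m2) does not change the grouping; links of that kind serve instead to confirm the assembly within a scaffold. The sketch yields only group membership, not the order or orientation of scaffolds within a superscaffold.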

Actual and predicted results from HAPPY mapping. For numbers of mapped markers between 100 and 520, the number of interscaffold HAPPY linkages predicted by simulation (darker gray bars; each value is the mean of 10 simulations) is compared with the number of linkages observed (lighter gray bars).

Plot of θ (inversely related to linkage) as a function of base pair distance for intrascaffold marker pairs. The scatter results from a combination of sampling error, fragment length heterogeneity, and imperfect typings for a few markers. Outliers are discussed in the text.
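As a rough illustration of how linkage between two markers might be scored, the sketch below assumes that each marker has been typed as present (1) or absent (0) across the same set of DNA samples, and computes a generic LOD score comparing the observed joint frequencies with those expected if the markers segregated independently. This is not the estimator of θ plotted here; the function name `cosegregation_lod` and the typing vectors are hypothetical.

```python
import math


def cosegregation_lod(typing_a, typing_b):
    """Generic LOD score for association between two 0/1 typing vectors.

    Compares the likelihood of the observed joint frequencies with the
    likelihood under independence. Higher values mean stronger
    co-segregation; 0 means no evidence of linkage.
    """
    if len(typing_a) != len(typing_b) or not typing_a:
        raise ValueError("typing vectors must be non-empty and equal in length")

    n = len(typing_a)
    counts = {(1, 1): 0, (1, 0): 0, (0, 1): 0, (0, 0): 0}
    for a, b in zip(typing_a, typing_b):
        counts[(a, b)] += 1

    # Marginal frequency with which each marker is scored as present.
    pa = sum(typing_a) / n
    pb = sum(typing_b) / n

    lod = 0.0
    for (a, b), k in counts.items():
        if k == 0:
            continue  # empty cells contribute nothing to the likelihood ratio
        observed = k / n
        expected = (pa if a else 1 - pa) * (pb if b else 1 - pb)
        lod += k * math.log10(observed / expected)
    return lod


# Hypothetical typings: two markers that always co-segregate...
lod_linked = cosegregation_lod([1, 1, 1, 1, 0, 0, 0, 0],
                               [1, 1, 1, 1, 0, 0, 0, 0])
# ...and two markers that segregate independently.
lod_unlinked = cosegregation_lod([1, 1, 0, 0], [1, 0, 1, 0])
print(round(lod_linked, 3), round(lod_unlinked, 3))
```

A score of this kind rises as markers co-segregate more often, whereas θ falls; sampling error and imperfect typings would add scatter to either measure.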