C. elegans presents a low level of molecular diversity, which may be explained by its selfing mode of reproduction. Recent work on the
genetic structure of natural populations of C. elegans indeed suggests a low level of outcrossing, and little geographic differentiation because of migration. The level and pattern
of molecular diversity among wild isolates of C. elegans are compared with those found after accumulation of spontaneous mutations in the laboratory. The last part of the chapter
reviews phenotypic differences among wild isolates of C. elegans.

Most studies on C. elegans use a single genetic background, that of the N2 strain. Its biological features as unravelled in the laboratory may be better
interpreted with more knowledge of their evolutionary origin and ecological significance. Moreover, the diversity of wild
isolates is a source of phenotypic variations that can be studied like mutant phenotypes.

In addition, the mode of reproduction of C. elegans makes it a great system to study the regulation, consequences and evolution of outcrossing versus selfing breeding systems.
Obvious questions concern the frequency of outcrossing, the geographic structure of populations (are populations locally clonal?)
and the fate of deleterious mutations. Although evolutionary and ecological studies of C. elegans were scarce until a few years ago, its short generation time, the ease of obtaining isogenic lines by selfing and the
ability to retain frozen stocks make it a model system with a future in evolutionary biology.

We will successively review what is currently known about (1) its molecular diversity and the genetic structure of populations,
and (2) its phenotypic diversity.

1. Molecular polymorphisms and population genetics

Early work on molecular polymorphisms in C. elegans was based on a set of natural isolates that were collected worldwide over many years, each kept as a single selfing isogenic
strain (available at CGC). These isolates come from North America (mostly California), Western Europe (including N2, the reference
strain, isolated in Bristol, U.K.) and Adelaide (Australia); one isolate, CB4856, was sampled in Hawaii and is used for SNP
mapping (see SNPs: introduction and 2-point mapping; Hodgkin and Doniach, 1997; Figure 1). These isolates are used as representatives of the molecular and phenotypic variation within the species, even though they
may miss a significant part of it, due to geographically restricted sampling. C. elegans could never be isolated in samples from Asia (unlike C. briggsae), but not much sampling was attempted in Africa or South America.

1.1. Low molecular diversity of C. elegans

The level and pattern of genetic diversity depend on the rate and pattern of mutation, which can be measured after accumulation
of spontaneous mutations in the laboratory. Mutation accumulation (MA) lines are established by picking a single individual
at each generation, therefore limiting natural selection as much as possible (Figure 2). They represent the raw mutational process, whereas the diversity observed among wild isolates is the result of mutation
as well as population structure and phenotypic selection. We first review for mitochondrial and nuclear DNA the observed molecular
diversity in wild isolates, then the mutation rate and pattern in MA lines, and compare them to reveal the action of selection
in natural populations.

Figure 2. Mutation accumulation lines. A. Mutation accumulation lines are obtained by transferring a single larva at each generation for many generations. Spontaneous
mutations accumulate, with minimal selection. Many lines are kept in parallel, and can be frozen at intervals to keep a record
of their evolution. Due to the accumulation of highly deleterious mutations, some lines are lost: from 100 original lines,
only 58 were kept up to 396 generations (Vassilieva et al., 2000). B. Frequency distribution between the different mutation accumulation lines of a fitness measure (the intrinsic rate of
increase) after 214 generations (Vassilieva et al., 2000). C. Frequency distribution of body size between the same lines at an average of 152 generations (Azevedo et al., 2002).

1.1.1. Polymorphism in the mitochondrial genome

Denver et al. (2003) sequenced the nearly complete mitochondrial genome of 27 C. elegans natural isolates. Mitochondrial sequences were found to cluster in two well-defined clades, called I and II (Figure 3). N2 and CB4856 both belong to clade I. Several isolates have the same mitochondrial DNA sequence as N2.

The mutation rate and pattern of mitochondrial DNA were assessed in mutation accumulation lines (Denver et al., 2000). Twenty-six mutations were recorded, 16 of which were base substitutions, yielding a rate of 9.7 × 10⁻⁸ mutations per site per generation (Table 1).

*: mutation rate is per base pair per generation for mitochondrial and nuclear sequences, and per locus per generation for
homopolymers and microsatellites.

Figure 3. Phylogenetic analysis of C. elegans hermaphrodite lineages using nuclear and mitochondrial DNA sequences. Bootstrap values (maximum parsimony and maximum likelihood) are
indicated at each node. AB1 is the only strain whose placement differs in the two trees, suggesting that it is the product
of recombination between two divergent genotypes. A. Bootstrap consensus tree for C. elegans natural isolates, using 70 nuclear variable sites from Denver et al. (2003) and Koch et al. (2000). Note that in the latter set, most SNPs were found by sequencing CB4857. B. Unrooted phylogram of the C. elegans natural isolates using 11,443 bp of mitochondrial DNA sequences. From Denver et al. (2003).

The comparison of the mutation pattern in MA lines versus wild isolates suggests that the excess of transitions over transversions
in wild isolates is due to a mutational bias. In contrast, indels and replacement mutations in coding regions are less frequent
among wild isolates than in MA lines, suggesting that they are counterselected (Table 1; Denver et al., 2000).

1.1.2. Polymorphism in the nuclear genome revealed by shotgun sequencing or AFLP

Whole-genome diversity was also investigated using AFLP (Amplified Fragment Length Polymorphism), a whole-genome fingerprinting
technique, on four populations from France (Barrière and Félix, 2005). For sampling, individuals (12–19/location) were picked as they crawled out of the substrate, avoiding as much as possible
post-sampling selection and ensuring a true representation of natural population diversity. Across the four French populations,
the AFLP fragment diversity is Hj = 4.9 (SE 0.9). An approximate conversion to nucleotide diversity gives π = 0.81 × 10⁻³ per nucleotide (Table 3).

Molecular diversity can be expressed as a pairwise distance between two genomes (Table 2). When more than two strains are compared, the standard measure of genetic diversity (Nei's genetic diversity) is the probability
that two randomly sampled alleles differ for the type of polymorphism considered (for example, nucleotide diversity,
noted π). This measure takes into account the frequency of each allele in the population of samples. The conversion of AFLP fragment
diversity to nucleotide diversity is approximate, but does not affect the ratio of local vs. global diversity estimates.
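The two measures described above can be illustrated with a minimal Python sketch. It is not the procedure used in any of the cited studies: the function names and the toy data are hypothetical, the gene diversity uses the usual small-sample correction n/(n − 1), and the nucleotide diversity is computed as the mean number of pairwise differences per site.

```python
from collections import Counter
from itertools import combinations


def gene_diversity(alleles, unbiased=True):
    """Probability that two alleles drawn at random from the sample differ."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two sampled alleles")
    freqs = [count / n for count in Counter(alleles).values()]
    h = 1.0 - sum(p * p for p in freqs)
    # The n / (n - 1) factor corrects for sampling without replacement.
    return h * n / (n - 1) if unbiased else h


def nucleotide_diversity(sequences):
    """Mean number of differences per site over all pairs of aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical toy data, for illustration only.
toy_alleles = ["A", "A", "A", "B"]
toy_sequences = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGA"]

h_toy = gene_diversity(toy_alleles)
pi_toy = nucleotide_diversity(toy_sequences)
print(f"gene diversity = {h_toy:.3f}, pi = {pi_toy:.4f} per site")
```

Because both functions weight each allele or sequence by its frequency in the sample, a rare divergent isolate contributes less to the diversity estimate than it does to a pairwise distance.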

The nuclear mutation rate was estimated in MA lines at 2.1 × 10⁻⁸ per nucleotide site (or 2.1 mutations per haploid genome) per generation (Denver et al., 2004). Seventeen of the 30 observed mutations were indels, among which insertions predominated (13 out of 17). This contrasts with the
pattern observed among wild isolates in putatively neutral regions such as transposon pseudogenes, where deletions are 2.8
times more frequent than insertions (Witherspoon and Robertson, 2003); therefore, data obtained on putative pseudogenes should be interpreted with caution, as selection may not be fully suppressed. It
is also possible that the MA data, obtained in the N2 genetic background, are not representative of the mutational pattern
that gave rise to the present diversity in the species.
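The arithmetic behind these figures can be checked with a short Python sketch. The haploid genome size of about 100 Mb is an assumption introduced here (it is not stated in this section), and the variable names are hypothetical; the mutation counts and the 2.8 ratio are those quoted above.

```python
# Per-site nuclear mutation rate in MA lines (from the text).
PER_SITE_RATE = 2.1e-8
# ASSUMPTION: haploid genome size of roughly 100 Mb, not given in this section.
GENOME_SIZE_BP = 100e6

# Expected number of new mutations per haploid genome per generation.
per_genome_rate = PER_SITE_RATE * GENOME_SIZE_BP

# Indel spectrum in MA lines (from the text): 17 indels, 13 of them insertions.
ma_indels = 17
ma_insertions = 13
ma_deletions = ma_indels - ma_insertions
ma_deletion_to_insertion = ma_deletions / ma_insertions

# Deletion-to-insertion ratio in wild-isolate transposon pseudogenes (from the text).
wild_deletion_to_insertion = 2.8

print(f"mutations per haploid genome per generation: {per_genome_rate:.2f}")
print(f"deletion:insertion ratio in MA lines: {ma_deletion_to_insertion:.2f}")
print(f"deletion:insertion ratio in wild pseudogenes: {wild_deletion_to_insertion:.2f}")
```

Under this assumption the per-genome figure matches the 2.1 quoted in the text, and the deletion-to-insertion ratio in MA lines is several-fold lower than the one seen in wild pseudogenes, which is the contrast the paragraph draws.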

Stewart et al. (2004) sequenced, in 20 wild isolates, 31 chemoreceptor genes from the srh and str families that are pseudogenes in the N2 strain. For 10 of these genes, functional alleles are present in some wild isolates.
Half of these genes apparently became non-functional by a substitution creating a stop codon, and half through a deletion
affecting the reading frame. Five of the putatively functional alleles were only found in CB4856 and JU258. The other polymorphisms
observed in the same gene sequences also make CB4856 and JU258 the most divergent strains compared to N2, with 25 and 14 polymorphisms,
respectively (compared to 4 polymorphisms in Bergerac/RW7000, the third most divergent strain). Overall, chemoreceptor genes
appear to have a 5- to 10-fold higher level of polymorphism than the whole genome average, and different genes showed widely
different polymorphism levels, from 15 polymorphisms in 9.4 kb to none in 17 kb.

1.1.4. Microsatellite polymorphism

Microsatellites are short repeated sequences (ranging from two to six nucleotides), which are highly prone to slippage during
replication, and therefore highly mutable. Microsatellite mutation rates in MA lines range from 8.93 × 10⁻⁵ to 1.85 × 10⁻² per locus per generation, depending on the size and nature of the repeats (Frisse, 1999; Sivasundar and Hey, 2003). Mononucleotide microsatellites, also called homopolymeric loci, are sequences of eight or more repeats of a single nucleotide;
30 length change mutations and one base substitution were identified in homopolymers in MA lines (Denver et al., 2004), giving a mutation rate range from 4.5 × 10⁻⁶ per generation for A/T homopolymers to 1.5 × 10⁻⁴ per generation for long G/C homopolymers (Table 1). 2% of homopolymers are found in protein-coding sequences, which results in an average of 0.03 homopolymer mutations in exon
sequence per genome per generation.

Sivasundar and Hey (2003) genotyped 20 dinucleotide microsatellite loci in 23 wild isolates of C. elegans. Only half of these loci were polymorphic, with a mean of 6.2 alleles (range 3–12). The low diversity (or lack of it) in
some microsatellites correlates with the low number of true repeats, as pointed out by Haber et al. (2005).

Haber et al. (2005) sequenced 10 tri- or tetra-nucleotide repeats in 58 natural isolates. All but one were found to be polymorphic, with an average
allele number of seven.

1.1.5. Effective population size

An important parameter in population genetics is the effective population size (Ne), which represents the number of individuals of an equivalent ideal random-mating population under mutation and genetic drift,
or the number of individuals contributing to the next generation. It is calculated from the mutation rate μ and the observed
molecular diversity π using the relation π = 4Neμ. For a given mutation rate, it is thus equivalent to a measure of molecular diversity. In practice, Ne is reduced by inbreeding or non-random mating due to geographical structure, and is almost always smaller than the number
of breeding individuals (Hartl and Clark, 1997; Charlesworth et al., 2003).

Estimates of Ne from microsatellites for the whole species range from 200 to 44,000, depending on which locus is considered (Sivasundar and Hey, 2003). From AFLP data, it was estimated to be 9,600 for the four samples over France (Barrière and Félix, 2005). These numbers are surprisingly small for such a widespread species with large local populations (densities of C. elegans can reach 10 individuals / g in a compost heap; Barrière and Félix, 2005; and on decaying mushroom compost numbers are far larger; A. Cutter and E. Dolgin, pers. comm.).
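The relation π = 4Neμ can be inverted to recover the orders of magnitude quoted for Ne. The following Python sketch is only an illustration: it assumes that the nuclear mutation rate measured in MA lines (2.1 × 10⁻⁸ per site per generation) is the appropriate value of μ for the AFLP-based estimates, and the function name is hypothetical.

```python
def effective_population_size(pi, mu):
    """Invert pi = 4 * Ne * mu to obtain Ne from diversity and mutation rate."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (4.0 * mu)


# ASSUMPTION: the MA-line nuclear mutation rate is used as mu.
MU_NUCLEAR = 2.1e-8

# Nucleotide diversity values quoted in the text for the French samples (AFLP).
diversity_estimates = {
    "four French populations pooled": 0.81e-3,
    "least diverse local population": 0.02e-3,
    "most diverse local population": 0.43e-3,
}

ne_estimates = {
    label: effective_population_size(pi, MU_NUCLEAR)
    for label, pi in diversity_estimates.items()
}

for label, ne in ne_estimates.items():
    print(f"{label}: Ne is about {ne:,.0f}")
```

With these inputs the results round to the values reported in the text for the French samples, which suggests, without proving, that the published estimates were derived in this way.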

1.1.6. Conclusion: Low genetic diversity of C. elegans

At least in the locations sampled so far, C. elegans presents a low genetic diversity, comparable to that of humans and 20-fold lower than that found in Drosophila melanogaster. Its molecular mutation rate is not particularly low. The level of genetic diversity depends on the structure of natural
populations. For example, Homo sapiens has a low nucleotide diversity, due to a recent bottleneck in the population. The pattern of microsatellite diversity of
C. elegans does not show evidence for a recent global bottleneck followed by population expansion (Sivasundar and Hey, 2003). The reproductive mode also influences genetic diversity, which is expected to be reduced in partial selfers such as C. elegans or Arabidopsis thaliana: selfing results in a two-fold reduction in effective population size, and genetic diversity is further reduced by the fact
that selection at one locus results in the selection of the entire genome under genetic linkage (Hartl and Clark, 1997). We will now review the geographic structuring and the reproductive mode of C. elegans.
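The two-fold reduction mentioned above can be made explicit with a standard textbook approximation, which is not taken from this chapter: at equilibrium, a population with selfing rate s has an inbreeding coefficient F = s / (2 − s), and its effective size is reduced to N / (1 + F). The Python sketch below uses hypothetical function names and ignores the additional reduction caused by linked selection.

```python
def inbreeding_coefficient(selfing_rate):
    """Equilibrium inbreeding coefficient F = s / (2 - s) under partial selfing."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing rate must lie between 0 and 1")
    return selfing_rate / (2.0 - selfing_rate)


def effective_size_under_selfing(census_size, selfing_rate):
    """Effective size N / (1 + F); linked selection is not taken into account."""
    return census_size / (1.0 + inbreeding_coefficient(selfing_rate))


# Hypothetical census size, for illustration only.
for s in (0.0, 0.5, 0.99, 1.0):
    ne = effective_size_under_selfing(10_000, s)
    print(f"selfing rate {s:.2f}: Ne = {ne:,.0f}")
```

Complete selfing (s = 1) gives F = 1 and halves the effective size, so selfing alone cannot account for a 20-fold lower diversity than in an outcrossing species; the remainder must come from linkage effects and population dynamics, as discussed in the text.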

1.2. Population structure and geographic differentiation

1.2.1. Diversity at different spatial scales

When the natural isolates are partitioned by world region, most of the genetic variation is found within a continent: for
example, the microsatellite genetic diversities within North America and Europe are 0.83 and 0.85, respectively, which
is close to the 0.92 diversity over the whole dataset (Haber et al., 2005).

At a local scale, Haber et al. (2005) and Sivasundar and Hey (2005) found microsatellite polymorphism within compost heaps in Germany, and soil samples in California, respectively. Effective
population size estimates range from 750 to 12,000 in different local populations. In these two sets of samples, individuals
were isolated after several (five to 24) days in the laboratory and may therefore have been the progeny (F1–F5) of wild individuals
and thus may not faithfully represent diversity in the corresponding natural populations.

Barrière and Félix picked independent wild-born individuals directly from the substrate. Using AFLP analysis, estimates for
within-population nucleotide diversity range from 0.02 × 10⁻³ to 0.43 × 10⁻³ per nucleotide for different populations (Ne = 200–5,100; Table 2). The within-sample diversity is spread over a large range, from almost null to a level of diversity comparable to the global
diversity (and to the distance between N2 and CB4856). In cases where the intra-sample diversity was low, a diversity of 0.80
× 10⁻³ was recovered when three samples taken 10–30 meters apart were considered (Barrière and Félix, 2005).

Thus, local diversity seems to remain high down to the scale of a few cm³. In some samples, it then breaks down, probably as a consequence of local bottlenecks followed by selfing of a single individual
(see section 1.4). The large variance in within-sample diversity may also reflect a lack of demographic stability in some populations, with
transient bottlenecks that reduce genetic diversity. Together with known ecological features of C. elegans, this suggests that metapopulation dynamics (many small populations, experiencing frequent extinctions and de novo colonizations) could apply to its natural populations.

1.2.2. Geographic structure of populations

Local diversity may be the result of local mutation or of input of alleles through migration, which can be distinguished by
measuring the frequency of different alleles in different populations.

Allelic frequencies correlate only weakly with geographic origin among the sampled world regions (Western Europe, North America,
Australia, Hawaii, Madeira; Sivasundar and Hey, 2003; Haber et al., 2005). Indeed, isolates from the same region can be significantly divergent, whereas very similar genotypes are found on different
continents (Hodgkin and Doniach, 1997; Koch et al., 2000). At a smaller scale in the four populations sampled in France, for more than half of polymorphisms within each population,
both alleles are also found outside this population (Barrière and Félix, 2005), suggesting that local diversity is fostered by either a large population size at population foundation, or later migration.
Moreover, no correlation was found between geographical and genetic distances (Sivasundar and Hey, 2003; Barrière and Félix, 2005), thus further suggesting that the migration rate is large enough that isolation by distance does not contribute substantially.

These results suggest that C. elegans is a better migrant than previously thought. The dauer larva is likely to be a dispersal stage, through its association with
other animals (see The phylogenetic relationships of Caenorhabditis and other rhabditids). Because C. elegans is easily found within gardens and compost heaps, human activity may also be a major factor in its present migration patterns
and distribution.

Two wild isolates however stand out in terms of level of divergence and specific alleles, namely CB4856 and JU258. 70% of
the SNPs identified in CB4856/Hawaii were not found in nine other strains, indicating that this strain has significantly diverged
(Koch et al., 2000). JU258/Madeira was not included in the earlier analyses, but now appears to share specific polymorphisms with CB4856 (Stewart et al., 2004; Haber et al., 2005). Much diversity may still be in store in undersampled parts of the world.

Graustein et al. (2002) compared nucleotide diversity of tra-2, glp-1 and COII in three species (Figure 4). The two selfing species, C. elegans and C. briggsae, show low levels of diversity, with C. briggsae being more polymorphic than C. elegans, while the outcrossing species, C. remanei, is highly polymorphic, from one to two orders of magnitude above C. elegans.

Figure 4. Silent site nucleotide diversity for three genes (COII, glp-1, and tra-2) in C. elegans, C. briggsae, and C. remanei. Open circles represent estimates of silent site nucleotide diversity that are based on pairwise sequence comparisons. Error
bars represent the 95% confidence intervals for this estimate. The self-fertilizing species (C. elegans and C. briggsae) always harbor less variation than the cross-fertilizing species (C. remanei). This difference between mating systems is more pronounced for the two nuclear genes (glp-1 and tra-2) than for the mitochondrial gene (COII). From Graustein et al. (2002).

Similarly for the odr-3 gene, nucleotide diversity was found to be 0.08 × 10⁻³ per nucleotide in 10 C. elegans isolates, 1.2 × 10⁻³ in C. briggsae, and 12.9 × 10⁻³ in C. remanei, including one amino-acid replacement. The diversity within a single local C. remanei population is two orders of magnitude higher than in the whole C. elegans species (Jovelin et al., 2003).

Interestingly, other more distant hermaphroditic nematode species, Pristionchus pacificus (Srinivasan et al., 2001) and Oscheius tipulae (D. Baïlle and M.-A. F., unpublished), show much higher nucleotide diversity than C. elegans and C. briggsae. The mode of reproduction is clearly not the sole determinant of genetic diversity, and population dynamics is likely to
be an important factor.

This low intraspecific diversity contrasts with a large divergence between the three species: 9.4% between C. elegans and C. briggsae, and 11.6 (Jovelin et al., 2003) to 11.9% (Thomas and Wilson, 1991) between C. briggsae and C. remanei, about 10-fold higher than intraspecific variation of C. remanei. The turnover of nucleotides at neutral sites is almost saturated between these different species (Shabalina and Kondrashov, 1999), which renders molecular evolution tests based on synonymous mutation rate difficult. On the other hand, polymorphism within
C. elegans is so low that a large amount of sequence information is needed to examine DNA polymorphism patterns, and the pattern observed for any single
gene may not be very meaningful.

1.4. Outcrossing versus selfing in C. elegans

C. elegans reproduces through self-fertile XX hermaphrodites and facultative XO males (see Introduction to sex determination). Hermaphrodites first produce sperm, then oocytes, through meiotic divisions; hermaphrodites cannot fertilize each other, so
males are required for outcrossing to occur. In laboratory conditions, hermaphrodites are limited by their sperm production
to about 300 self-progeny, whereas males may have a much larger number of progeny. In outcrossing events, male sperm outcompetes
hermaphrodite sperm for fertilization, and in C. briggsae (AF16 strain), X-bearing male sperm is more competitive than non-X-bearing male sperm (LaMunyon and Ward, 1995; LaMunyon and Ward, 1997).

The outcrossing frequency depends on the rate of spontaneous male production and the mating efficiency. Males can be produced
by (i) non-disjunction of the X chromosome at meiosis, or (ii) mating of an XO male with an XX hermaphrodite.

1.4.1. Outcrossing in the laboratory

In laboratory conditions, spontaneous non-disjunction of the X chromosome occurs at a low rate (about 1/1,000), which is subject
to genetic variation (Hodgkin and Doniach, 1997) and is altered by environmental factors such as temperature (Nigon, 1949).

Males from other strains of C. elegans, such as CB4855 and CB4856, have higher mating ability (Hodgkin and Doniach, 1997; J. Hodgkin, pers. comm.). Moreover, we cannot generalize from these observations to wild individuals in their natural environment.

1.4.2. Outcrossing in natural populations

Outcrossing in the wild can be detected by several methods:

Recombination of alleles at different loci

The most sensitive method for detecting outcrossing is based on reassortments of alleles at different loci in the genome.
If C. elegans only reproduced by selfing, different loci would follow the same tree lineage as mitochondrial DNA, and the whole genome
would behave as a single genetic linkage block. Indeed, a partial but significant correlation between phylogenies built from
either nuclear or mitochondrial sequence data (Figure 3) suggests that outcrossing occurs, but is rare in C. elegans (Denver et al., 2003).

Linkage disequilibrium is a quantitative measure of association between polymorphic loci in a population; since the recombination
frequency increases with outcrossing, linkage disequilibrium decreases as outcrossing becomes more frequent. In Drosophila melanogaster populations, linkage disequilibrium only extends over 100 bp–1 kb. Within local C. elegans populations, linkage disequilibrium was found to be high for randomly chosen loci in the samples from Northwest Germany (Haber et al., 2005) and France (Barrière and Félix, 2005). The outcrossing rate can be estimated from linkage disequilibrium and effective population size. In two populations sampled
in France, it was estimated at one outcrossing every 7,400 (95% confidence interval: 6,000–19,000) and 17,400 (95% confidence interval:
1/3,200 to complete selfing) generations, suggesting that outcrossing is exceptional.
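As an illustration of what is being measured, the Python sketch below computes two common linkage disequilibrium statistics, D and r², for a pair of biallelic loci. It is a generic textbook calculation, not the estimator used in the cited studies; the function name and the toy haplotypes are hypothetical, and it assumes that two-locus haplotypes can be read directly from each (largely homozygous) individual.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic loci.

    `haplotypes` is a list of (allele_at_locus_1, allele_at_locus_2) pairs.
    The alleles carried by the first haplotype are used as reference alleles.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes given")
    ref_1, ref_2 = haplotypes[0]
    p_1 = sum(h[0] == ref_1 for h in haplotypes) / n
    p_2 = sum(h[1] == ref_2 for h in haplotypes) / n
    p_12 = sum(h[0] == ref_1 and h[1] == ref_2 for h in haplotypes) / n
    # D is the excess of the reference haplotype over its expectation
    # under independent association of the two loci.
    d = p_12 - p_1 * p_2
    denominator = p_1 * (1 - p_1) * p_2 * (1 - p_2)
    if denominator == 0:
        raise ValueError("both loci must be polymorphic in the sample")
    return d, d * d / denominator


# Hypothetical samples: complete association versus free reassortment.
clonal_sample = [("A", "B")] * 5 + [("a", "b")] * 5
mixed_sample = [("A", "B"), ("A", "b"), ("a", "B"), ("a", "b")]

d_clonal, r2_clonal = linkage_disequilibrium(clonal_sample)
d_mixed, r2_mixed = linkage_disequilibrium(mixed_sample)
print(f"clonal sample: D = {d_clonal:.2f}, r^2 = {r2_clonal:.2f}")
print(f"mixed sample:  D = {d_mixed:.2f}, r^2 = {r2_mixed:.2f}")
```

A purely selfing population made of a few coexisting lineages resembles the first toy sample, with r² close to 1 even between unlinked loci, whereas regular outcrossing would push unlinked loci toward the second.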

Given the frequency of two alleles at a given locus in a population, the outcrossing frequency can be calculated from the
observed frequency of heterozygotes (assuming non-assortative mating of different genotypes).

Sivasundar and Hey (2005) isolated C. elegans individuals using an RNA interference-based assay (performed several days after soil sample collection) and measured heterozygote
frequency using microsatellite polymorphisms. From allele frequencies and the frequency of heterozygotes, they estimated the
outcrossing rate in the wild ancestors at the time of sampling. For different populations sampled in Southern California,
estimates range from complete selfing to one outcrossing every 5 generations, with a rate of one outcrossing every 7 generations
over the whole dataset (i.e. an outcrossing rate of 14%).
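The logic of the heterozygote-based estimate can be sketched in Python as follows. This is a standard equilibrium approximation rather than the exact method of the cited studies: the heterozygote deficit gives an inbreeding coefficient F = 1 − Hobs/Hexp, the selfing rate is s = 2F / (1 + F), and the outcrossing rate is 1 − s. The function names and the example numbers are hypothetical.

```python
def expected_heterozygosity(allele_freqs):
    """Heterozygote frequency expected under random mating."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def outcrossing_rate(observed_heterozygosity, allele_freqs):
    """Outcrossing rate from the heterozygote deficit, assuming equilibrium."""
    h_exp = expected_heterozygosity(allele_freqs)
    if h_exp == 0:
        raise ValueError("locus must be polymorphic")
    f = 1.0 - observed_heterozygosity / h_exp
    f = min(max(f, 0.0), 1.0)  # clamp sampling noise to the valid range
    selfing = 2.0 * f / (1.0 + f)
    return 1.0 - selfing


# Hypothetical locus with two equally frequent alleles and 12.5% heterozygotes.
t_example = outcrossing_rate(0.125, [0.5, 0.5])
print(f"estimated outcrossing rate: {t_example:.3f}")
```

In this hypothetical example the estimate corresponds to about one outcrossing event every seven generations, the same order as the overall figure reported for the Californian samples; a locus with no heterozygotes at all would give complete selfing.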

Among individuals directly sampled from four natural populations from France, heterozygotes could also be detected using microsatellites,
and the overall outcrossing rate was estimated at 1% (Barrière and Félix, 2005). The difference between the two studies may result from variation between natural populations, either in the ecological
context (for example seasonal or geographical variations in outcrossing rates) or in genetic variation in spontaneous-male-occurrence
or fertility.

These heterozygote-based estimates do not seem compatible with the much lower estimates inferred from linkage disequilibrium
data, and with the partial congruence of mitochondrial and nuclear phylogenies. However, the latter methods measure outcrossing
over a longer time period, taking into account population structure, changes in selfing rate over time and possible mating
preferences.

Presence of males

Outcrossing results in a high frequency of males in the next generation, since half of the cross-progeny is male. A survey
of individual worms sampled directly from the wild (13 samples on 3 sampling sites over 6 months) found only two males out
of 1,135 individuals (Barrière and Félix, 2005). This very low occurrence of males is fully compatible with the rate of spontaneous male production in the laboratory. These
populations thus reproduce mostly by selfing, but occasional bursts of sexual reproduction would remain undetected by this
criterion.
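The compatibility between the observed male count and spontaneous non-disjunction can be checked with a simple binomial calculation. The Python sketch below assumes that each sampled worm is independently male with the laboratory non-disjunction rate of about 1/1,000 quoted above; whether that rate applies in the wild is itself an assumption, and the function names are hypothetical.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def prob_at_least(k, n, p):
    """Probability of observing k or more successes in n trials."""
    return 1.0 - sum(binomial_pmf(i, n, p) for i in range(k))


N_WORMS = 1135           # individuals sampled (from the text)
MALES_OBSERVED = 2       # males found (from the text)
NONDISJUNCTION = 1 / 1000  # laboratory rate, ASSUMED to hold in the wild

expected_males = N_WORMS * NONDISJUNCTION
p_two_or_more = prob_at_least(MALES_OBSERVED, N_WORMS, NONDISJUNCTION)

print(f"expected number of spontaneous males: {expected_males:.2f}")
print(f"probability of at least {MALES_OBSERVED} males: {p_two_or_more:.2f}")
```

Under these assumptions about one male is expected in a sample of this size, and observing two or more is far from improbable, so the male count alone gives no evidence of outcrossing.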

1.4.3. Conclusion: A world traveler with rare sexual encounters

Overall, these results suggest that C. elegans overwhelmingly reproduces by selfing in the wild and that outcrossing is only occasional. Most likely, males produced by
spontaneous X-chromosome non-disjunction only rarely mate. Since male mating results in more males, sexual reproduction in
small populations is likely to occur by unstable bursts of variable duration. Recombination of genotypes is rendered possible
by the coexistence of diverse genotypes in the same location. Outcrossing is sufficiently infrequent relative to the mutation
rate that the genomes remain incompletely mixed over large chromosomal segments.

On an evolutionary timescale, the level of outcrossing has consequences. Since males of most C. elegans wild isolates retain the ability to mate (Hodgkin and Doniach, 1997), even if not very efficiently, the maintenance of genes required for male-specific development and behavior seems to imply
the existence of outcrossing in the wild. On the other hand, C. elegans hermaphrodites produce much less male-attracting pheromone than females of other Caenorhabditis species, consistent with the fact that outcrossing is not required (Chasnov and Chow, 2002; Lipton et al., 2004).

Other theoretical consequences of a high degree of selfing are the rapid elimination of highly deleterious recessive mutations
(because they are not masked in the heterozygous state when they appear) and on the other hand the accumulation of slightly
deleterious mutations (“Muller's ratchet”). In addition to outcrossing, the rapid evolution of compensatory mutations (see
below) could help to eliminate the phenotypic effect of deleterious mutations. The evolutionary feedback of phenotypes on
genotypes becomes highly relevant at this point.

2. Phenotypic diversity

Phenotypic evolution is the product of two processes, 1. the effect of mutational and environmental change on the phenotype,
followed by 2. the sorting of these phenotypes in the wild by selection and drift. As with DNA evolution, the effect of random
mutation on the rate and pattern of phenotypic variation can then be compared with the actual natural diversity.

2.1. Phenotypic change upon mutation accumulation

2.1.1. “Deleterious” mutations: a fitness measure in the laboratory

Any operational definition of fitness is only partial and may not include the most relevant parameters in the wild, which
vary with ecological contexts. In the laboratory, a “fitness” parameter can be computed from measures of progeny number and
generation time. This laboratory “fitness” decreases at a rate of about 0.1% per generation in mutation accumulation lines,
which is slower than in Drosophila (Keightley and Caballero, 1997; Vassilieva and Lynch, 1999; Vassilieva et al., 2000; Figure 2). Given the rate of fitness decrease, the molecular mutation rate and the proportion of constrained nucleotides as estimated from
a genome comparison with C. briggsae, it appears that most deleterious mutations that are eliminated in the wild cannot be detected in the laboratory, either
because they have a very mild deleterious effect that would nevertheless be relevant in large populations or because they
have a stronger effect in another environment (Davies et al., 1999; Denver et al., 2004; Estes et al., 2004).

Very strikingly, mutation accumulation lines recover remarkably rapidly when mutation and competition between resulting genotypes
are allowed through a population expansion. This rapid recovery over 10–80 generations most likely occurs through compensatory
extragenic mutations (Estes and Lynch, 2003). A population undergoing successive bottlenecks and expansions is thus able to recover despite the fixation of deleterious
alleles by random drift during the bottlenecks.

2.1.2. Specific phenotypes

Body size changes rapidly upon mutation of the N2 genome and most mutation accumulation lines show a decrease in body size
compared to N2 (Figure 2), as also observed after EMS-mediated mutagenesis, systematic RNAi screens or artificial selection experiments (Azevedo et al., 2002). Thus, from a given genotype, some phenotypic changes are easier than others. Such patterns of spontaneous variation can
then be taken into account when studying variations in natural populations (Dichtel-Danjoy and Félix, 2004).

2.2. Polymorphic phenotypes

Even though C. elegans is not very diverse at the molecular level, wild strains show phenotypic variation among them in laboratory conditions. Because
of the paucity of ecological data, their relevance in the natural context is at this point poorly understood.

Note that the N2 strain was maintained in the laboratory for years before being frozen and may not be the best representative
of a natural C. elegans isolate. The Bergerac derivatives were useful when molecular polymorphisms were mostly detected by transposon-tagging, yet
are not the current best choice, because their (sick) phenotypes may not be more representative of natural variation than
those induced in the laboratory. Indeed, cultures were maintained for over 20 years before freezing and show different
phenotypes (Hodgkin and Doniach, 1997). More generally, any phenotype found in a single wild isolate may not be very relevant.

“Laboratory fitness” and a tradeoff between progeny number and generation time

Progeny number and generation time under standard laboratory conditions vary among wild isolates (Abdul Kader and Côté, 1996; Hodgkin and Doniach, 1997). The brood size of the selfing hermaphrodite corresponds to the number of sperm produced prior to the irreversible onset
of oogenesis. This results in a tradeoff between sperm number and generation time: the more sperm produced, the longer the
generation time (Hodgkin and Barnes, 1991; Abdul Kader and Côté, 1996; Cutter, 2004). It remains to be assessed whether these phenotypic measures are relevant in the wild; moreover, the selection pressures
on brood size may vary with ecological conditions.

Many male-female species (including Caenorhabditis spp.) show an inbreeding depression when low-frequency recessive deleterious mutations are rendered homozygous by inbreeding,
and conversely, hybrid vigor (heterosis) when two strains are crossed. To date, no such phenomenon has been observed in
F1 hybrids between N2 and Bergerac for growth rate, fertility or longevity (Johnson and Hutchinson, 1993). Conversely, it would be interesting to study whether the divergence of some isolated populations (for example on islands)
may result in an “outbreeding depression” (partial speciation) if the genomes are artificially recombined.

Egg-laying: When well-fed, N2 hermaphrodites lay embryos at mid to late gastrulation, whereas many other isolates lay their eggs at
an earlier stage (M.-A. Félix, unpublished).

The best studied polymorphism in C. elegans concerns the Clumping and Bordering behaviors: in some C. elegans isolates such as CB4856, adults tend to aggregate on the border of the E. coli lawn of standard culture plates (whereas N2 adults aggregate only once the plate is just starved; Hodgkin and Doniach, 1997; de Bono and Bargmann, 1998; Figure 5; see Aggregation). This phenotypic polymorphism is caused by a genetic polymorphism at a single locus, called npr-1, which encodes a neuropeptide Y receptor homolog. The solitary feeders such as N2 possess a form of the receptor with high
activity, whereas the Clumpers have a recessive, reduced-activity variation in the amino-acid sequence (de Bono and Bargmann, 1998). Other Caenorhabditis species carry the Clumper form of the receptor (Rogers et al., 2003), although some do not clump, suggesting that the same phenotype may be obtained through variation at another locus (M. de Bono,
pers. comm.). Most strikingly, the evolutionary change within C. elegans is thus a gain of function of the NPR-1 protein. The ecological relevance of the Clumping phenotype as seen in the laboratory is not clear at this point. It is likely
that the effect is mediated by the lower oxygen levels at the border of the bacterial lawn and in a worm group (Gray et al., 2004). Variation in npr-1 also affects recovery of worms upon exposure to high ethanol levels (Davies et al., 2004).

RNAi and transposons: The Hawaiian strain CB4856 is partially resistant to RNAi inactivation of germ line-expressed genes, due to variation at
one major locus, ppw-1, which encodes a PAZ/PIWI-domain-containing protein (Tijsterman et al., 2002; see RNAi mechanisms). CB4856 displays several molecular polymorphisms in the ppw-1 gene, including a one-base pair deletion that results in an early stop codon (Tijsterman et al., 2002). The phenotypic significance in the wild is unknown, but may be related to transposon activity.

Transposon number and/or activity in the germ line are elevated in some “mutator” wild strains compared to N2 (Anderson, 1995; Egilmez et al., 1995; see Transposons in C. elegans). Some of the chromosomal regions with high transposon number are retained in several wild strains, suggesting that they
are not necessarily an evolutionary dead end.

Resistance to various environmental hazards: Resistance to a toxic environment is likely to be a fast-evolving and polymorphic character. For example, a population sampled
in a Southern California desert shows an unusual resistance to high temperatures, apparently due to the balancing of one heterozygous
locus (Hodgkin and Doniach, 1997). This natural population has unfortunately not been resampled since.

Pathogens are likely to be a major survival issue for C. elegans in the wild (see Interactions with microbial pathogens and Ecology of Caenorhabditis species). C. elegans strains differ in their response to the pathogenic bacterium Bacillus thuringiensis in terms of survival rate, infection levels, pumping rate and evasion behavior (Schulenburg and Muller, 2004). Resistance to the bacterium Serratia marcescens is found to be polymorphic within a local population of C. elegans and different strains of Serratia marcescens specifically affect different C. elegans strains (Schulenburg and Ewbank, 2004): it is possible that both species coevolve in an evolutionary arms race.

The cuticle composition varies between wild isolates of C. elegans, as can be detected by specific antibody staining. The variation maps to a locus called srf-1 (Politz et al., 1987), and a variant allele results in a weaker response to the cuticle-bound pathogenic bacterium Microbacterium nematophilum (Hodgkin et al., 2000).

Plugging. In many animal species, males leave a copulatory plug on the female's genitalia after mating, which is often thought to
prevent remating. N2 males do not leave a plug, whereas males of several other wild isolates of C. elegans do (Hodgkin and Doniach, 1997; Figure 5). This plug does not fully prevent remating by another male, but makes it less probable (Barker, 1994); it is possible that the plug has other unknown consequences. Polymorphisms in Plugging are found within local populations
(Barrière and Félix, 2005). The phenotypic polymorphism between N2 and several Plugging strains is the result of allelic variation at one major locus
called plg-1, which is yet to be identified at the molecular level; the N2 phenotype is recessive (Hodgkin and Doniach, 1997).

Dauer entry and exit. Entry into the dauer stage and exit from it are key developmental decisions in the life of C. elegans (see Ecology of Caenorhabditis species). The sets of environmental cues that influence them and the doses at which they operate and interact with each other vary:
for example, N2 is more sensitive to dauer pheromone than some other strains (Viney et al., 2003).

In C. briggsae, the morphology of the male tail sensory rays varies within and among different strains: in the reference strain AF16, ray
3 is displaced posteriorly (compared with the C. elegans pattern) and rays 3 and 4 are fused in nearly all animals, whereas in some other strains of C. briggsae only about half of the males show ray 3–4 fusion (Baird, 2001).

Penetrance of mutations

When laboratory mutations are mapped using SNPs of the CB4856 strain, it is a frequent finding that the penetrance is greatly
modified by the CB4856 background, which is usually bad news for the mapper, but demonstrates an underlying evolutionary change
in relevant genes.

2.3. Genetic analysis of a natural phenotypic variation

Phenotypic variation between two wild isolates may be due to variation at a single major locus, in which case it can be studied
genetically like a laboratory mutant phenotype. It is then relatively easy to determine the nature of the gene and the type
of alteration (cis-regulatory or protein change, loss or gain-of-function), as was done for the npr-1 locus.

The phenotypic variation may however be due to genetic variation at more than one major locus. This can be deduced either
from a phenotypic analysis of the F2s of a cross, or using Recombinant Inbred Lines (RIL; Figure 6). Each RI line has a random combination of the two parental genomes in the homozygous state - a great simplification compared
to F2 analysis because it removes all heterozygous combinations and allows the study of frequency traits on an isogenic population.
If the variation is monogenic, each parental phenotype will be found in half of the RI lines. Otherwise, the proportions may
be altered, or intermediate or extreme phenotypes be found. With several loci, the different alleles may interact additively
or epistatically. For example, the variation in P3.p division frequency between N2 and CB4857, and that in male ray position between the C. briggsae strains AF16 and HK104, are each the consequence of genetic variation at several loci (Bonnaud et al., 2002; Baird et al., 2005; Figure 6).
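
The expected proportions among RI lines can be illustrated with a minimal sketch. It assumes unlinked loci, no selection during inbreeding (so each line fixes either parental allele with probability 1/2 at every locus) and purely additive allelic effects. The locus names, the effect values and the function `ril_phenotype_frequencies` are hypothetical and serve only as illustration; they are not taken from any of the studies cited here.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def ril_phenotype_frequencies(effects):
    """Expected phenotype frequencies among fully homozygous RI lines.

    `effects` maps each hypothetical, unlinked locus to a pair
    (value of the parent-A allele, value of the parent-B allele).
    Assumptions: every line fixes either parental allele with
    probability 1/2 at each locus, independently of the other loci,
    and the phenotype is the sum of the allelic values (additive model).
    """
    loci = list(effects)
    n_combinations = 2 ** len(loci)
    counts = Counter()
    # Enumerate every homozygous combination of parental alleles.
    for choice in product((0, 1), repeat=len(loci)):
        phenotype = sum(effects[locus][allele]
                        for locus, allele in zip(loci, choice))
        counts[phenotype] += 1
    return {p: Fraction(c, n_combinations) for p, c in sorted(counts.items())}


# One locus: each parental phenotype is expected in half of the lines.
monogenic = ril_phenotype_frequencies({"locus1": (0, 1)})

# Two additive loci: an intermediate class appears in half of the lines.
two_loci = ril_phenotype_frequencies({"locus1": (0, 1), "locus2": (0, 1)})

print(monogenic)
print(two_loci)
```

Under these assumptions, a departure from the half-and-half expectation, or the appearance of intermediate classes, is what points to more than one locus. Linkage, epistasis or selection during the construction of the lines would change the proportions.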

Figure 6. Recombinant inbred lines and near isogenic lines. A. A set of n Recombinant Inbred Lines (RIL) is produced by picking n individuals in the F2 generation of a cross between two wild isolates, and letting them self for 10–15 generations until
the resulting strains are isogenic. RILs have been built between the reference strain N2 and a number of other wild isolates,
including Bergerac (Johnson and Wood, 1982; Ebert et al., 1993; van Swinderen et al., 1997), CB4856 (S. Wicks and R. Plasterk; M. Hammarlund & E. Jorgensen, pers. comm.), DR1350 (Viney et al., 2003), CB4857 (Bonnaud et al., 2002), etc. B. Frequency distribution of male ray 3 position (black bars) and ray 3-4 fusion (red bars) among Recombinant Inbred
Lines of the C. briggsae AF16 and HK104 strains (Baird et al., 2005). The phenotypes of the parental lines are indicated by arrows. The phenotypic variation is due to variation at several loci,
with transgressive segregation. In addition, the merely partial correlation of the two phenotypes among the different lines
(not shown) shows that they can evolve independently. C. A Near Isogenic Line is obtained by moving a chromosomal segment from
one genetic background to another by repeated crosses, which can be assisted by flanking markers. For example, a Plugging
plg-1(e2001)III strain in the N2 background was built in this manner (Hodgkin and Doniach, 1997).
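
Why 10–15 generations of selfing suffice to make the lines effectively isogenic can be seen from a short calculation, sketched here under simplifying assumptions: the two parental isolates are fully homozygous, the locus is neutral, and the only process at work is the halving of heterozygosity at each generation of selfing. The function name `residual_heterozygosity` is hypothetical.

```python
from fractions import Fraction


def residual_heterozygosity(generations_of_selfing):
    """Probability that a given locus is still heterozygous in a line.

    Assumes a neutral locus that differs between two homozygous parents.
    An F2 individual is heterozygous at such a locus with probability
    1/2, and each further generation of selfing halves that probability.
    """
    return Fraction(1, 2) ** (generations_of_selfing + 1)


for g in (0, 10, 15):
    h = residual_heterozygosity(g)
    print(f"{g:2d} generations of selfing: {h} (about {float(h):.6f})")
```

With these assumptions the expected heterozygosity per locus falls from 1/2 in the picked F2 individual to 1/2048 after 10 generations of selfing and 1/65536 after 15.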

Even if two wild isolates display a similar phenotype, the phenotype of some RI lines may lie outside the range of the phenotypic
distribution of both parents. This so-called transgressive segregation indicates that alleles from both parents at different
loci act in the same direction on the trait when combined (van Swinderen et al., 1997; Delattre and Félix, 2001) and thus indicates evolution in the molecular mechanisms that underlie the phenotype. For example, if the two parental isolates
have similar phenotypes, some of the lines may show a distinct phenotype, thus revealing cryptic genetic variation.
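
Transgressive segregation can be illustrated with a toy additive model. The two loci and their effect values below are hypothetical: each parent is assumed to carry one increasing and one decreasing allele, so that the parents have the same phenotype although their genotypes differ.

```python
from itertools import product

# Hypothetical additive effects per locus: (parent-A allele, parent-B allele).
# Parent A carries the increasing allele at locus1 and the decreasing
# allele at locus2; parent B carries the opposite combination.
EFFECTS = {"locus1": (+1, -1), "locus2": (-1, +1)}


def phenotype(genotype):
    """Sum of allelic effects for a homozygous genotype.

    `genotype` holds, for each locus in EFFECTS, 0 for the parent-A
    allele or 1 for the parent-B allele.
    """
    return sum(EFFECTS[locus][allele]
               for locus, allele in zip(EFFECTS, genotype))


parent_a = phenotype((0, 0))
parent_b = phenotype((1, 1))

# All homozygous allele combinations that RI lines can fix.
ril_phenotypes = sorted(phenotype(g)
                        for g in product((0, 1), repeat=len(EFFECTS)))

# Lines whose phenotype lies outside the parental range are transgressive.
low, high = min(parent_a, parent_b), max(parent_a, parent_b)
transgressive = [p for p in ril_phenotypes if p < low or p > high]

print(parent_a, parent_b, ril_phenotypes, transgressive)
```

In this sketch both parents score 0, while the recombinant lines that combine the two increasing alleles or the two decreasing alleles fall outside the parental range. The variation between the parents is cryptic in this sense: it becomes visible only once the loci are recombined.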

The genetic structure of populations (see above) and the genetic architecture of a polymorphic trait bear on how a polymorphism
appears, spreads and is maintained in wild populations. A polymorphism found in extant populations may be a transient quasi-neutral
polymorphism (which may be maintained for a long time because of population isolation), a recent and currently spreading variant,
or a polymorphism that is maintained because of selective advantages of both forms in specific ecological conditions.

*Edited by David H.A. Fitch. Last revised May 24, 2005. Published December 26, 2005. This chapter should be cited as: Barrière,
A. and Félix, M.-A. Natural variation and population genetics of Caenorhabditis elegans (December 26, 2005), WormBook, ed. The C. elegans Research Community, WormBook, doi/10.1895/wormbook.1.43.1, http://www.wormbook.org.