Haldane's Dilemma and the substitution cost are a somewhat complicated
topics for which very, very few people in the general population have any
understanding. Although I have presented Haldane's mathematical description
of the substitution cost elsewhere, I wanted to provide a gentler introduction
to the concept that doesn't rely on the mathematics. Perhaps this introduction
to the substitution cost will make it easier to follow the more technical
descriptions of the cost.

What follows immediately is an outline six main points that I think
describe the substitution cost in (mostly) non mathematical terms. If you
understand these six points, I think you will have a pretty good idea of
what the cost is (at least as defined by Haldane - others have their own
definitions of the cost that depart somewhat from Haldane). Following the
outline, I describe each of the six points in detail and with examples
that I hope will make the points clear to anyone who is interested. Finally,
I will provide a light introduction to the mathematics of actually calculating
the substitution cost. This will only be an introduction to the math -
there will then be a link to the detailed mathematics for the calculation
of the substitution cost.

Outline of the Substitution Cost Definition

1. Start with a population of animals.
2. Detrimental change in the environment occurs.
3. A few individuals have a genetic variation that allows them to "overcome"
the detrimental change in the environment.
4. The genetic variation will spread through the population as more
of the offspring that posses the favorable genetic variation will survive
to adulthood each generation.
5. But in the meantime (while the favorable genetic variation is spreading),
a number of deaths will occur solely because of the change in the environment.
6. These deaths - that would not have occurred if the environment had
not changed - are what Haldane called the substitution cost.

Substitution Cost Definition

1. Start with a population of animals.Start with a population of a single species of animals.

2. Detrimental change in the environment occurs.A change in the environment occurs - perhaps there is a longer term
drop in rain fall or a lowering of temperature, a new predator begins to
prey on the animal, or the animal's food supply is reduced for some reason.
The important thing is that the change hurts the vast majority of the animals
in the population.

3. A few individuals have a genetic variation that allows them to
"overcome" the detrimental change in the environment.However, in this population of animals, there is a very small number
that have a genetic variation that allows its carriers to overcome the
disadvantage caused by the change in the environment. As an example, if
the detrimental environmental change were a decrease in the local temperature,
a genetic variation that led to its carriers having longer fur for better
protection against the cold would overcome the problems caused by the environmental
change. Similarly, if the detrimental change were the introduction of a
new predator, genetic variations that allowed the animal to see better
(to flee from the predator sooner), run faster, or have better camouflage
would once again help the animal overcome the bad effects of the environmental
change. Before the environmental change, the genetic variation was quite
rare in the population because it did not give any advantage or even may
have been a disadvantage. (Long fur that protects the animal after the
temperature drops was of no use when the temperature was warmer and in
fact may have been a disadvantage.) Even if the genetic variation was disadvantageous
to the animal (before the environmental change), it would have been kept
in place by mutations. Returning to our long fur example, in a warm environment,
short fur might be advantageous, but if there is a simple mutation that
produces long fur, there will always be a small number of animals in the
population that have the long fur because the mutation that causes long
fur will occur every once in a while (over many generations). The important
point is that there will be a few individuals in the population that will
have genetic variations that will protect the very few animals that have
them from the environmental change that hurts the animals in the population
that do not have that genetic variation.

4. The genetic variation will spread through the population as more
of the offspring that posses the favorable genetic variation will survive
to adulthood each generation.Now that the genetic variation is favorable (because of the environmental
change), it should begin to increase in the population with each generation.
This is because animals that have the variation will be more likely to
survive and pass the trait on to their offspring. With each generation,
the trait will become more common as more and more of the animals in the
population inherit the genetic variation from their parents.

5. But in the meantime (while the favorable genetic variation is
spreading), a number of deaths will occur solely because of the change
in the environment.Although animals that have the favored genetic variation will not be
harmed by the change in the environment, the vast majority of animals in
the population will not have this advantage and will be harmed. So, while
the variation increases a little bit each generation, in the meantime,
some of the animals that don't have the genetic variation will die each
generation. These deaths can be thought of as "extra" deaths - they are
deaths that would not have occurred if the environment had not changed
for the worse. Haldane showed that these deaths would be surprisingly high
if the favorable genetic variation increased somewhat slowly - typically
ten times the population size or more.

6. These deaths - that would not have occurred if the environment
had not changed - are what Haldane called the substitution cost.These deaths due to the environmental change are what Haldane called
the substitution cost. See quotes
from Haldane from "The Cost of Natural Selection" that show that this
was Haldane's definition of the substitution cost.

An Introduction to the Mathematics of the Substitution Cost

For the sake of simplicity, I will use a haploid organism for my mathematical
example. The concepts (definition of the substitution cost) are the same
as for the diploid case, but the calculations are simplified because there
is only a single inherited copy of each gene in the haploid case. Once
the substitution cost is understood for the haploid case, it should be
easier to follow the complications introduced in the diploid case.

All of this refers to the scenario used to define the cost above. The
genetic trait that is made favorable by the change in the environment (that
is unfavorable to all animals that do not have that trait) has a frequency
in the population. The frequency of the favorable trait in any generation
is p, and the frequency of the non favored trait is q. Since
we are considering only two traits (the favored trait and the now disfavored
trait), the sum of p and q will be 1 in each generation - that is, p +
q = 1 will always be true. As an example, if the population is 10,000,
and if there is only one individual in the population with the favored
trait (and thus 9,999 individuals that have the non favored trait), then
p = 1/10,000 = 0.0001; and q = 9,9999/10,000 = 0.9999.

The way Haldane used math to describe the increase in frequency of the
favored trait was to assign different "fitnesses" to the favored and non
favored traits. The fitness is simply the increase (or decrease) in the
frequency of each trait after selection each generation. For Haldane, the
actual values of the fitnesses didn't matter so much as the relative values
of the fitnesses for the two traits. So what he did was arbitrarily assign
a fitness of 1 to the favored trait (carriers of the favored trait, actually),
and a fitness of 1 - s to carriers of the disfavored trait. The variable
s is called the selection coefficient, and is a small, positive number,
typically less than 0.1. (In fact, Haldane indicated that the selection
coefficient should be less than 0.1 for his substitution cost calculations
to work out properly. Selection costs considerably higher than 0.1 have
been observed in nature, but it is not known if these are frequent enough
to have any real impact on the substitution cost in many cases.) Therefor,
a selection coefficient of s = 0.1 would lead to carriers of the favored
trait having a relative fitness of 1 and carriers of the disfavored trait
having a relative fitness of 1 - s = 0.9. These relative fitnesses can
then be used to predict the change of frequency of the favored and non
favored traits after a generation of selection.

At the beginning of the generation, the frequencies of the two traits
will be:

p (favored trait) and q (disfavored trait).

After selection, the frequencies of the favored and disfavored traits
will change to
p (because carriers of the favored trait had a relative fitness of
1) and (1 -s)q (because carriers of the disfavored trait had a relative
fitness of 1 - s).

After selection, there will be p + (1 - s)q multiplied by the population
size (the population size is often referred to by the mathematical symbol
N)
individuals in the population. Let's see what happens after a single round
of selection on our population from the starting conditions (p = 0.0001
and q = 0.9999, s = 0.1, and N = 10,000).

After selection, there will be [p + (1 - s)q]*N individuals in the population.
The new value of p in the next generation (because the frequency of the
favored genetic trait has increased due to natural selection) will be p'
= p/[p + (1 - s)q] = 0.0001/[0.0001 + 0.9*0.9999] = 0.0001111. The new
value of q (q') will be q' = (1 -s)q/[p + (1 - s)q] = 0.9*0.9999/[0.0001
+ 0.9*0.9999] = 0.9998889. In the next generation, p' will become the new
value of p and q' will become the new value of q. This process can be continued,
with p (the frequency of the favored trait) increasing a little each generation,
and q decreasing a little each generation, until eventually the favored
trait is "fixed" in the population (almost all of the animals in the population
carry the favored trait p = 1.0 ) and none of the animals carry the disfavored
trait.

Now, the thing that interested Haldane was the deaths due to selection
each generation. If the environment had not deteriorated, then the relative
fitness of the individuals carrying the trait that becomes disfavored (when
the environment does eventually change) would have a fitness of 1.0. Thus,
there is a decrease in fitness of those individuals carrying the disfavored
trait of 1 - ( 1 - s) = s. Haldane reasoned that this means that each generation
of selection lead to the loss of an additional s*q individuals (as a fraction
of the population size - use s*q*N if you want the actual number of individuals)
because of selection. This is what Haldane called the substitution cost,
these extra deaths due to natural selection. The number he was most interested
in was the sum of these selection costs over all generations of the selection.
These deaths are what added up to the high selection costs Haldane reported.
For the example we are looking at here (N = 10,000, initial p = 0.0001,
and initial q = 0.9999), Haldane calculated that the substitution cost
would be 9.21. This means that 9.21 times the population size animals would
have to die in order for the substitution to occur - that's 92,100 individuals
for our example population size of 10,000 individuals!

Hopefully the discussion above will make these detailed mathematical
descriptions of the substitution cost a little easier to follow.