Avoiding Mistakes in Population
Modeling:Variability (Stochasticity)

Although deterministic measures such as predicted population size and
population growth rate may be useful under certain circumstances, variability
is such a pervasive feature of natural populations that ignoring it almost
always leads to an incomplete picture of the state of the population and an
incorrect prediction of its future.

Deterministic models make point (single-value) estimates of variables
such as future population size, based on point estimates (e.g., average values)
of parameters such as survival rates, fecundities, etc. In natural populations,
these parameters show variation in time and space, and it is not the average
values of these parameters that put the populations at risk; it is the lower
values they take (or the tails of their distributions). Therefore, ignoring
variability often gives a misleading, optimistic picture of a population's
viability.

In RAMAS Metapop, natural variability in parameter values (environmental
stochasticity) is modeled by random fluctuations in age or stage-specific
fecundities and survivor rates, carrying capacities, and dispersal rates, as
well as two types of catastrophes. In addition,
demographic stochasticity and observation error can be
simulated (see the Stochasticity dialog under the Model menu).

Demographic stochasticity is the sampling variation in the number of
survivors and the number of offspring that occurs (even if survival rates and
fecundities were constant) because a population is made up of a finite, integer
number of individuals. In RAMAS Metapop, demographic stochasticity is modeled
by sampling the number of survivors from a binomial distribution and the number
of offspring from a Poisson distribution (Akçakaya
1991). Relative to other factors, demographic stochasticity becomes more
important at small population sizes.

Demographic stochasticity option should be used for all models, unless
you are modeling densities (such as number of animals per km2)
instead of absolute numbers of individuals (however, in this case, note the
fact that the program always models abundance as an integer). It is especially
important to model demographic stochasticity when modeling impacts of habitat
fragmentation. A model that ignores demographic stochasticity will
underestimate the increased extinction risk when a population is fragmented
into several smaller populations.

If the standard deviation estimates you are using incorporate the
effects of demographic variability, it is more appropriate to estimate standard
deviation without the contribution by demographic stochasticity (see
below), than to exclude demographic stochasticity from
your model. This is because if the population size decreases in the future, the
component of observed variance due to demographic stochasticity in your data
will underestimate the variance due to demographic stochasticity in these lower
population sizes, thus your model will underestimate the risk of decline or
extinction of the population.

If the standard deviations are high, the results may be biased because
of truncation. In this case, selecting a lognormal
distribution instead of a normal distribution may be helpful. Lognormal
distribution is recommended if (i) any survival rate or fecundity has a small
average value with a large standard deviation, (ii) any survival rate has a
high average value with a large standard deviation. For a more detailed
discussion, see the help file and the manual.

Correlations among vital rates (elements of the stage matrix) increases
the variability of the population size, and hence increases the risks of
decline or increase. Thus, when correlations are not known, assuming full
correlation rather than independence gives results that are more conservative
(precautionary). Note that the correlation discussed here is the correlation
(across years) among the underlying vital rates, for example among the survival
rates (or fecundities) of different age classes; it is not the
correlation in the observed number or proportion of survivors (or offspring, in
the case of fecundities). The distinction relates to demographic stochasticity:
Even if the underlying survival rates have high correlation across years (e.g.,
between one-year old and two-year old individuals), the observed proportion of
survivors may have a low correlation, because the observed proportion will
include sampling variation (and demographic stochasticity), which by definition
is independent between age classes. Thus, the observed correlation may
underestimate the actual correlation if the sample size (or abundance in the
observed age classes) is low. (RAMAS Metapop assumes full correlation between
survival rates of different stages, and between fecundities of different
stages. By default, the program sets full correlation between survival rates,
fecundities and carrying capacities, which can be changed by the user).

Models with environmental stochasticity should have a sufficient number
of replications to adequately simulate the tails of the distributions of random
variables, and to estimate risks with sufficient precision. Use the maximum
number of replications (10000) unless you are running a test simulation or
making a demonstration. With 1000 replications, the risk curves have 95%
confidence interval of about ±0.03 (see the figure and discussion in
Chapter 11 of the manual on confidence intervals for risk curves).

However, one should also be careful not to be too confident because
there are a large number of replications. The confidence limits are purely
statistical; they do not tell anything about the reliability of your model. If
the model has large uncertainties, the risk results will
not be reliable, regardless of the number replications.

In some cases, the observed variability in a
population is part of a regular oscillation, or other fluctuation that has
periods longer than the time step of the model. Examples include multi-annual
predator-prey cycles, and long-term oscillations in demographic parameters
caused by global cycles such as ENSO. In many cases, annual fluctuations would
be superimposed on such cycles in demographic rates (see the figure on the
right for an example). When observed variability in a demographic rate includes
a long-term (multi-annual) cycle, using the total observed variance as the
variance of a short-term (annual) fluctuations in a model could give biased
results. Note that the cycles and fluctuations we are interested in here are in
vital rates (survival and fecundity); they are not cycles or
fluctuations in population size or density. This is an important distinction
(see below).

The correct way of incorporating an observed pattern of random
fluctuations superimposed on longer-term cycles is to model the long-term cycle
as a temporal trend in vital rate (using .SCH and .FCH files in the Populations
dialog), and to use the remaining (residual) variance to model short-term
(annual) fluctuations (environmental stochasticity as modeled in the "Standard
deviations" dialog). For an example of this approach used to model
predator-induced cyclicity, see
Akçakaya
et al. (2004b). See the program manual and help files for details about
temporal trends files.

This issue is related to the temporal autocorrelation in survival and
fecundity, i.e., whether deviations in a vital rate are correlated in time
(e.g., "bad" years more likely to be followed by "bad" years). Population
sizes are often temporally autocorrelated, because population size at any
time step is (at least in part) a function of the population size in the
previous time step(s). However, this does not necessarily mean that vital
rates are temporally autocorrelated. In an exponential
(density-independent) model, temporally uncorrelated vital rates ("white
noise") will result in a "random walk" pattern of temporally autocorrelated
population sizes, associated with spectral reddening. Most natural populations
show dynamics "halfway" between white noise and random walk (in population
size, not vital rates), a pattern that can be explained by white-noise
(uncorrelated) environmental fluctuations and a combination of weak to no
density dependence, age structure, and/or measurement (observation) error (see
Akçakaya
et al. 2003a). For these reasons, in RAMAS Metapop, temporal
autocorrelation is not explicitly modeled. However, autocorrelated
environmental fluctuations can be added as discussed above, by using temporal
trend files in addition to environmental stochasticity in the model.

Catastrophes are infrequent events that cause large reductions (or
increases) in population parameters. (Events that increase vital rates are
called "bonanzas", but both catastrophes and bonanzas can be modeled as
Catastrophes in RAMAS Metapop, using different "multipliers".) Some models
include catastrophes that would be better modeled as part of normal
environmental variability. Many factors that are modeled as catastrophes (such
as El Nino, fire, drought) form a continuum, and can be modeled either as
catastrophes or as part of normal environmental variability (perhaps
overlaid on top of a temporal trend), depending on
temporal and spatial scales. The best (perhaps the only) way to determine
whether to model such a factor as a catastrophe or as part of annual
fluctuations is to check the distribution of vital rates (or other population
parameter affected by the factor). If the distribution is bimodal (with some
years having distinctly different values), then adding the factor as a
catastrophe is justified (for example, see Figure 3 in
Akçakaya et al.
2003b).

If a model includes a catastrophe (e.g., a hurricane that lowers annual
survival rate), then the estimates of the mean and standard deviation of the
affected demographic rates (e.g., survival rate) should exclude the rates
observed during catastrophe years (e.g., see
Akçakaya
& Atwood 1997). Otherwise, the model would include the effects of the
catastrophe twice, overestimating its impact.

Some catastrophes have delayed effects. For example, the effects of a
fire that reduces the carrying capacity by destroying part of the available
habitat will last for several years until the habitat is fully restored. During
this time the carrying capacity may be gradually increasing to its pre-fire
level. Similarly, the effects of a toxic spill may last for several years,
during which average survival rates may be gradually increasing to their
normal level, even as they fluctuate from year to year. In RAMAS Metapop, such
effects can be modeled by a combination of catastrophes and temporal trends in
average population parameters (such as carrying capacities or survival rate).
In modeling such effects, it is important to correctly specify "Time since last
catastrophe" and "Reset vital rates" parameters. See the help file or the
manual for more details.

The standard deviation parameters to model environmental stochasticity
should be based on observed temporal variance. Variance components due to
sampling and demographic stochasticity must be subtracted from total observed
variance. Otherwise, standard deviations may be overestimated, which may cause
overestimated risks, as well as truncation (and,
consequently, bias) in vital rates.

If survival or fecundity estimates are based on age-structured census
data, subtract the expected demographic variance from the observed variance (Akçakaya
2002). The example used in this paper is worked out in
this
Excel file.

If the data are from a census of the total (or part of the)
population, use methods discussed by Holmes (2001,
2004).

If you have repeated or parallel estimates of the same population,
the covariance of the parallel measurements or the within-year variance
estimates can be used to subtract observation error from total observed
variance (see
Dunning
et al. 2002; and Morris & Doak 2002, Chapter
5).

The standard deviation parameters to model environmental stochasticity
should be based on the temporal variation in these parameters. When such
data are lacking, some modelers use estimates of spatial variation (or even
measurement error or sampling variation) from a single time step. There is no
reason to expect that spatial variance (or sampling variation) should be
similar to temporal variation.

Another mistake is to base the estimates on the standard error of mean,
rather than on standard deviation of the series of estimates of the vital rate.

When "Pool variance for survivals" option is selected in RAMAS Metapop
(in the "Advanced stochasticity settings"), the standard deviations must be
estimated in a specific way (see the help file).

If you are estimating fecundities as a product (e.g.,
maternity multiplied by zero-year-old survival
rate), remember that variance of the product of two random numbers is a
function of their means, variances and covariance (see
Akçakaya
& Raphael 1998 for an example).

Truncation of sampled values may introduce a bias, so that the realized
means may be different from the average value used as the model parameter.
(Note any error messages that the program
displays while running a simulation.) The most common causes of truncation (and
suggestions for correcting them) are listed below.

The standard deviations and means you entered are not based on data.
If you guessed the values of stage matrix and standard deviation matrix, you
may have overestimated them. The means and standard deviations should be based
on observed survival rates.

The standard deviations are large because the distribution of the
survival rates is bimodal. This may happen, for example, if males and females
in the same age class or stage have different average survival rates. In this
case, it may be better to include separate stages for males and females, or to
model only females.

Bimodality may also result from catastrophes, if the survival rates
in years with catastrophes are very different than those without catastrophes,
and they are combined in a single distribution. In this case, it may be better
to add catastrophes to the model explicitly, and separate their effects from
normal year-to-year variability.

The standard deviations are large because they include variability
due to sampling and measurement error and demographic stochasticity, and
spatial variability. In this case, the standard deviations estimates should
exclude these components (see above). Also, the "Pooled
variance for survivals" option (in "Advanced Stochasticity Settings") will
reduce truncations.

The distribution for environmental variation is Normal when there are
survival rates close to 0 or 1. Use lognormal distribution
instead.

The population you are modeling does not fit the assumption of the
program that all survival rates within a population are perfectly correlated.
The "Negative correlation for largest survival" option (in "Advanced
Stochasticity Settings") provides an alternative assumption.

As discussed elsewhere, constraints must
be imposed when sampling vital rates, especially survival rates. (RAMAS Metapop
does this automatically, as long as the "Constraints Matrix" is properly
specified.)

Even when constraints are imposed, demographic stochasticity may make
the number of individuals surviving from a given stage larger than the number
in the stage in the previous time step, thus creating "phantom" individuals.
This happens if there are two or more survival rates (to different stages) from
a given stage, the total survival rate is close to 1.0, and the number of
individuals is small. This is automatically corrected by RAMAS Metapop, but if
you are using another program, make sure a similar correction is made.

The effect of uncertainties in the model structure and parameters on
model results get compounded in time. In other words, the range of predicted
outcomes expands with time, so that for long time horizons (simulation
durations), the results may become too uncertain to be of any use. Long time
horizons also make it difficult to justify model assumptions (e.g., that the
vital rates will fluctuate around the same average values with the same
variability as was observed).

In contrast a simulation time horizon that is too short may yield
results that are not relevant or interesting. For example, when simulation time
horizon is less than the generation time of the species, the risk of extinction
or substantial decline may be very low, but this low risk may not be relevant
to the management question at hand.

The appropriate duration depends on the biology of the species
(especially its generation time), the amount of data (especially the length of
the time period from which data are available), and the question addressed
(especially whether an absolute or relative
prediction is required).

Confusion about the terminology related to uncertainty and variability
contributes to some modeling mistakes. As modeled in RAMAS Metapop,
natural variability results from temporal and spatial
environmental fluctuations, catastrophes, and demographic stochasticity.
Natural variability can be modeled and translated into risk (probability)
estimates using a stochastic model.

Model or parameter uncertainty results from
measurement error, lack of knowledge, and model misspecification and biases. It
determines model accuracy, and its effects on the uncertainty of model results
increases with time. In principle, this kind of uncertainty can be reduced by
collecting additional data. This type of uncertainty should be incorporated in
the form of ranges (lower and upper bounds) of each model parameter. For more
information, see the "Measurement error and Uncertainty" topic in the RAMAS
Metapop help file or manual.