This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

Tetraena mongolica (Zygophyllaceae), an endangered endemic species in western Inner Mongolia, China.
For endemic species with a limited geographical range and declining populations, historical
patterns of demography and hierarchical genetic structure are important for determining
population structure, and also provide information for developing effective and sustainable
management plans. In this study, we assess genetic variation, population structure,
and phylogeography of T. mongolica from eight populations. Furthermore, we evaluate the conservation and management units
to provide the information for conservation.

Results

Sequence variation and spatial apportionment of the atpB-rbcL noncoding spacer region of the chloroplast DNA were used to reconstruct the phylogeography
of T. mongolica. A total of 880 bp was sequenced from eight extant populations throughout the whole
range of its distribution. At the cpDNA locus, high levels of genetic differentiation
among populations and low levels of genetic variation within populations were detected,
indicating that most seed dispersal was restricted within populations.

Conclusions

Demographic fluctuations, which led to random losses of genetic polymorphisms from
populations, due to frequent flooding of the Yellow River and human disturbance were
indicated by the analysis of BEAST skyline plot. Nested clade analysis revealed that
restricted gene flow with isolation by distance plus occasional long distance dispersal
is the main evolutionary factor affecting the phylogeography and population structure
of T. mongolica. For setting a conservation management plan, each population of T. mongolica should be recognized as a conservation unit.

Background

Genetic variation within and among natural populations is crucial for the long-term
survival of a species. An accurate estimate of the level and distribution of genetic
diversity of threatened species provides fundamental information in designing conservation
programs [1,2]. Tetraena mongolica Maxim, a monotypic genus of the Zygophyllaceae, is endemic to the western part of
Inner Mongolia around the basin of the Yellow River [3], and is also subjected as nationally endangered in China [4]. Plants of T. mongolica, up to 0.5 m in height, flower from mid-May till early June, and set fruits in July.
The species is restrictedly distributed in the western Gobi, the largest desert in
Asia characterized by extreme low annual rainfall [3], where T. mongolica with a fully developed root system is well adapted and becomes locally dominant. T. mongolica plays an important ecological role as windbreak for stabilizing river bank [5]. Nevertheless, it has been used as firewood, locally called as "oil firewood" because
its stems are combustible even in fresh state due to containing high levels of triacylglycerol
[6]. Human's overexploitation has inevitably caused a dramatic decline of the species.

Previous studies have been focusing on the biological characters causing the population
decline of T. mongolica. As the high rate of ovule abortion after anthesis [7], the seed-set of T. mongolica was quite low (1.3 - 2.8%) in the natural populations [8,9]. Previous population genetic researches based on allozyme and ISSR data revealed
medium levels of genetic differentiation among populations of T. mongolica [3,10].

Understanding levels and spatial partitioning of genetic polymorphisms in an endangered
species provides sufficient information for conservation practices. This kind of researches
has become increasingly popular in the recent years, with the development of analytical
methods to take phylogenetic distinctiveness into account when setting conservation
priorities [11,12]. During the past few decades, the theoretical framework of population genetics and
empirical data gathered with the help of molecular genetic methods have been widely
used in conservation biology [13]. Given a haploid nature and a low frequency of genetic recombination, molecular markers
of organelle DNA have been long used for phylogenetic reconstruction at various taxonomic
levels, conservation genetics, and assessing the migratory routes of species [14,15]. Although chloroplast DNA evolves relatively slowly, moderate to high levels of genetic
variation have frequently been detected in some noncoding spacers within and among
species [16-18]. With maternal inheritance [19], cpDNA is suitable for investigating processes associated with seed dispersal, such
as range expansions [20] and the contribution of seed movement to total gene flow [21,22].

For endemic species with a limited geographical range and declining populations, historical
patterns of demography and hierarchical genetic structure are important for determining
population structure, and also provide information for developing effective and sustainable
management plans [23]. In this study, we investigated genetic variation, population structure, and phylogeography
of T. mongolica from eight populations throughout the entire distribution range. Several aims are
pursued: 1) to examine the levels of genetic variation within and between populations,
2) to reconstruct phylogeographical patterns and examine the extent of genetic differentiation
among populations, and 3) to identify the conservation and management units based
on genetic evidence, to provide the information for the development of effective and
efficient conservation practices for this species.

Results

Genetic diversity and cpDNA phylogeny of T. mongolica

No within-individual variation was detected in the non-coding spacer between atpB and rbcL genes of the chloroplast DNA. Identical sequences were obtained from five clones
derived from the same amplification reaction, indicating no PCR artifacts caused by
Taq polymerase or sequencing errors. The atpB-rbcL intergenic region of cpDNA in T. mongolica varied from 872 to 880 base pairs (bp) in length. The cpDNA sequences were aligned
with a consensus length of 881 bp, of which 46 sites (5.2%) were variable.

The chloroplast spacer is A/T rich with an average content of 73.6%, which is consistent
with the nucleotide composition of most noncoding spacers and pseudogenes because
of low functional constraints [24]. In total, 38 haplotypes (GenBank accession numbers of HQ142910-HQ142986) were identified from 77 individuals of T. mongolica, with an estimated haplotype diversity of h = 0.962 ± 0.009 (Table 1; Figure 1). Except for the monomorphic population of YKBLG, haplotype diversity varied across
populations, ranging between 0.378 (GLS) and 1.000 (HN). Low levels of nucleotide
difference were detected within the whole species (θ = 0.00447 ± 0.0003) and within populations, ranging from θ = 0.00099 (TST) to 0.00405 (HN).

Table 1. Population locations, numbers of sample size and site coordinate of Tetraena mongolica,
the estimates of haplotype diversity (h) and nucleotide difference (θ) within populations
on cpDNA sequences.

A neighbor-joining tree obtained using MEGA recovered eight cpDNA clades (Figure 2). Most of the populations contained only one clade in the genetic composition, except
for the populations XD and HN. Apparently, most genetic variation resides between
populations of T. mongolica.

Figure 2.Neighbor-joining tree of Tetraena mongolica based on sequences of the atpB-rbcL intergenic spacer of cpDNA. Numbers at nodes are bootstrap values. See Table 1 for the acronyms of population
names.

Nested clade analysis, phylogeography and population differentiation

A nested clade analysis was accomplished by linking nucleotide haplotypes in a hierarchical
manner based on mutational changes. After linking the haplotypes into a clade, closely
related clades were linked further to form a higher-level group; via such hierarchical
linking, a nested network was drawn (Figure 3). In total, 38 haplotypes (H1-H38) and 8 clades, 1-1 to 1-8, were identified. The
distribution of haplotypes and clades in populations was indicated in Tables 2 and 3. Clades 1-1 to 1-3 were clustered into clade 2-1, which was further grouped with
clade 2-2 into a higher-level clade 3-1. Likewise, clades 2-3 and 2-4, which consisted
of clades 1-4 to 1-5 and 1-6 to 1-7, respectively, were clustered and nested with
clade 2-5 in the higher-level clade 3-2. Clades 3-1 and 3-2 corresponding to clades
G-H and A-E of the NJ tree, respectively, were identified. The former is distributed
in the southern part of the distributional range, while the latter is in the northern
part. The topology showing geographical division is shared by the minimum spanning
network and the NJ tree. Most populations were genetically differentiated, as most
haplotypes were private within populations except for H11, which were shared by populations
TST and HN (Table 2).

Table 2. Individual number and geographical distribution of haplotypes in populations of Tetraena
mongolica based on cpDNA data.

Figure 3.Minimum-spanning network based on mutations between haplotypes of the atpB-rbcL noncoding spacer of cpDNA of Tetraena mongolica.

A nested contingency analysis detected significant geographical associations within
several clades (2-1, 2-3, 2-4, 3-1 and 3-2) and the whole cladogram. The phylogeographical
inferences are listed in Figure 4. Most tip-clades were restricted to unique regions, whereas basal interior clade
1-4 was widespread (Table 3, Figure 3). The results agree with the hypothesis of constrained seed dispersal of the species.
The deduced Nm of 0.04-0.71 and FST of 0.38-0.90 indicated high levels of genetic differentiation between all populations,
with three exceptional pairs of HN-TST, HN-XD and XD-TST (Table 4). An "isolation by distance" model across eight populations of the species was supported
by a regression test between Nm values and geographical distance (R = 0.772, P < 0.05).

Table 3. Number of individuals and geographical distribution of clades in populations of Tetraena
mongolica based on cpDNA data.

Figure 4.Result of the nested clade analysis. Clade (Dc) and nesting clade (Dn) distances are given for each level of the nesting design. Superscripts refer to significantly small (S) or large (L) clade and nested clade distances. Inferences of current population structure and
population history based on nested clade analysis and the interpretation key given
in Templeton et al. (1995) are indicated at the bottom of the figure.

Relative values of Dc and Dn for each clade representing contemporary distributions of haplotypes were used to
interpret historical and contemporary gene flow processes (Figure 4). Restricted gene flow with isolation by distance was the primary process responsible
for the present-day distribution in Inner Mongolia (total cladogram), also inferred
for clade 2-3. While some other different inferences, like long distance colonization
and past fragmentation were detected at clade clades 3-1 and 3-2, respectively. This
result shows limited seed dispersal of this species, while with occasional long distance
dispersal [25].

Population demography pattern of T. mongolica

Historical population dynamics of T. mongolica was estimated using Bayesian skyline plots, a coalescent Markov chain Monte Carlo
method that does not require an assumed parametric model of demographic history. The
skyline diagrams, which summarize instantaneous estimates of effective population
size, showed recent population decline for T. mongolica over the last sixteen thousand years (Figure 5). The shape of the skyline diagram conforms to its known history. That is, habitat
destruction and misapplication of human activity and the frequent flooding of the
Yellow River may have caused a decline in T. mongolica population size.

Discussion

Genetic variation of the atpB-rbcL noncoding spacer of cpDNA in T. mongolica

In this study, we investigated the phylogeographical pattern and population structure
of endangered T. mongolica in western Inner Mongolia. In total, 38 haplotypes were detected at the cpDNA atpB-rbcL locus in T. mongolica. The level of genetic variation of cpDNA is comparable to that of other endangered
shrub plants, e.g., Dunnia sinensis (θ = 0.0022) [26], and Hygrophila pogoncalyx (θ = 0.00343) [17], but is lower compared to other endangered species, e.g., θ = 0.01018 for the cpDNA trnD-trnT spacer of Cunninghamia konishii [27], and θ = 0.01268 for the cpDNA atpB-rbcL spacer of Cycas taitungensis [28]. The twofold lower nucleotide diversity to above endangered species in T. mongolica may be ascribed to its extremely small effective population size associated with the
low seed set in wild (1.3 - 2.8%) [9].

Recent habitat loss has reduced the number and size of T. mongolica populations [10]. Small populations of narrowly distributed species are expected to exhibit low levels
of genetic variation, but high levels of genetic differentiation among populations,
which were all detected in this species (Table 4) [2]. Interestingly, different levels of genetic variation were detected in different
populations. The HN and XD populations possessed more haplotypes and higher genetic
diversity than others, whereas YKBLG population displayed genetic homogeneity (Table
1). The lack of genetic variability in some populations, e.g. SZS, MSG and TST, near
threefold lower in nucleotide diversity, was likely associated with frequently human
activities. In contrast, some populations of T. mongolica experienced relatively little disturbance due to low accessibilities [3,10].

Our previous study revealed medium levels of genetic differentiation among populations
of T. mongolica based on ISSR data [10]. In contrast, in cpDNA spacer higher genetic differentiation was detected between
populations than in ISSR fingerprinting. The difference may be highly associated with
the reproductive characteristics of the species. It has been known that gene flow
of seed plants occurs either via pollen prior to fertilization or seeds. In this study,
T. mongolica is primarily pollinated by insects [29]. Gene flow between populations via pollen would be limited by the migratory capacity
of pollinators. In addition, seed dispersal of seeds from schizocarp, a dry fruit
developing from four carpels, is constrained by gravity [29], likely resulting in most seed dispersal confined to short distances. With maternal
inheritance and haploid nature, chloroplast DNA is suitable for estimating the contribution
of seed movement to total gene flow [21], whereas, ISSRs represent nuclear DNA, mostly carried and dispersed by pollen dispersal
[30]. In this study, higher genetic differentiation between all populations in cpDNA than
in ISSR is likely ascribed to limited seed dispersal.

The BEAST skyline plot for cpDNA spacer identified a recent population decline ever
since sixteen thousand years before present likely associated with human destruction
as T. mongolica has long been used as firewood (Figure 5) [6]. Ecologically, this plant is still one of the dominant shrubs in Inner Mongolia.
Through the analysis of skyline plot, we were able to recover the history of a long
term human disturbance that caused a decline in population size of T. mongolica.

Another major factor that shaped the phylogeography and population demography is the
frequent flooding of the Yellow River, the second longest river in China [31]. The floods not only eroded river banks, but resulted in many habitats submerged,
inevitably leading to population extinction. In addition to bank erosion, the Yellow
River is well known for its heavy load of silt. Soil deposits elevate the riverbed
and cause flows between natural levees. The river may break out of the levees into
the surrounding lower flood plain and adopt a new route. Records indicate that the
events have occurred about once every century [32]. Such devastations caused dramatic changes of flora and fauna along the Yellow River.
Geological records indicate that the river's levees were breached more than 1,500
times and its course changed 26 times in the last 3,000 years [32]. Given such frequent flooding, T. mongolica would have experienced demographic fluctuations over and over. That is, severe periodical
population bottlenecks followed by subsequent demographic expansion would elevate
genetic drift effects and lead to a loss of genetic variation [33,34].

Phylogeography and conservation of T. mongolica

In this study, gene genealogy of cpDNA in T. mongolica was recovered (Figures 2 and 3). Eight cpDNA clades were identified in the NJ tree. Most of the populations contained
only one clade in the genetic composition, displaying a pattern of most genetic variation
residing between populations. Nested contingency analysis discriminating the geographical
associations of haplotypes and clades provides further insights into historical events
that shaped the phylogeography (Figure 4). At the total cladogram, restricted levels of Dc vs. a large Dn illustrates restricted gene flow with isolation by distance as the primary process
responsible for the present-day distribution of T. mongolica in Inner Mongolia. As cpDNA is maternally inherited, this inference indicates limited
seed dispersal. Besides, long distance colonization was also observed in clade 3-1,
a common phenomenon occurring over glacial maxima [25]. Furthermore, past fragmentation observed in clade 3-2 was likely associated with
the Yellow River flooding.

It is expected that endangered species that are narrowly distributed and own a small
population size would have high risks of extinction, especially when gene flow between
populations is restricted [1,35]. Another consequence of a small-sized population is the susceptibility to inbreeding,
which reduces heterozygosity and the performance of various fitness-related traits,
thereby substantially increasing the probability of extinction [36,37]. Given small sizes in the wild populations of T. mongolica, the maintenance of genetic diversity would be critical for the long-term survival
of species in considering conservation strategies [38]. Historical demographic events in a species play an important role in determining
the present-day geographic structure of intraspecific genetic variation [39]. In this study, given the high levels of population differentiation between populations
and low levels of genetic diversity within populations of T. mongolica, for retaining the existing diversity, reservation regions covering major populations
with high genetic variation should be established.

Habitat destruction and fragmentation would inevitably result in small and isolated
populations [40,41], and increase the risk of population extinction. Today, many natural habitats of
T. mongolica around the Yellow River have been destroyed or altered due to human overexploitation.
In order to develop an effective strategy to conserve species, the Evolutionarily
Significant Unit (ESU) needs to be defined. Various criteria for ESUs have been suggested,
including reciprocal monophyly [23], adaptive variation [42], and reproductive separation [43]. Recognizing ESUs as reciprocally monophyletic groups ensures that the entire evolutionary
heritage within species can be maintained and that populations belonging to different
lineages can be managed separately [44]. In T. mongolica, given high levels of genetic differentiation and reciprocal monophyly between most
populations and lower genetic variation within populations, each population should
be treated as different evolutionarily significant units for conservation (Figure
2).

From another view, conservation efforts can be targeted to genetic hot spots where
populations have high levels of genetic diversity [17,45]. Accordingly, HN and XD populations that owned higher genetic diversity near two-three
folds than other populations (Table 1) can be recognized as genetic hot spots of T. mongolica. The concept of genetic and ecological exchangeabilities is also central to the definitions
of ESU [46]. Crandall et al. [44] emphasize that the ESU concept not only includes ecological data and genetic variation,
but also considers the ecological and genetic exchangeabilities. In practice, the
status of recent and historical genetic and ecological exchangeability between populations
is considered. As the contemporary gene flow from the populations at genetic hot spots
to other populations does not exist or cannot be determined, the populations at the
genetic hot spots need to be treated as distinct conservation units for its unique
genetic variation [44]. Accordingly, hotspots at HN and XD of T. mongolica would represent different units for conservation, as seed dispersal between populations
is limited.

Conclusion

As expected, the rare species of T. mongolica possessed very low genetic variation at the cpDNA noncoding spacer. The effects of
random genetic drift would lead to the loss of genetic variability in a small-sized
population and genetic differentiation among populations, as observed in T. mongolica that has been mediated by limited seed dispersal and frequent population shrinking
due to the flooding of the Yellow River. Phylogeographical analyses suggest that the
present-day T. mongolica populations in Inner Mongolia were likely a result of restricted gene flow with isolation
by distance plus occasional long distance dispersal. Molecular data provide sufficient
information for reconstructing phylogeographical patterns and critical information
for setting conservation strategy. For conservation considerations we suggest that
populations with reciprocal monophyly at the cpDNA, and the two genetically highly
variable populations likely represent management units for protecting T. mongolica.

Methods

Population sampling

In this study, we investigated the genetic variation and structure in the rare and
endangered T. mongolica. This rare species is distributed in the western Inner Mongolia with small habitat
fragments. Samples were collected from all populations, avoiding clonal individuals.
A total of 77 individuals, representing eight extant populations, were sampled throughout
the species entire range (Figure 1, Table 1). Individuals were chosen randomly. Young healthy leaves were collected from the
field and were dried directly with silica gel. Dried leaves were preserved in silica
gel until DNA extraction.

DNA extraction and PCR

Leaves were powdered in liquid nitrogen and stored in a -70°C freezer. Genomic DNAs
were extracted from the powdered tissue following a CTAB procedure and gel-quantified
[47]. The intergenic spacer between the atpB and rbcL genes of the chloroplast DNA was amplified using a pair of universal primers [48]. Each 50 μL PCR contained: 10 ng template DNA, 5 μl 10 × PCR reaction buffer, 5 μL
MgCl2 (25 mM), 5 μl dNTP mix (8 mM), 10 pmole of each primer, and 4 U of Taq polymerase (Promega, Madison, USA). The reaction was programmed on an MJ Thermal Cycler
with first cycle of denaturation at 95°C for 2 min, then 30 cycles of denaturation
at 94°C for 45 sec, annealing at 48°C for 1 min, and extension at 72°C for 1 min 30
sec, followed by 72°C extension for 10 min and 4°C for storage.

T-vector cloning and nucleotide sequencing

All PCR products were purified from an agarose gel using the PCR product purification
kit (Viogene, Sunnyvale, USA) and cloned into a pGEM-T easy cloning vector (Promega).
For each cpDNA atpB-rbcL fragment, both strands were cycle-sequenced using the Taq Dye Deoxy Terminator Cycle Sequencing Kit (Applied Biosystem, Foster City, USA). Products
of the cycle sequencing reactions were run on an ABI 377XL automated sequencer (Applied
Biosystem). Cloned PCR products were sequenced using universal T7 forward (5'- TAATACGACTCACTATACGGG-3')
and SP6 reverse (5'-TATTTAGGTGACACTATAG-3') primers located on pGEM-T easy vector
termination sites. Five clones for each individual were randomly chosen and sequenced.

Sequence alignment and phylogenetic analysis

The chloroplast DNA sequences were aligned with the program Clustal X 1.81 [49]. Indels were excluded from the data analysis. Neighbor-joining (NJ) analysis based
on Kimura's [50] two-parameter distance was performed using the software MEGA 4.0 [51]. To evaluate clade support, 1,000 replicates of bootstrap analysis [52] were performed using fast heuristic search and TBR branch-swapping. The nested clade
analysis of Templeton et al. [53] provides a statistical framework for examining associations between the geographical
distribution of haplotypes and their genealogical relationships [54,55]. Pairwise comparisons between DNA haplotypes were calculated using MEGA 4.0 [51]. These were used to construct a minimum spanning network in a hierarchical manner
with the aid of the MINSPNET [56].

The nested clade analysis was accomplished by linking nucleotide haplotypes in a hierarchical
manner based on mutational changes. The linking rules start at the tips of the cladogram
and move one mutational step into the interior, uniting all haplotypes connected by
this procedure into a '1-step clade'. Following pruning off the initial 1-step clades
from the tips, the procedure is repeated on the more interior portions of the haplotype
tree until all haplotypes have been placed into 1-step clades. The next level of nesting
uses the 1-step clades as units, rather than individual haplotypes. The linking rules
are the same; however, '2-step clades' are now formed. The nesting procedure is repeated
until a nesting level is reached such that the next higher nesting level would result
in only a single category spanning the entire original haplotype network; via such
hierarchical linking, a nested network was drawn.

Population genetic analysis of the cpDNA

Levels of inter- and intra-population genetic diversity based on cpDNA were quantified
by indices of haplotype diversity (h) [57] and estimates of nucleotide difference (θ) [58] using DnaSP (Version 5) [59]. Patterns of geographical subdivision and gene flow were also estimated hierarchically
with the aid of DnaSP. Gene flow within and among regions or populations was approximated
as Nm, the number of female migrants per generation between populations. Nm was estimated using the expression FST = 1/(1 + 2 Nm), where N is the female effective population size and m is the female migration rate [60].

A model of "isolation by distance" was assessed by plotting pairwise Nm values against geographical distance [60]. The correlation between Nm and distance was determined by a regression of F-test over distances using SPSS program
version 6.0 [61]. Geographical associations of haplotypes and clades within the minimum spanning network
were tested using the program GeoDis [62]. Two statistics were calculated: 1) the clade distance, Dc, a measure of the geographical spread of a clade, and 2) the nested clade distance,
Dn, a measure of the geographical distribution of a clade relative to other clades in
the same higher level nesting category. These measures of geographical distribution
were used to infer historical processes following the methods of Templeton et al.
[53]. The nested clade analysis (NCA) has been recently challenged with serious flaws,
such as ignorance of the type I error [63]. The debates [64-66] have been lasting. In this study, we used this analysis to generate a phylogeography
hypothesis.

To estimate the population demographic trends for T. mongolica, we analyzed the cpDNA noncoding spacer data with BEAST v1.4 [67]. Bayesian skyline plot model was used to infer past demographic dynamics through
time, which uses standard Markov chain Monte Carlo (MCMC) sampling procedures calculated
from a sample of molecular sequences estimate a posterior distribution of effective
population size without dependence on a pre-specified parametric model of demographic
history [68]. In the study, an evolutionary rate for the chloroplast atpB-rbcL spacer of 3 × 10-9 substitutions per site per year was used [69,70]. The Bayesian skyline plot includes credibility intervals for the estimated effective
population size at every point in time, back to the most recent common ancestor of
the gene sequences. The credibility intervals represent both phylogenetic and coalescent
uncertainty.

Authors' contributions

TYC designed and conceived this study. XJG collected the samples and analyzed the
data. WKW, ZHL, CCH, KHH and WHH collected the genetic data. WKW and TYC analyzed
the data and wrote the manuscript. All authors read and approved the final manuscript.

Acknowledgements

This study was supported by the National Science Council of Taiwan, and the CAS Important
direction for knowledge innovation project of Chinese Academy of Sciences (No.KZCX2-EW-305).