Copyright This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.

Abstract

The lion Panthera leo is one of the world's most charismatic carnivores and is one of Africa's key predators. Here, we used a large dataset from 357 lions comprising 1.13 megabases of sequence data and genotypes from 22 microsatellite loci to characterize its recent evolutionary history. Patterns of molecular genetic variation in multiple maternal (mtDNA), paternal (Y-chromosome), and biparental nuclear (nDNA) genetic markers were compared with patterns of sequence and subtype variation of the lion feline immunodeficiency virus (FIVPle), a lentivirus analogous to human immunodeficiency virus (HIV). In spite of the ability of lions to disperse long distances, patterns of lion genetic diversity suggest substantial population subdivision (mtDNA ΦST = 0.92; nDNA FST = 0.18), and reduced gene flow, which, along with large differences in sero-prevalence of six distinct FIVPle subtypes among lion populations, refute the hypothesis that African lions consist of a single panmictic population. Our results suggest that extant lion populations derive from several Pleistocene refugia in East and Southern Africa (∼324,000–169,000 years ago), which expanded during the Late Pleistocene (∼100,000 years ago) into Central and North Africa and into Asia. During the Pleistocene/Holocene transition (∼14,000–7,000 years ago), another expansion occurred from southern refugia northwards towards East Africa, causing population interbreeding. In particular, lion and FIVPle variation affirms that the large, well-studied lion population occupying the greater Serengeti Ecosystem is derived from three distinct populations that admixed recently.

Author Summary

The lion Panthera leo, a formidable carnivore with a complex cooperative social system, has fascinated humanity since prehistoric times, inspiring hundreds of religious and cultural allusions. Here, we use a comprehensive sample of 357 individuals from most of the major lion populations in Africa and Asia. We assayed appropriately informative autosomal, Y-chromosome, and mitochondrial genetic markers, and assessed the prevalence and genetic variation of the lion-specific feline immunodeficiency virus (FIVPle), a lentivirus analogous to human immunodeficiency virus (HIV) that causes AIDS-like immunodeficiency disease in domestic cats. We compare the large multigenic dataset from lions with patterns of genetic variation of FIVPle to characterize the population-genomic legacy of lions. We refute the hypothesis that African lions consist of a single panmictic population, highlighting the importance of preserving populations in decline rather than prioritizing larger-scale conservation efforts. Interestingly, lion and FIVPle variation revealed evidence of unsuspected genetic diversity even in the well-studied lion population of the Serengeti Ecosystem, which consists of recently admixed animals derived from three distinct genetic groups.

Introduction

Lion fossils trace to the Late Pliocene in Eastern Africa and the Early Pleistocene in Eastern and Southern Africa, coincident with the flourishing of grasslands ∼2–1.5 million years ago [1],[2]. By the Mid Pleistocene (∼500,000 years ago), lions occupied Europe, and by the Late Pleistocene (∼130,000–10,000 years ago) lions had the greatest intercontinental distribution for a large land mammal (excluding man), ranging from Africa into Eurasia and the Americas [3]. Lions were extirpated from Europe 2,000 years ago and within the last 150 years from the Middle East and North Africa. Today, there are fewer than 50,000 free-ranging lions [4], which occur only in sub-Saharan Africa and the Gir Forest, India (Figure 1A).

Geographic location of the lion samples and the variability of host and viral genetic markers among lion populations.

Understanding the broader aspects of lion evolutionary history has been hindered by a lack of comprehensive sampling and appropriately informative genetic markers [5]–[9], which in species of modern felids requires large, multigenic data sets due to their generally rapid and very recent speciation [10],[11]. Nevertheless, the unique social ecology of lions [12]–[14] and the fact that lions have experienced well-documented infectious disease outbreaks, including canine distemper virus, feline parvovirus, calicivirus, coronavirus, and lion feline immunodeficiency virus (FIVPle) [15]–[18], provide a good opportunity to study lion evolutionary history using both host and virus genetic information. Indeed, population genetics of transmitted pathogens can accurately reflect the demographic history of their hosts [19],[20]. Unlike others among the 36 cat species, lions have a cooperative social system (prides of 2–18 adult females and 1–9 males) and their populations can have high frequencies of FIVPle, a lentivirus analogous to human immunodeficiency virus (HIV), which causes AIDS-like immunodeficiency disease in domestic cats. FIVPle is a retrovirus that integrates into the host genome and is transmitted by cell-to-cell contact, which in felids occurs during mating, fighting and mother-to-offspring interactions. Thus, viral dissemination is in part a function of the frequency of contact between infected and naïve lions within and among populations. The virus is quite genetically diverse in lions [15],[18], offering a unique marker for assessing ongoing lion demographic processes.

To unravel lion population demographic history we used a large multigenic dataset. Distinct sets of markers may not necessarily yield similar inferences of population history, as coalescent times vary as a function of their pattern of inheritance [21]. There is also a large variance in coalescent times across loci sharing a common pattern of inheritance, especially in complex demographic histories (Table 1). However, the accurate interpretation of the differences among loci can provide a more resolved and coherent population history, affording more nuanced insights into past demographic processes, levels of admixture, taxonomic issues, and the most appropriate steps for effective conservation and management of remaining populations.
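As a rough illustration of why the pattern of inheritance matters, consider a standard neutral model with constant population size, random mating, and an equal breeding sex ratio. These are simplifying assumptions introduced here, as are the symbols $N_e$ (effective number of breeding individuals) and $n$ (number of sampled gene copies); the expectations reported in Table 1 may rest on different assumptions. Under this model the expected time to the most recent common ancestor, in generations, is approximately:

```latex
\begin{align}
E\left[T_{\mathrm{MRCA}}^{\text{autosomal}}\right] &\approx 4N_e\left(1-\frac{1}{n}\right),\\
E\left[T_{\mathrm{MRCA}}^{\text{mtDNA}}\right] \approx E\left[T_{\mathrm{MRCA}}^{\text{Y}}\right] &\approx N_e\left(1-\frac{1}{n}\right).
\end{align}
```

Under these assumptions, uniparentally inherited haploid markers are expected to coalesce roughly four times faster than autosomal loci, so they respond more quickly to drift and bottlenecks.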

Expected and observed coalescent times for the different markers studied in lions according to their pattern of inheritance.

The goal of this study was to assess the evolutionary history of the lion by (1) characterizing lion population structure relative to patterns of FIVPle genetic variation, (2) detecting signatures of migration using both host and viral population genomics, and (3) reconstructing lion demographic history and discussing its implications for lion conservation. We assess genetic variation in 357 lions from most of their current distribution, including mitochondrial (mtDNA; 12S–16S, 1,882 bp), nuclear (nDNA) Y-chromosome (SRY, 1,322 bp) and autosomal (ADA, 427 bp; TF, 169 bp) sequences, and 22 microsatellite markers. We further document patterns of FIVPle variation in lions (FIVPlepol-RT gene, up to 520 bp).

Results/Discussion

Population Structure of Lion

Genetic analyses of 357 lions from throughout the extant species range showed that paternally inherited nDNA (SRY) and maternally inherited (mtDNA) sequence variation was generally low (only one paternal SRY-haplotype and 12 mtDNA haplotypes; π = 0.0066) (Figure 1; Figure S1; Tables S1 and S2). The most common mtDNA haplotype H11 was ubiquitous in Uganda/Tanzania and parts of Botswana/South Africa, H1 was common in Southern Africa, and H7 and H8 were unique to Asian lions. The autosomal nDNA sequences showed fairly distinct patterns of variation (Figure 1; Figure S1). Of the five ADA haplotypes, A2 was the most common and most widely distributed. The other four haplotypes, which are derived and much less common, included one (A5) that was fixed in Asian lions. The three TF haplotypes were more widely and evenly distributed.

Levels of population subdivision among lions were assessed using microsatellite and sequencing data. Eleven groups were identified using Bayesian analyses [22] and three-dimensional factorial correspondence analyses [23] (Figure 2; Table S3). Most clusters represented geographically circumscribed populations: Namibia (Nam), Kruger National Park (Kru), Ngorongoro Crater (Ngc), Kenya (Ken), Uganda (Uga), and Gir (Gir). Two distinct clusters were found in Botswana: Bot-I, which included lions from southern Botswana and Kalahari (South Africa) (Fk = 0.24), and Bot-II, found exclusively in northern Botswana (Fk = 0.18). Surprisingly, three distinct clusters were found in a single geographical locale (approximately 60×40 km square) in the large panmictic population of the Serengeti National Park (Ser-I/Ser-II/Ser-III) (Fk = 0.18, 0.21, and 0.15, respectively).

Two captive lions from Angola (Ang), one from Zimbabwe (Zbw) and four Morocco Zoo Atlas lions (Atl; presently extinct from the wild) (Figure 1A) were included in the analyses to explore the relationship of lions from more isolated, endangered, or depleted areas. Ang and Zbw lions were assigned to Bot-II (q = 0.90 and 0.87; 90%CI: 0.47–1.00) and Kru (q = 0.85; 90%CI: 0.52–1.00) (Bayesian analyses [22]) populations, respectively, as expected based on their geographical proximity. However, these lions differed from Bot-II and Kru by up to 8 mtDNA mutations, sharing haplotypes with the Atl lions (H5 in Ang and H6 in Zbw) (Figure 1B and 1C). The Atl lions did not group in a unique cluster.

Both nDNA and mtDNA pairwise genetic distances among the 11 lion populations showed a significant relationship with geographic distance (R2 = 0.75; Mantel's test, P = 0.0097; and R2 = 0.15; Mantel's test, P = 0.0369; respectively) (Figure 3). The significant positive and monotonic correlation across all the scatterplot pairwise comparisons for the nDNA markers (bi-parental) was consistent with isolation-by-distance across the sampled region. However, the correlation between nDNA FST and geographic distance considerably decreased when the Asian Gir population was removed (R2 = 0.19; Mantel's test, P = 0.0065) suggesting that caution should be taken in interpreting the pattern of isolation-by-distance in lions. We further compared linearized FST estimates [24] plotted both against the geographic distance (model assuming habitat to be arrayed in an infinite one-dimensional lattice) and the log geographic distance (model assuming an infinite two-dimensional lattice). The broad distribution of lions might suggest a priori that a two-dimensional isolation-by-distance model would provide the best fit for the nDNA data (R2 = 0.25; Mantel's test, P = 0.0022), but instead the one-dimensional isolation-by-distance model performed better (R2 = 0.71; Mantel's test, P = 0.0476) (Figure S2).

The pattern observed for the mtDNA (maternal) was more complex. While there was a significant relationship between mtDNA FST and geographic distance, there was an inconsistent pattern across broader geographic distances (Figure 3). This is partly due to the fixation or near fixation of haplotype H11 in six populations and the fixation of a very divergent haplotype H4 in the Ken population (Figure 1B and 1C). The removal of the Ken population considerably increased the correlation between mtDNA FST and geographic distance (R2 = 0.27; Mantel's test, P = 0.0035). Thus, the null hypothesis of regional equilibrium for mtDNA across the entire sampled region is rejected despite the possibility that isolation-by-distance may occur regionally.

These contrasting nDNA and mtDNA results may be indicative of differences in dispersal patterns between males and females, which would be consistent with evidence that females are more philopatric than males. Alternatively, selection for matrilineally transmitted traits upon which neutral mtDNA alleles hitchhike is possible, given the low values of nucleotide diversity of the mtDNA (π = 0.0066). A similar process has been suggested in whales (π = 0.0007) [25] and African savannah elephants (π = 0.0200) [26], both of which have female philopatry and, like lions, a matriarchal social structure. However, genetic drift tends to overwhelm selection in small isolated populations, predominantly affecting haploid elements due to their lower effective population size (Table 1). Therefore, we suggest that the contrasting results obtained for nDNA and mtDNA are more likely further evidence that lion populations underwent severe bottlenecks. The highly structured lion matrilines comprise four monophyletic mtDNA haplo-groups (Figure 4A; Figure S3). Lineage I consisted of a divergent haplotype H4 from Ken, lineage II was observed in most Southern Africa populations, lineage III was widely distributed from Central and Northern Africa to Asia, and lineage IV occurred in Southern and Eastern Africa.

Evolutionary relationships of the host and viral genetic markers among lion populations.

Population Structure of FIVPle

Seroprevalence studies indicate that FIVPle is endemic in eight of the 11 populations but absent from the Asian Gir lions in India and from the Namibia and southern Botswana/Kalahari regions (Nam/Bot-I) in Southwest Africa (Figure 1B). Phylogenetic analysis of the conserved pol-RT region in 117 FIV-infected lions depicted monophyletic lineages [15],[18] that affirm six distinct subtypes (A–F) that are distributed geographically in three distinct patterns (Figure 4B; Figure S4). First, multiple subtypes may circulate within the same population, as exemplified by subtypes A, B and C, all ubiquitous within the three Serengeti lion populations (Ser-I, Ser-II and Ser-III), and subtypes A and E within lions of Botswana (Bot-II) (Figure 4B and 4C and Figure S4). Second, unique FIVPle subtypes may be restricted to one location, as with subtype F in Kenya, subtype E in Botswana (Bot-II), subtype C in Serengeti, and subtype D in Kruger Park (Figure 4B and 4C and Figure S4). Third, intra-subtype strains cluster geographically, as shown by distinct clades within subtype A that were restricted to lions within Kruger Park, Botswana and Serengeti and within subtype B that corresponded to Uganda, Serengeti and Ngorongoro Crater lions (Figure 4B and 4C and Figure S4).

Not unexpectedly, FIVPle pairwise genetic distances, represented as population FST among the eight FIV-infected lion populations, were not significantly correlated with geographic distance (R2 = 0.08; Mantel's test, P = 0.165) (Figure 3), affirming that patterns of viral dissemination do not conform to a strict isolation-by-distance model. Rather, the two distinct clusters observed (Figure 3) reflect the complex distribution of FIVPle among African lions. Indeed, despite the low geographic distance within East-African lion populations, the FIVPle genetic divergence showed a broader range in FST (0.03 to 0.79 for most of the first cluster; Figure 3). By contrast, approximately half of that range in FST (0.26 to 0.69 for the second cluster; Figure 3) was observed between East and Southern Africa in spite of their large geographic separation. In contrast with the patterns observed in lions, linearized FST estimates [24] for FIVPle were better correlated with log geographic distance (two-dimensional lattice model) (R2 = 0.15) than with geographic distance (one-dimensional model) (R2 = 0.02), although in both cases the Mantel's test was not significant (P>0.2474) (Figure S2).

Natural History of Lions as Inferred from Lion and FIVPle Markers

The mtDNA coalescence dating suggested that the East African lineage I (Ken haplotype H4) had an old origin of ∼324,000 years (95% CI: 145,000–502,000). Extant East African populations (Ken/Ngc/Ser-I/Ser-II/Ser-III) also showed slightly but significantly higher nDNA allelic richness and genetic diversity (Table S4) relative to populations to the south (Kru/Nam/Bot-I/Bot-II) and north (Uga/Gir) (A = 2.43, 2.39, and 1.62, P = 0.021; HO = 0.64, 0.62, and 0.34, P = 0.019; respectively). Moreover, the FIVPle subtype diversity was higher in East African clades (exhibiting four out of the six known viral strains), including the most divergent FIVPle subtype C (Figure 4B and 4C). These genetic data from lions and FIVPle are consistent with the older origin of extant East African lions, which is further supported by the oldest lion fossils discovered in East Africa [1].

Relative to East Africa, Southern lions have a slightly more recent mtDNA coalescence. Lineage II, found in Nam, Bot-II and Kru, has an estimated coalescence of 169,000 years (95% CI: 34,000–304,000), and the more widespread lineage IV, found in the Southern populations of Bot-I, Bot-II and Kru as well as the Eastern populations of Ser (I, II, and III), Ngc and Uga, coalesces ∼101,000 years ago (95% CI: 11,000–191,000). However, the similar levels of nDNA genetic diversity, the occurrence of an exclusively Southern mtDNA lineage II and highly divergent FIVPle subtypes (subtype D found only in Kru and subtype E exclusive to Bot-II) suggest that both East and Southern Africa were important refugia for lions during the Pleistocene. Therefore, the co-occurrence of divergent mtDNA haplotypes (6 to 10 mutations; Figure 1B and 1C) in southern populations may be the consequence of further isolation within refugia during colder climatic periods. Contemporary fragmentation of lion populations could further explain the results of nested-clade phylogeographical analysis (NCPA [27]) (Figure S5), which inferred restricted gene flow with isolation-by-distance between mtDNA haplotypes H9 (Bot-II) and H10 (Kru) (χ2 = 10.00, P = 0.0200), between haplotypes H1 (Bot-II/Nam) and H2 (Kru) (χ2 = 71.00, P≤0.0001), and between haplotypes H9–H10 (Bot-II/Kru) and haplotypes H11–H12 (Bot-I/Kru/Ser/Ngc/Uga) (χ2 = 187.83, P≤0.0001).

Further isolation within refugia (sub-refugia) may also have occurred in East Africa. This is suggested by the distinctive mtDNA haplotype H4 and the unique FIVPle subtype F found in the Kenya population, which may have resulted from reduced gene flow across the Rift Valley, a scenario that has been suggested for several bovid and carnivore populations (see [28] and references therein).

The best example of concordance between host genome markers and viral transmission patterns is observed in the Serengeti National Park in Tanzania. Our previous findings described markedly high levels of FIVPle subtypes A, B and C circulating within the Serengeti lion population, to such an extent that 43% of the lions sampled were multiply infected with two or three subtypes [15],[18]; these lions were hypothesized to represent recent admixture of three formerly separated populations. This result is confirmed here by lion genomic markers (Figure 2). Further, although lions within the Serengeti can be assigned to one of three populations (Ser-I, Ser-II or Ser-III) by host genomic markers, FIVPle subtypes are distributed ubiquitously in all three, characteristic of rapid horizontal retroviral transmission subsequent to host population admixture. The possible isolating mechanism remains to be elucidated, as there is no apparent barrier to gene flow in this ecosystem.

Genomic Signatures Left by Migration

Based on patterns of genetic diversity and phylogenetic analysis of lion nDNA/mtDNA and FIVPle markers, we propose a scenario of a period of refugia/isolation in the Late Pleistocene followed by two major lion expansions across Africa and Asia. The first expansion, supported by the mtDNA NCPA [27] (χ2 = 690.00, P≤0.0001; Figure S5), was a long-distance colonization of mtDNA lineage III (Gir/Atl/Ang/Zbw) around 118,000 years ago (95% CI: 28,000–208,000), with subsequent fragmentation of haplotypes H5–H6 into Central and North Africa and haplotypes H7–H8 into West Asia (M1; Figure 1A). Support for this initial expansion is also found in nDNA. The ADA haplotype A5 fixed in Gir is also present in Ken, Ser-II, and Ser-III, suggesting that lions likely colonized West Asia from the East Africa refugia (Figure 1B). Such an expansion may have been favored by the start of a warmer and less arid period in Africa 130,000–70,000 years ago [29]. This “out-of-Africa event” would have occurred much later than the initial lion expansion through Eurasia based on fossils (∼500,000 years ago) [3]. It is likely that multiple lion expansions occurred in the Pleistocene, as occurred with humans [21].

A second, more recent lion expansion probably occurred at the Pleistocene/Holocene transition, this one from Southern Africa toward East Africa (M2; Figure 1A, Figure 3). This is reflected in the mtDNA lineage IV, where haplotypes present in Southern lions are basal (older) to those found in the East. Overall, mtDNA population nucleotide diversity decreases from Southern to East Africa (Figure 1B and 1C), a finding supported by pairwise mismatch analysis [30] (raggedness, r = 0.086; P<0.001). The fixation of mtDNA haplotype H11 in Bot-I (otherwise fixed only in East Africa populations) suggests that the colonizing lions expanded northwards from the Kalahari Desert, which included bush, woodland and savannah habitats during the climatic fluctuations of the Pleistocene [31]. This expansion would have occurred relatively recently, as the single rare tip mtDNA haplotype H12, found only in Ser-I, is derived from the interior widespread haplotype H11 (∼14,000–7,000 years; given one mtDNA substitution every 7,000 years; Table 1). This expansion is also supported by FIVPle subtype A, where haplotypes present in Southern lions (Kru and Bot-II) are basal to those found in the East (Ser-I, Ser-II and Ser-III) and a decrease in the nucleotide diversity of this FIVPle subtype is observed from Southern (π = 0.15) to Eastern Africa (π = 0.03) (Figure 3B). Interestingly, a similar northward colonization process from Southern Africa has been suggested for some lion prey species, namely the impala, greater kudu, and wildebeest [32],[33].
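The dating of haplotype H12 above is simple rate arithmetic, which can be sketched as follows. The rate of one substitution per 7,000 years is taken from the text; reading the ∼14,000–7,000 year interval as one to two substitutional steps is an assumption made here for illustration, and the function name is hypothetical.

```python
# Rate quoted in the text (Table 1): one mtDNA substitution every 7,000 years.
YEARS_PER_SUBSTITUTION = 7_000


def divergence_age(substitutions, years_per_substitution=YEARS_PER_SUBSTITUTION):
    """Approximate age in years implied by a number of substitutions."""
    return substitutions * years_per_substitution


# Assumption for illustration: H12 differs from H11 by one to two steps.
ages = [divergence_age(k) for k in (1, 2)]
```

This is only a point conversion; it ignores the stochastic variance of the substitution process, which is large for so few mutations.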

Utility of Population Genomic Datasets

If we had restricted our inferences to mtDNA, we might have concluded that East African lion populations, which are fixed or nearly fixed for haplotype H11, went extinct during the Pleistocene/Holocene transition (similar to the well-known mega-fauna extinctions of the Late Pleistocene [34]) and were then colonized by Southern populations. However, our population genomics data better fit a scenario of lion population expansion and interbreeding rather than simple replacement. First, genetic diversity and allelic richness at nDNA are slightly higher in East African populations relative to those in Southern Africa. This is contrary to the expected pattern of population expansion, in which there is usually a progressive decline in genetic diversity and allelic richness. Second, Ser lions carry two diverse FIVPle subtypes found only in East Africa (B–C), and not only FIVPle subtype A, which was presumably introduced into East Africa coincident with the northward mtDNA expansion event from the south. Third, the East African FIVPle subtype B found in Uga/Ser-I/Ser-II/Ser-III/Ngc showed evidence of a population expansion (raggedness, r = 0.004; P<0.01; Fs = −20.37; P<0.00001) and the highest nucleotide diversity observed within FIVPle subtypes (π = 0.09). Fourth, the FIVPle subtype diversity is higher in East African clades (four out of the six viral strains).

The utility of FIVPlepol-RT as a marker of lion population structure and natural history is that it can be informative on a contemporaneous time scale, though it may be less useful at capturing more ancient demographic events. The extreme divergence among FIVPle subtypes, considered with high sero-prevalence in eight of the 11 lion populations, and combined with patterns of geographic concordance, supports the hypothesis that FIVPle is not a recent emergence within modern lions [35]. Populations that harbor one private FIVPle subtype, such as Ken (subtype F), Bot-II (subtype E), and Kru (subtype D), must have been sufficiently isolated for enough time for the virus to evolve into unique subtypes, a result corroborated by the high nDNA and mtDNA genetic structure present in these lion populations (Figure 4). Thus, it is possible that the initial emergence of FIVPle pre-dates the Late-Pleistocene expansions of contemporary lion populations [36], but present-day distributions are more useful indicators of very recent host population dynamics, a result also observed with FIVPco in a panmictic population of pumas in western North America [19].

Conservation Implications

Accurate interpretation of past and contemporary population demographic scenarios is a primary goal for the effective conservation of endangered species. In this study, we found substantial population subdivision, reduced gene flow, and large differences in FIVPle sequence and sero-prevalence among lion populations, as well as evidence of historic secondary contact between populations (Figure 3C; Tables S4 to S9). The very low population level of mtDNA nucleotide diversity, the number of haplotypes private to a single population (Figure 1), and probably also the lack of SRY genetic variation across all male lions (haplotype S1, n = 183) suggest that lion numbers diminished considerably following the Late Pleistocene. The last century's reduction in lion distribution further eroded the species' genomic diversity, and microsatellite variation suggested recent population bottlenecks in seven out of the 11 populations (standardized differences test, P<0.05; Table S5) [37].

Although we did not explicitly try to address the adequacy of lion subspecies designations (currently only one African subspecies is widely recognized) [38],[39], we provide strong evidence that there has been no substantial genetic exchange of matrilines among existing populations, as the AMOVA [40] within-population component was uniformly high in all distinct subdivision scenarios (ΦST≈0.920; P<0.0001; three to six groups; Table S6). Similarly, significant population structure was detected from nDNA (FST = 0.18), with low levels of admixture evident from Bayesian analysis [22] (α = 0.033). Therefore, employing a bottom-up perspective that prioritizes populations, rather than large-scale units (e.g. all African lions), might preserve and maintain lion diversity and evolutionary processes most efficiently [41].

Sero-Prevalence and Molecular Genetics of FIVPle

Western blots using domestic cat and lion FIV as antigen were performed as previously described [46],[47]. The supernatant from virus-infected cells was centrifuged at 200 g for 10 min at 5°C. The resultant supernatant was centrifuged at 150,000 g at 4°C for 2 hours. Pelleted viral proteins were resuspended in 1/20th of the original volume and total protein content was assayed using the Biorad Protein Assay. Twenty µg of viral protein were run on 4–20% Tris-Glycine gels and transferred to PVDF membranes (BioRad). Membrane strips were exposed for 2–12 h to a 1∶25 or 1∶200 dilution of serum or plasma. After washing, samples were labeled with goat anti-cat HRP or phosphatase-conjugated antibody (KPL laboratories) at a 1∶2000 dilution, washed, and incubated in ECL Western Blotting detection reagents (Amersham Biosciences) for 2 min, then exposed to Lumi-Film Chemiluminescent Detection Film (Boehringer Mannheim) or incubated in BCIP/NBT phosphatase substrate (KPL laboratories) for 15 min [46]–[48]. Results were visualized and scored manually based on the presence and intensity of antibody binding to the p24 gag capsid protein.

Statistical Analyses

We used Genetix 4.02 [49], Genepop 3.3 [50], Microsat [51], and DnaSP 4.10 [52] to (i) calculate the percentage of polymorphic loci (P95), number of alleles per locus (A), observed and expected heterozygosity (HO and HE), and number of unique alleles (AU); (ii) assess deviations from HWE; (iii) estimate the coefficient of differentiation (FST); and (iv) estimate nucleotide (π) and haplotype (h) diversity.
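Several of the descriptive statistics listed above have simple closed forms. The following minimal sketch shows observed heterozygosity, Nei's unbiased expected heterozygosity, and nucleotide diversity computed from toy inputs. It is an illustration only and is not the implementation used by Genetix, Genepop, Microsat, or DnaSP; all function names are hypothetical, and the sketch assumes complete data (no missing genotypes, no alignment gaps).

```python
from collections import Counter
from itertools import combinations


def observed_heterozygosity(genotypes):
    """Fraction of diploid genotypes (allele pairs) that are heterozygous."""
    het = sum(1 for a, b in genotypes if a != b)
    return het / len(genotypes)


def expected_heterozygosity(alleles):
    """Nei's unbiased expected heterozygosity for one locus.

    `alleles` is a flat list of gene copies (two per diploid individual).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two gene copies")
    freqs = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site among equal-length aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)
```

For example, the four toy genotypes `[(100, 100), (100, 104), (104, 108), (108, 108)]` give an observed heterozygosity of 0.5, while the eight gene copies they contain give an unbiased expected heterozygosity of 0.75.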

We tested the hypothesis that all loci are evolving under neutrality for both the lion and the FIVPle data. For frequency data, we used the method described by Beaumont and Nichols [53] and implemented in Fdist (http://www.rubic.rdg.ac.uk/˜mab/software.html). The FST values estimated from microsatellite loci plotted against heterozygosity showed that all values fall within the expected 95% confidence limit and consequently no outlier loci were identified. For sequence data (lion nDNA/mtDNA and FIVPlepol-RT), we ruled out any significant evidence for genetic hitchhiking and background selection by assessing Fu and Li's D* and F* tests [54] and Fu's FS statistics [55].

A Bayesian clustering method implemented in the program Structure [22] was used to infer the number of populations and assign individual lions to populations based on multilocus genotype (microsatellites) and sequence data (ADA, TF, and mtDNA genes) and without incorporating sample origin. For haploid mtDNA data, each observed haplotype was coded with a unique integer (e.g. 100, 110) for the first allele and missing data for the second (Structure [22] analyses with or without the mtDNA data were essentially identical). For K population clusters, the program estimates the probability of the data, Pr(X|K), and the probability of individual membership in each cluster using a Markov chain Monte Carlo method under the assumption of Hardy-Weinberg equilibrium (HWE) within each cluster. Initial testing of HWE in each of the populations defined by the geographic origin of sampling revealed no significant deviation from HW expectations with the exception of the Ser and Bot populations (later subdivided by Structure [22] into 3 and 2 clusters, respectively; such deviations from HW expectations were interpreted as evidence of further population structuring). We conducted six independent runs with K = 1–20 to guide an empirical estimate of the number of identifiable populations, assuming an admixture model with correlated allele frequencies and with burn-in and replication values set at 30,000 and 10^6, respectively. Structure also estimates allele frequencies of a hypothetical ancestral population and an alpha value that measures admixed individuals in the data set. The assignment of admixed individuals to populations using Structure [22] has been considered in subsequent population analyses. For each population cluster k, the program estimates Fk, a quantity analogous to Wright's FST, but describing the degree of genetic differentiation of population k from the ancestral population.
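The haploid coding scheme described above (a unique integer for the first allele and missing data for the second) can be sketched as follows. The function name, the starting code of 100, and the step of 10 are illustrative choices that mirror the "100, 110" example in the text; the missing-data value of -9 is an assumption and should match whatever missing-data code is configured for the Structure run.

```python
# Assumed missing-data code; it must match the value configured for the run.
MISSING = -9


def code_haplotypes(haplotypes, start=100, step=10):
    """Code haploid haplotype labels as pseudo-diploid genotype rows.

    Each distinct haplotype receives a unique integer in order of first
    appearance; the second allele of every individual is set to MISSING.
    Returns (rows, codes), where rows holds one (allele1, allele2) pair per
    individual and codes maps each haplotype label to its integer.
    """
    codes = {}
    rows = []
    for hap in haplotypes:
        if hap not in codes:
            codes[hap] = start + step * len(codes)
        rows.append((codes[hap], MISSING))
    return rows, codes
```

For instance, `code_haplotypes(["H11", "H1", "H11", "H4"])` assigns 100 to H11, 110 to H1, and 120 to H4, with every second allele recorded as missing.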

Patterns of gene flow and divergence among populations were described using a variety of tests. First, to visualize subtle relationships among individual autosomal genotypes, three-dimensional factorial correspondence analyses [23] (FCA) were performed in Genetix [49], which graphically projects the individuals onto the factor space defined by the similarity of their allelic states. Second, neighbor-joining (NJ) analyses based on Cavalli-Sforza & Edwards' chord genetic distance [56] (DCE) were performed in Phylip 3.6 [57], and tree topology support was assessed with 100 bootstrap replicates. Third, the difference in average HO and A was compared among population groups using a two-sided test in Fstat 2.9.3.2 [58], which assesses the significance of the statistic OSx using 1,000 randomizations. Fourth, the equilibrium between drift and gene flow was tested using a regression of pairwise FST on the geographic distance matrix among all populations for host nDNA (microsatellites)/mtDNA and FIVPle data. A Mantel test [59] was used to estimate the 95% upper probability for each matrix correlation. Assuming a stepping stone model of migration where gene flow is more likely between adjacent populations, one can reject the null hypothesis that populations in a region are at equilibrium if (1) there is a non-significant association between genetic and geographic distances, and/or (2) a scatterplot of the genetic and geographic distances fails to reveal a positive and monotonic relationship over all distance values of a region [60]. We also evaluated linearized FST [i.e. FST/(1−FST)] [24] among populations. We tested two competing models of isolation-by-distance, one assuming the habitat to be arrayed in an infinite one-dimensional lattice and another assuming an infinite two-dimensional lattice. Both models showed that genetic differentiation increased with raw and log-transformed Euclidean distances, respectively [24]. We determined the confidence interval of the slope of the regression for the nDNA data using a non-parametric ABC bootstrap [61] in Genepop 4.0 [62].
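As a rough illustration of the isolation-by-distance test, the sketch below linearizes pairwise FST as FST/(1−FST) and runs a simple one-sided permutation Mantel test against geographic distance. It is a generic textbook version under stated assumptions, not the implementation in the programs cited above; the function names and the four-population toy matrices are hypothetical.

```python
# Minimal sketch of a one-sided permutation Mantel test on linearized FST.
# ASSUMPTIONS: all function names and the toy data are hypothetical; real
# analyses would use the observed pairwise FST and distance matrices.
import random
from itertools import combinations


def linearize_fst(fst):
    """Linearized FST, i.e. FST / (1 - FST)."""
    return fst / (1.0 - fst)


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(gen, geo, permutations=999, seed=0):
    """Correlate the upper triangles of two symmetric matrices and estimate a
    one-sided p-value by jointly permuting rows and columns of `gen`."""
    n = len(gen)
    pairs = list(combinations(range(n), 2))
    geo_vec = [geo[i][j] for i, j in pairs]
    observed = pearson([gen[i][j] for i, j in pairs], geo_vec)
    rng = random.Random(seed)
    order = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [gen[order[i]][order[j]] for i, j in pairs]
        if pearson(permuted, geo_vec) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Toy example: four hypothetical populations spaced evenly along a line,
# with FST growing by 0.05 per distance step.
n_pops = 4
geo = [[abs(i - j) for j in range(n_pops)] for i in range(n_pops)]
fst = [[0.05 * abs(i - j) for j in range(n_pops)] for i in range(n_pops)]
lin = [[linearize_fst(v) for v in row] for row in fst]
r, p = mantel(lin, geo)
print(f"Mantel r = {r:.3f}, one-sided p = {p:.3f}")
```

With only four populations the number of distinct permutations is small, so the p-value in this toy case is coarse; the sketch is meant to show the mechanics only.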

The demographic history of populations was compared using a variety of estimators based on coalescent theory. First, signatures of old demographic population expansion were investigated for mtDNA and FIVPle pol-RT haplotypes using pairwise mismatch distributions [63] in DnaSP [52]. The goodness-of-fit of the observed data to a simulated model of expansion was tested with the raggedness (r) index [64].
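The mismatch distribution and raggedness statistic can be sketched as follows. The sketch assumes aligned, equal-length sequences without gaps and uses the usual definition of the raggedness index as the sum of squared differences between successive mismatch-class frequencies; the function names and the toy sequences are hypothetical, and this is not the DnaSP implementation.

```python
# Minimal sketch: pairwise mismatch distribution and raggedness index.
# ASSUMPTIONS: sequences are aligned, equal length, and gap-free; names and
# toy data are hypothetical.
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of sequence pairs differing at 0, 1, ..., d sites,
    where d is the largest observed number of pairwise differences."""
    counts = Counter(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    d = max(counts)
    total = sum(counts.values())
    return [counts.get(i, 0) / total for i in range(d + 1)]


def raggedness(freqs):
    """Sum over i = 1..d+1 of (x_i - x_{i-1})^2, taking x_{d+1} = 0."""
    padded = list(freqs) + [0.0]
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


toy = ["AAAA", "AAAT", "AATT", "TTTT"]  # hypothetical aligned haplotypes
freqs = mismatch_distribution(toy)
print(freqs, raggedness(freqs))
```

Smooth, unimodal distributions give low raggedness values, which is the pattern generally expected after population expansion, whereas multimodal distributions give higher values.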

Second, the occurrence of recent bottlenecks was evaluated for microsatellite data using the method of Cornuet & Luikart [37] in Bottleneck [65] with 10,000 iterations. This approach, which exploits the fact that rare alleles are generally lost first through genetic drift after a reduction in population size, employs the standardized differences test, which is the most appropriate and powerful when using 20 or more polymorphic loci [37]. Tests were carried out using the stepwise mutation model (SMM), which is a conservative mutation model for the detection of bottleneck signatures with microsatellites [66].

Third, to discriminate between recurrent gene flow and historical events we used the nested-clade phylogeographical analysis [27],[67] (NCPA) for the mtDNA data. When the null hypothesis of no correlation between genealogy and geography is rejected, biological inferences are drawn using a priori criteria. The NCPA started with the estimation of a 95% statistical parsimony [68] mtDNA network in Tcs 1.20 [69]. Tree ambiguities were further resolved using coalescent criteria [70]. The network was converted into a series of nested branches (clades) [71],[72], which were then tested against their geographical locations through a permutational contingency analysis in GeoDis 2.2 [73]. The inferences obtained were also corroborated with the automated implementation of the NCPA in ANeCA [74]. To address potential weaknesses in some aspects of the NCPA analysis [75],[76], we further validated the NCPA inferences with independent methods for detecting restricted gene flow/isolation-by-distance (using matrix correlation of pairwise FST and geographic distance) and population expansion (using pairwise mismatch distributions).

Fourth, to test the significance of the total mtDNA genetic variance, we conducted hierarchical analyses of molecular variance [40] (AMOVA) using Arlequin 2.0 [77]. Total genetic variation was partitioned to compare the amount of difference among population groups, among populations within each group, and within populations.

Phylogenetic relationships among mtDNA and FIVPle pol-RT sequences were assessed using Minimum evolution (ME), Maximum parsimony (MP), and Maximum likelihood (ML) approaches implemented in Paup [78]. The ME analysis for mtDNA consisted of NJ trees constructed from Kimura two-parameter distances followed by a branch-swapping procedure; for FIVPle data, it employed the same parameter estimates as were used in the ML analysis. The MP analysis was conducted using a heuristic search, with random additions of taxa and tree-bisection-reconnection branch swapping. The ML analysis was done after selecting the best evolutionary model fitting the data using Modeltest 3.7 [79]. The reliability of tree topologies was assessed with 100 bootstrap replicates. For the FIVPle data, the reliability of the tree topology was further assessed through additional analyses using 520 bp of FIVPle pol-RT sequences in a representative subset of individuals.

The time to the most recent common ancestor (TMRCA) for the ADA and TF haplotypes was estimated following Takahata et al. [80]: we calculated the ratio of the average nucleotide difference within the lion sample to one-half the average nucleotide difference between leopards (P. pardus) and lions, and multiplied this ratio by an estimate of the divergence time between lions and leopards (2 million years, based on undisputed lion fossils in Africa) [81],[82]. The mtDNA TMRCA was estimated with a linearized tree method in Lintree [83], using the equation H = 2μT, where H was the branch height (correlated to the average pairwise distance among haplotypes), μ the substitution rate, and T the divergence time. Leopard and snow leopard (P. uncia) sequences were used as outgroups. Inference of the TMRCA for microsatellite loci followed Driscoll et al. [6], where the estimate of microsatellite variance in average allele repeat-size was used as a surrogate for evolutionary time based on the rate of allele range reconstitution subsequent to a severe founder effect. Microsatellite allele variance has been shown to be a reliable estimator for microsatellite evolution and demographic inference in felid species [6].
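The two TMRCA calculations described above reduce to simple arithmetic, sketched here. All numerical inputs except the 2-million-year lion-leopard divergence time are hypothetical placeholders chosen to make the arithmetic easy to follow; they are not values from this study, and the function names are likewise assumptions.

```python
# Minimal sketch of the two TMRCA calculations.
# ASSUMPTIONS: pi_within, d_between, height, and mu below are invented
# illustrative values, not estimates from the study.

def tmrca_from_ratio(pi_within, d_between, divergence_time):
    """Ratio of within-species diversity to one-half the between-species
    difference, scaled by the species divergence time."""
    return pi_within / (0.5 * d_between) * divergence_time


def time_from_branch_height(height, mu):
    """Solve H = 2 * mu * T for T."""
    return height / (2.0 * mu)


LION_LEOPARD_DIVERGENCE = 2_000_000  # years, as stated in the text

# Hypothetical nuclear example: within-lion difference 0.001,
# lion-leopard difference 0.02.
t_nuclear = tmrca_from_ratio(0.001, 0.02, LION_LEOPARD_DIVERGENCE)

# Hypothetical mtDNA example: branch height 0.004 substitutions per site,
# substitution rate 1e-8 per site per year.
t_mtdna = time_from_branch_height(0.004, 1e-8)

print(round(t_nuclear), round(t_mtdna))
```

Both toy inputs were chosen so that each calculation yields about 200,000 years, which makes it easy to check the formulas by hand.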

Supporting Information

Figure S1

Genetic variation of 12S–16S (mtDNA) and ADA and TF (nDNA) genes in lions. (A) Haplotypes and variable sites for the 12S–16S mtDNA region surveyed in lions (total length 1,882 bp). Position 1 corresponds to position 1441 of the domestic cat (Felis catus) mtDNA genome (GenBank U20753). The “-” represents a gap and “.” matches the nucleotide in the first sequence. Shading indicates a fixed difference in the mtDNA lineage. (B) Haplotypes and variable sites for the ADA gene segment surveyed in lions (total length 427 bp). (C) Haplotypes and variable sites for the TF gene segment surveyed in lions (total length 427 bp).

Figure S4

Phylogenetic relationships of the FIVPle pol-RT sequences. Neighbour-joining tree of the 301 bp FIVPle pol-RT sequences. The distinct FIVPle subtypes are labelled A to F. Bootstrap (BPS) values are placed at each branch point (ME/MP/ML); in parentheses are the BPS values obtained for a tree established with 520 bp of FIVPle pol-RT sequence for a representative subset of individuals.

Table S9

Acknowledgments

Tissues were obtained in full compliance with specific Federal Fish and Wildlife Permits. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government. We would like to thank A. Beja-Pereira, M. Branco and J. Martenson for suggestions and technical assistance. Comments made by the Associate Editor A. Estoup and three anonymous referees improved a previous version of this manuscript.

Footnotes

The authors have declared that no competing interests exist.

This research has been funded in part by federal funds from the National Cancer Institute, National Institutes of Health (N01-CO-12400), by the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research, by the National Science Foundation (No. 0343960) and by the Portuguese Foundation for Science and Technology (PTDC/BIA-BDE/69144/2006). AA received a grant from the Portuguese Foundation for Science and Technology (SFRH/BPD/5700/2001).

61. Leblois R, Estoup A, Rousset F. Influence of mutational and sampling factors on the estimation of demographic parameters in a “continuous” population under isolation by distance. Mol Biol Evol. 2003;20:491–502.