Abstract

Understanding how speciation relates to ecological divergence has long fascinated biologists. It is assumed that ecological divergence is essential to sympatric speciation, as a mechanism to avoid competition and eventually lead to reproductive isolation, while divergence in allopatry is not necessarily associated with niche differentiation. The impact of the spatial context of divergence on the evolutionary rates of abiotic dimensions of the ecological niche has rarely been explored for an entire clade. Here, we compare the magnitude of climatic niche shifts between sympatric versus allopatric divergence of lineages in butterflies. By combining next-generation sequencing, parametric biogeography and ecological niche analyses applied to a genus-wide phylogeny of Palaearctic Pyrgus butterflies, we compare evolutionary rates along eight climatic dimensions across sister lineages that diverged in large-scale sympatry versus allopatry. In order to examine the possible effects of the spatial scale at which sympatry is defined, we considered three sets of biogeographic assignments, ranging from narrow to broad definition. Our findings suggest higher rates of niche evolution along all climatic dimensions for sister lineages that diverge in sympatry, when using a narrow delineation of biogeographic areas. This result contrasts with significantly lower rates of climatic niche evolution found in cases of allopatric speciation, despite the biogeographic regions defined here being characterized by significantly different climates. Higher rates in allopatry are retrieved when biogeographic areas are too widely defined—in such a case allopatric events may be recorded as sympatric. Our results reveal the macro-evolutionary significance of abiotic niche differentiation involved in speciation processes within biogeographic regions, and illustrate the importance of the spatial scale chosen to define areas when applying parametric biogeographic analyses.

1. Introduction

How speciation mechanisms shape biodiversity on Earth is a long-standing question in biology [1–6]. Because speciation depends on the response of species to abiotic conditions and biotic interactions, understanding diversification processes relates directly to species ecology [7]. While divergent selection pressures are considered as the major drivers of lineage differentiation, the importance of shifts in ecological niche may vary according to the geographical mode of divergence.

In sympatric speciation, reproductive isolation is typically caused by differential adaptation of populations to contrasting environmental conditions [8–11], resulting in ecological speciation [12–15]. Sympatric speciation thus relies on adaptation to distinct ecological niches, which creates barriers to gene flow among populations and so allows lineage divergence [5,16]. In contrast, it is often assumed that the geographical isolation of populations in allopatry enables genetic drift to produce differentiation of sister lineages [17,18]. However, even though allopatric speciation has often been shown to be neutral, it can potentially be associated with ecological divergence [19]—driven by natural selection—when the geographical areas occupied by the diverging lineages show distinct environments [20–22].

Since they are based on different geographical contexts, sympatric speciation is also considered to differ from allopatry because it can directly arise as a consequence of competition for resources [23]. This hypothesis is supported by several models [24], as well as by cases of recently diverging lineages [25,26] and ongoing ecological speciation taking place under natural conditions [27–29]. Over small spatial scales, habitat heterogeneity in the form of climatic variation or differences in community composition may initiate niche shifting in response to competition. Variation in the performance of populations under contrasting environmental conditions may ultimately promote divergence, as shown along a depth gradient in fishes [30].

In contrast to sympatric processes, many events of speciation in allopatry are not associated with any form of ecological divergence [31], resulting in a strong signal of niche conservatism [32,33]. Even when speciation is partly driven by ecological divergence, ecological niche shifts are nonetheless expected to be less frequent than in sympatry. Because physical separation does not allow the differential action of competition, this driver of divergence is absent in allopatry. The alternative view would be that in allopatry we would expect a greater possibility of distinct opportunities available, as we would probably expect both abiotic and biotic conditions to be more distinct than in sympatry.

While ecological niche shifts have been demonstrated in many studies using single-taxon approaches, the examination of the signature of ecological speciation at larger evolutionary and geographical scales remains sparse in the literature [34–37]. This study aims at improving our understanding of the geographical context of speciation associated with ecological niche shifts, at a genus-wide evolutionary level and over large spatial scales.

Using the Palaearctic butterfly genus Pyrgus, we test the hypothesis that the evolutionary rate of the abiotic component of the ecological niche is higher in cases of sympatric versus allopatric speciation. We investigate variation in the evolutionary rates of eight climatic variables for sympatric and allopatric speciation events using increasingly larger delineations of geographical areas (e.g. mountain massifs) and comparing strict and broad definitions of speciation modes to test the influence of geographical areas and speciation modes. We propose an integrative approach combining next-generation sequencing of the mitochondrial genome and ribosomal DNA, with phylogenetic, biogeographic and ecological analyses. We focus on an almost exhaustive sampling of the genus of skipper butterflies Pyrgus (Hübner, 1819; family: Hesperiidae, subfamily: Pyrginae, tribe: Pyrgini; [38]), represented by ca 37 species in the Palaearctic.

Climatic factors associated with temperature and precipitation are critical determinants of ecological niche establishment in insects [39,40]. Thus, it is expected that such ecological factors should be associated with differences in butterfly life-history traits such as growth, development and survival at adult or pre-imaginal stages [41,42]. Whereas biotic components related with host-plant use are typically considered as key ecological factors to explain lineage divergence in phytophagous insects [43,44], this is not the case in Pyrgus butterflies as most of the species are generalists on Potentilla spp. or on various host plants [45].

Here, we targeted all divergent pairs of sister lineages (more than 90% of the Palaearctic taxa being included in our study), that evolved within contrasting geographical contexts, to test if evolutionary rates of the climatic niche are greater when divergence produces lineages with spatially overlapping ranges, than when divergence occurs between regions.

2. Material and methods

(a) Sampling and molecular analyses

(i) Sampling and species determination

Sampling includes 36 of the 37 accepted Pyrgus species in the Palaearctic. Specimens were obtained through either field collection, museums or private donations. Preservation conditions of samples varied according to their origin (see electronic supplementary material, appendix S1, table S1). Specimens from museums or private collections were all dry-pinned and collected between 1929 and 2012. Field samples were collected from 1993 to 2013 and stored in 90% EtOH at −18°C. The sampling was broadened with published and unpublished COI sequences from additional specimens to constitute a final molecular dataset of 132 specimens. For all the studied species, we selected several specimens from different locations across their distribution. Spialia sertorius (Hoffmannsegg, 1804, tribe: Carcharodini), Erynnis montanus (Bremer, 1861, tribe: Erynnini, GenBank accession number KC659955), Eagris sabadius (Gray, 1832, tribe: Tagiadini, GenBank accession number FJ817723), and Celaenorrhinus humbloti (Mabille, 1884, tribe: Celaenorrhinini, GenBank accession number FJ817832.1) were used as outgroup taxa in the phylogenetic reconstructions [46]. Species identification was performed following de Jong [47], based on morphology and upon examination of genitalia when necessary, as Pyrgus species are relatively undifferentiated in overall appearance (figure 1a).

Gathering of species occurrences required for biogeographic and ecological niche analyses was carried out based on exhaustive species’ range distributions. We used all available precise coordinates from the samples included in this study, in combination with occurrence data from the Butterfly Tissue and DNA Collection at the Butterfly Diversity & Evolution laboratory (Institute of Evolutionary Biology, Barcelona, Spain) and from the Global Biodiversity Information Facility [48]. For taxa distributed in Asia, we used all known accessible databases and monographies (see electronic supplementary material, appendix S1, figure S1). We considered only occurrences with a precision of at least 5 km (see below), yielding a total of 16 177 records with a median value of 146 occurrence points per species.

Mitochondrial genomes and nuclear ribosomal DNA regions were analysed in 56 specimens: shotgun libraries for 24 fresh Pyrgus specimens were prepared using the Nextera DNA Sample-Prep-kit (Illumina, San Diego, USA) and further sequenced by Illumina HiSeq 2 × 100 bp (Fasteris, Geneva, Switzerland); 32 Asian and rare European specimens, mostly from private collections, were shotgun-sequenced using a modified version of Tin et al. [51]—the main modification includes a phosphorylation of the denatured extracted DNA allowing the ligation of the same P1 adaptors used for the DMPS technique (electronic supplementary material, appendix S2)—and sequenced by Illumina MiSeq 2 × 300 bp (Lausanne Genomic Technologies Facility, Switzerland). As historical shotgun libraries are more prone to sequencing errors (historical samples are usually characterized by fragmented DNA, i.e. less than 300 bp [52,53]), a large overlap between paired sequenced reads allowed for more accurate sequence assessment.

Both shotgun and DMPS raw reads were demultiplexed and cleaned with Trimmomatic [54]. DMPS libraries were assembled with MIRA 4 [55], using the complete mitochondrial genome of the closely related species Erynnis montanus (GenBank accession number KC659955) as a reference. For each specimen, contigs were aligned independently using CLUSTAL W [56]. Alignments were merged together with COI sequences by pairwise profile alignment and manually checked with BioEdit 7.0.4.1 [57]. For HiSeq and MiSeq shotgun libraries, de novo assemblies were performed using SPAdes [58]. The nuclear reference required for alignment was built using the longest SPAdes contigs that matched along ribosomal DNA using BLAST [59]; the latter included 18S ITS1, 5.8S, ITS2 and 28S regions. The Erynnis montanus complete mitochondrial genome served again as reference for mitochondrial alignment. Contigs were individually mapped against references using the Geneious mapper algorithm [60] with the medium sensitivity option set with two iterations. Multiple alignment was performed using MAFFT 7.017 [61] with the G-INS-I algorithm, 200PAM/k = 2 scoring matrix, gap open penalty of 1.53 and offset value of 0.123.

(b) Genome annotation

The mitochondrial genomes as well as the nuclear ribosomal DNA regions were aligned and subsequently annotated using respectively the complete and fully annotated sequences of the mitochondrial genome (mtDNA) of Erynnis montanus (GenBank number KC659955) and the nuclear ribosomal DNA region of Papilio xuthus (GenBank number AB674749] as references. The alignment and annotation of the mitochondrial and nuclear DNA regions were done in Geneious v. 8.1.7 (Biomatters, Auckland, New Zealand) using MUSCLE [62] to conduct the alignments and the implemented Annotate and Predict functions (a similarity threshold of 80% was applied to predict and transfer the annotations). The annotations were subsequently manually checked and the reading frames adjusted if necessary. The full matrices comprising shotgun-based rDNA and mtDNA—also encompassing DMPS and COI sequences—were produced by pairwise profile alignment using ClustalX [63] and manually corrected with BioEdit 7.0.4.1 [57]. Characters of phylogenetic relevance were selected using BMGE 1.1 [64], yielding a matrix of 20 508 nucleotides—6174 bp and 14 334 bp for the ribosomal DNA and mitochondrial genome, respectively. The two aligned and fully partitioned supermatrices (mitochondrial and nuclear alignments) were concatenated and used to conduct the phylogenetic inferences.

(c) Phylogenetic and dating inferences

The accession of Erynnis montanus was used as the most external outgroup taxon. Partitioned maximum-likelihood (ML) analyses were conducted using RAxML [65,66]. We used the RAxML-HPC v.8 tool implemented on the CIPRES portal (http://www.phylo.org/). Analyses were fully partitioned by providing files containing the set of partitions and were run using the GTRCAT model applied to each partition, as well as 25 rate categories as suggested by Stamatakis [65]. Node support was performed using aBayes calculation in PhyML [67,68], for which the GTR + G analysis on one single partition produced a topology identical to the one produced by RAxML. Finally, because the biogeographic and ecological niche analyses require an ultrametric phylogenetic tree [69], we first performed a dated phylogeny using BEAST 1.8.3 (http://beast.bio.ed.ac.uk/) based on the COI alignment, by applying a classic mitochondrial DNA molecular clock of 2.3% divergence per million year [70]; second the inferred root date was used on the combined partitioned best RAxML tree using a penalized likelihood approach, as implemented in the R ape package [71]. The smoothing value (λ = 100) was established using the cross-validation routine implemented in the ape package and the relaxed model of substitution rate variation among branches was applied (see ape function chronos).

(d) Biogeographic inferences

In this study, the biogeographic scenario of Palaearctic species of Pyrgus will serve as a basis to estimate and compare the evolutionary rates between splits in allopatry versus sympatry along the climatic niche. To infer the biogeographic history of Palaearctic Pyrgus species, we defined biogeographic areas based on natural geographical boundaries and the distribution of the genus as follows.

In order to examine putative effects of the spatial scale at which sympatry is defined, we considered three biogeographic divisions of the Palaearctic, ranging from narrow to broad area definitions (electronic supplementary material, appendix S1, figure S1). We first used 11 areas that were defined as follows: (A) North Africa and the Iberian peninsula (including the Pyrenees), (B) Western Europe including the Alps and surrounding territories limited by the North Sea, the Balkans and including the Italian pre-Alps, (C) Italy, Sardinia, Corsica and Sicily, (D) Eastern Europe including the Balkans and the Carpathians, (E) Anatolia, the Caucasus, Crimea, Southwestern Russia and Mesopotamia, (F) Ural Mountains, (G) Fennoscandinavia and Baltic regions, (H) Central Asia as well as Altai and Cashmere mountain massifs, (I) Central Himalayan range, Eastern Tibet and Sichuan (China), (K) Western Manchuria (China), Mongolia and Siberia (Russia), and (J) South-East Russia and Japan. A finer partitioning resulted in 15 divisions of the Palaearctic, while the broader delineation recasts areas into seven regions. For a detailed description of area partitioning in 15 and seven areas, please refer to the electronic supplementary material, appendix S1, figure S1.

Biogeographic scenarios were inferred using the dispersal–extinction–cladogenesis (DEC) likelihood model [72,73], extended from dispersal–vicariance analysis [74]. The inference of ancestral areas along a phylogenetic tree considers two processes: (i) anagenetic (internode) evolution and (ii) cladogenetic range evolution (at nodes) [73]. Anagenetic evolution is governed by a transition matrix (Q-matrix). Q-matrices, designed independently for each biogeographic division of the Palaearctic, were adjusted to reflect area connectivity by assigning high dispersal probabilities to 100 (=1) when areas were adjacent and lower dispersal probabilities to 10−3 (=0.001) when they were not, with a few cases where they were set to 10−1.5 (= 0.032) for intermediate cases where areas were not strictly adjacent (electronic supplementary material, appendix S1, table S2). We did not consider the intraspecific level because coalescent processes may prevent lineage sorting and the formation of well-segregated groups. Cladogenetic range processes at each node were assessed by performing DEC analysis. Analyses were performed using the RASP software [75] providing the ancestral area and its probability of assignment at each node.

Based on the DEC ancestral areas reconstructions and the corresponding events route, we attributed speciation modes (i.e. sympatry versus allopatry) for each node leading to pairs of sister species (i.e. at the level of the most recent common ancestors of these sister lineages). We firstly applied a broad definition of sympatry, including dispersal events during the subsequent anagenetic process and cases of more complex geographical context (e.g. divergence events implying more than one common area between sister species). Following Buerki et al. [76], instances of peripheral isolation were registered under allopatry. A more stringent definition of sympatry was also applied by selecting only the nodes unambiguously showing a split either in sympatry or in allopatry. Climatic niche analyses were conducted for broad and strict definitions of sympatry and for the three partitioning types of biogeographic areas.

(e) Estimation and comparison of evolutionary rates along climatic dimensions of the ecological niche

Based on species occurrences gathered from all known records of Pyrgus, we extracted climatic data using environmental layers from Hijmans et al. [77] on http://www.worldclim.org, with a grid size of 5 arc-minutes resolution. We selected eight BioClim variables among the 19 available. Related to temperature and precipitation, these variables were chosen according to their ability to describe basic general features of the environment important for butterfly ecology [78,79]. Temperature was represented by the annual mean temperature (Bio1), the maximal temperature of the warmest month (Bio5), the minimal temperature of the coldest month (Bio6), the mean temperature of the warmest quarter (Bio10) and the mean temperature of the coldest quarter (Bio11). Precipitation was described by the annual precipitation (Bio12), the precipitation of the warmest quarter (Bio18) and the precipitation of the coldest quarter (Bio19).

Note that we used here a species-based rather than a tip-based approach (i.e. with splits of interest selected upstream of the intraspecific level) to comply with requirements of algorithms that compute evolutionary rates along phylogenies [80] and to increase the sampling for a more accurate description of the ecological niche. Following climatic data extraction of specimen localities, we established ecological-niche evolutionary rates using phylogenies with one single-tip per species, each of them summarized by mean values for each of the eight BioClim variables. To account for branch-length uncertainty, we randomly sampled from the dated phylogenetic tree a total of 100 single-tipped-species trees with identical topology but different relative branch lengths.

We compared evolutionary rates for climatic factors (each of the eight BioClim variables) along branches of those 100 trees using a maximum-likelihood method implemented in the brownieREML function of the Phytools R package [81]. This function uses restricted maximum likelihood to fit the Brownian rate variation model [69] in order to compare different rates of continuous trait evolution and test whether a single-rate model is favoured over a multiple rates model that allows multistate trait evolution. In the present analysis, the alternative model considers two evolutionary rates expected to differ between pairs of sister species that evolved in sympatry versus allopatry, as previously registered according to the biogeographic analyses. The multiple continuous traits of evolution correspond in this case to differential evolutionary rates of ecological niche and are reflected by the different climatic factors. Log-likelihood values of the estimated parameters were used to calculate the Akaike information criterion (AIC), which allowed selection of the best model (i.e. single versus multiple rate model). When AIC indicated a significantly better fit of the multiple versus simple model to the data (p value < 0.05), we extracted the values of the estimated evolutionary rates (sigma2) and identified rate shifts for both speciation modes (i.e. sympatry versus allopatry). Significance levels were assessed by randomly assigning allopatric versus sympatric speciation modes to the pairs of sister species and repeating the Brownie analysis, in order to generate evolutionary rates expected under a null-hypothesis. Based on the results obtained in those randomized datasets, the ratio of evolutionary rates in sympatry over allopatry was computed and compared to the ratio of the empirical estimates using Wilcoxon sign-rank tests and applying Bonferroni correction to account for multiple testing.

To test whether there was significant climatic variation between biogeographic areas we performed a principal component analysis (PCA) on Bioclim extracted climatic data with the ade4 R package [82] and then compared the differences among the means on each of the two first PCA axes using one-way ANOVA and post-hoc Tukey test, with the biogeographic region as a grouping factor.

3. Results

(a) Alignment, annotation and phylogenetic inferences

All of the protein-coding genes (13) and rRNA regions (two: 12S and 16S) of the mtDNA genome were included in the supermatrix, but we only managed to successfully annotate 12 of the 22 tRNA regions (54.5%). The aligned nuclear ribosomal DNA matrix included the following five partitions: 28S rRNA, ITS1, 5.8S rRNA, ITS2 and 18S rRNA. The combined supermatrix thus contained 32 partitions and 20 508 characters (14 334 from the mtDNA genome and 6174 from the nuclear ribosomal region; see below). We did not find any incongruence between the mtDNA and ribosomal DNA topologies (when considering aBayes supports higher than 0.7); however, the combined partitioned inference mainly reflects the information provided by the mtDNA genome, since the inference from the nuclear ribosomal DNA region shows little support (data not shown).

We obtained mtDNA and rDNA sequences for 56 of the 58 shotgun-sequenced specimens—assemblies were not satisfactory for the representatives of P. alpinus and P. centaureae. Our dataset therefore comprised samples from 34 of the 37 Palaearctic species described and consisted of 132 specimens (including five outgroups), with variable sequence lengths ranging from 121 bp to 20 508 bp. DMPS of historical and fresh samples yielded sequences from 121 bp to 7124 bp, MiSeq shotgun libraries of historical and fresh samples from 542 bp to 15 597 bp and HiSeq shotgun libraries of fresh samples from 19 397 bp to 20 508 bp. The median number of nucleotides analysed per specimen (excluding those for which only COI was available) was 11 744.

The phylogeny largely retrieves relationships described by de Jong [47]. The monophyly of all species was supported with aBayes support higher than 0.7 and 30 among the 52 nodes describing among-species relationships within Pyrgus (i.e. 58%) exhibited support higher than 0.95 (figure 1b). Exceptions are: (i) nine species represented by a single specimen, (ii) seven taxa belonging to the P. alveus complex, a group with debated species circumscriptions, and (iii) P. melotis, a species sharing an identical COI sequence with specimens from P. malvae. The position of the outgroups was confirmed with a support of 0.87 and the dating analysis estimated the origin of the Pyrgus genus at 4.04 million years ago (see electronic supplementary material, appendix S1, figure S2).

(b) Biogeographic inferences

The Lagrange biogeographic scenario that takes into account phylogenetic and branch length uncertainty is given in the electronic supplementary material, appendix S1, table S3. Both allopatric and sympatric nodes used in this study were relatively well distributed across the tree depth. Attribution of speciation modes based on the broad definition of sympatry results in 16, 18 and 16 speciation events in sympatry and 10, 8 and 10 in allopatry for the partitioning of areas into seven, 11 and 15 regions, respectively (see an illustration of speciation mode assignment in figure 1b). Sympatric speciation events under the strict definition were inferred at 12, seven and six nodes and allopatric speciation was invoked at four, four and four nodes for biogeographic divisions into seven, 11 and 15 regions, respectively (electronic supplementary material, appendix S1, table S3).

(c) Estimates and comparison of ecological niche evolutionary rates

Based on the AIC criterion, analyses of the evolutionary rate of the climatic niche made with the brownieREML R function significantly favoured the multiple model that considers different rates of evolution to the single Brownian-motion model. Results of the comparison of the evolutionary rates are shown in table 1, for both definitions of speciation mode assignment and the three definitions of biogeographic areas. When evolutionary rates are inferred from area partitioning that considers 11 or 15 regions, the outcome is consistent across the majority of the climatic variables individually tested and exhibits a significant p-value in most analyses. When using the strict assignment of sympatry, the multiple model applied on Bio6 and Bio11 was not significantly different from the single model. Bio5 and Bio10 appeared to have higher evolutionary rates in allopatry than in sympatry. When using seven areas in biogeographic inferences, evolutionary rates of the ecological niche were always higher for the case of allopatric speciation.

Summary of estimates and comparisons of the evolutionary rates of climatic niche. Results of Brownie analyses are shown for each of the eight BioClim climatic variables considered each biogeographic area definition of the Palaearctic (i.e. seven, 11 and 15 regions), and both broad and strict speciation mode assignments. The p-values of the Wilcoxon sign-rank test that compare if observed rate estimates are significantly different from random rate estimates, are shown after multiple testing correction. Evolutionary rates of ecological niche ± s.e. after log transformation are indicated when the p-value of the Wilcoxon test is significant. The mode of speciation with the higher rate is indicated with an italic evolutionary rate estimate. This analysis was performed for all nodes, irrespective of their probability of ancestral area(s); the number of nodes used is indicated.

Climatic differences between the biogeographic areas are illustrated in electronic supplementary material, appendix S1, figure S3. The two first axes of the PCA explained 60.6% and 30.0% of the variance, respectively. The first axis was most strongly correlated with the following variables: Bio1 (r = 0.896), Bio5 (r = 0.934) and Bio10 (r = 0.946). The second axis was mostly explained by Bio6 (r = 0.848) and Bio11 (r = 0.785). Results are given for the three definitions of biogeographic areas for the first PCA axis. The percentages of significant p-values from Tukey tests for pairwise comparisons of means were 42.9%, 60.0% and 72.4% for the division into seven, 11 and 15 areas, respectively.

4. Discussion

The signature of shifts along abiotic ecological dimensions during lineage divergence has largely been left unexplored at the macro-evolutionary scale (but see [34–37]). Ecological niche shift has been shown to be at work in a large number of cases of ongoing speciation. Alternatively, ecological niche assortment—an extension of Losos’ size assortment concept [83] to the multidimensional ecological niche—could also allow lineages to successfully colonize contrasted habitats in sympatry. Despite the fact that our experimental design does not investigate potential consequences of ecological niche assortment at the community level, it demonstrates that the footprint of ecological niche differentiation is visible at a wide phylogenetic scale.

Here, we provide evidence, based on the analysis of Palaearctic Pyrgus butterflies, that the evolutionary rate of abiotic dimensions of the ecological niche is usually larger when lineages diverge in regional sympatry than in allopatry, at least when defining relatively narrow geographical areas at which sympatric versus allopatric speciation processes act (see table 1); an outcome that fits expectations of the competitive displacement theory [8] or the fitness advantage of specialization to distinct conditions [84]. However, our results point to the fact that the scale employed to define geographical areas for producing biogeographic scenarios may have a strong impact on the results. When considering broad area definitions (as in the case of our seven-areas dataset), allopatric events show a higher rate of evolution of the climatic niche than sympatric events. This is actually not surprising, as broader areas would less finely translate real geographical barriers. This may be confirmed by the low percentage of significant p-values in the Tukey tests for multiple means comparisons of climatic components among areas (42.9%), which is the lowest among all three datasets, indicating that with a seven-area definition, areas are less dissimilar climatically than when applying a more narrow definition (see electronic supplementary material, appendix S1, figure S3a). As a consequence, speciation events that would be classified as allopatric under a narrower definition of areas, would in that case be recorded as sympatric. Because branches of those splits usually have lower rates of evolution, an incorrect assignment will decrease the average values of evolutionary rates for sympatric speciation events, and blur the distinction between the two modes.

Not only does the number of areas considered have an impact on the biogeographic scenario at work, but so does the way we define sympatry. Indeed, considering a strict definition of sympatry leads to a strong decrease in the number of nodes of interest, which may then produce erroneous results because of the stochasticity associated with small numbers of observations. Although sympatric speciation was still associated, generally, with a higher evolutionary rate of the climatic niche for both the 11-area and 15-area datasets, evolutionary rate estimates were higher in allopatry for one out of eight climatic variables for the 11-area dataset. We thus suggest that the definition of biogeographic areas and the number of nodes taken into account should be considered with caution to guarantee the robustness of the analyses and the subsequent interpretation of results. We therefore recommend that studies aiming at applying the same methods to other biota should (i) favour a fine geographical division of biogeographic areas consistent with dispersal characteristics of the study species and the geographical barriers of the study area and (ii) use phylogenies with a high number of nodes in order to reduce stochasticity and, eventually, to restrict node selection to those that display high phylogenetic support and high probability of ancestral areas assignment. In addition, biogeographic inference methods using more than a single input phylogenetic tree (e.g. Bayes–Lagrange, S-DIVA as implemented in RASP [75]) may account for uncertainty in topologies, branch lengths, and therefore in ancestral areas assignment.

Notwithstanding with those potential limitations, the general pattern of our result is in line with a large number of specific case-studies [25–28,85] but also evidences the substantial role played by niche shifting in sympatry at the macro-evolutionary level. At a wider taxonomical scale, it also complements the findings from one of the few works on diversification of lepidopteran lineages, in which allopatric divergence without niche shift is shown to be commonly underestimated [86]. Although one might consider that allopatric cladogenesis may require niche shifts to accommodate the contrasting climatic conditions found in the occupied areas (see electronic supplementary material, appendix S1, figure S3b and S3c), these results reveal the importance of abiotic niche differentiation in sympatric speciation.

This outcome contrasts with recent work on the geographical context of plant speciation, which has acknowledged that ecological divergence did not vary as a function of range overlap between sister lineages [23,87]. On the contrary, our hypothesis is particularly consistent with conclusions drawn by Bush & Smith [88] in a study where they evidence the role of competition in sympatry as a leading force for niche shifting. We suggest that similar mechanisms played an important role in shaping the evolutionary history of Pyrgus butterflies by promoting high rates of niche shifting in sympatric sister species. Our results represent compelling evidence that the footprints of micro-evolutionary processes driven by abiotic ecological components during sympatric versus allopatric speciation are reflected at the macro-evolutionary scale.

Data accessibility

Authors' contributions

C.P., L.P., S.B. and N.Al. elaborated the study, analysed the data and wrote the manuscript. N.Ar. led sequence assembly and alignment. T.S., A.M.-Y. and L.F. designed wet laboratory procedures. R.V., V.D., J.H.-R., E.B., Y.C. and I.K. provided samples and data. All authors contributed in the form of discussions and suggestions, and approved the final manuscript.

Competing interests

The authors declare no competing interests.

Funding

This work was supported by an SNSF Professorship PP00P3_144870 awarded to N.A., by grants CGL2013-48277-P, PR2015-00305 (MINECO) and CGL2016-76322-P (AEI/FEDER, UE) to R.V. and a Marie Curie International Outgoing Fellowship within the 7th European Community Framework Programme to V.D. (project no. 625997).

Acknowledgements

We are grateful to Catherine Berney for advice on molecular issues, to Yves Gonseth and Brent Emerson for insightful discussions, to the Lausanne Genomic Technologies Facility and Fasteris SA for the sequencing job, to Jindrich Majer, Jana Slancarova and Zdenek Fric for providing data and literature sources, to Russell Naisbit from Albion Science Editing for correcting and editing the manuscript, and to three anonymous reviewers for constructive comments on a previous version of this manuscript.