This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.

Abstract

Comparative population studies can help elucidate the influence of historical events upon current patterns of biodiversity among taxa that coexist in a given geographic area. In particular, comparative assessments derived from population genetics and coalescent theory have been used to investigate population dynamics of bacterial pathogens in order to understand disease epidemics. In contrast, and despite the ecological relevance of non-host associated and naturally occurring bacteria, there is little understanding of the processes determining their diversity. Here we analyzed the patterns of genetic diversity in coexisting populations of three genera of bacteria (Bacillus, Exiguobacterium, and Pseudomonas) that are abundant in the aquatic systems of the Cuatro Cienegas Basin, Mexico. We tested the hypothesis that a common habitat leaves a signature upon the genetic variation present in bacterial populations, independent of phylogenetic relationships. We used multilocus markers to assess genetic diversity and (1) performed comparative phylogenetic analyses, (2) described the genetic structure of bacterial populations, (3) calculated descriptive parameters of genetic diversity, (4) performed neutrality tests, and (5) conducted coalescent-based historical reconstructions. Our results show a trend of synchronic expansions across most populations independent of both lineage and sampling site. Thus, we provide empirical evidence supporting the analysis of coexisting bacterial lineages in natural environments to advance our understanding of bacterial evolution beyond medical or health-related microbes.

It is clear that patterns of biodiversity across phylogenetically distinct taxa may be influenced by shared historical factors (Chan, Schanzenbach & Hickerson, 2014; Hope et al., 2014). Despite the importance of microorganisms in ecosystems (Allison & Martiny , 2008; Strickland et al., 2009), the scale of the impact of historical factors remains poorly understood and little is known about the population dynamics of natural microbial populations, which has greatly hindered our understanding of the processes determining diversity. Most studies on microbial population dynamics have analyzed demographic patterns of bacterial pathogens, undertaken primarily to understand disease epidemics and population expansion events of human pathogens (Pybus et al., 2001; Wirth et al., 2007; Tazi et al., 2010). Studies of demographic trends in natural bacterial populations have been scant (Guttman, Morgan & Wang, 2008), due in part to limited sampling of populations from different lineages at similar temporal and geographic scales.

In this context, the Cuatro Cienegas Basin (CCB) harbors high levels of bacterial diversity, arguably due to environmental variation (e.g., salinity) across the aquatic systems within the basin (Cerritos et al., 2011). However, there is no conclusive evidence that identifies environmental conditions or geographic distance as predictors for the presence of certain bacterial groups (Escalante et al., 2008; Rebollar et al., 2012). Thus the influence of shared historical factors in population dynamics could be a plausible explanation for the observed diversity in CCB. A collection of coexisting bacterial isolates from the aquatic systems in this area has been built for over a decade (Cerritos et al., 2011; Rebollar et al., 2012; Rodríguez-Verdugo et al., 2012) and represents a unique opportunity to investigate the historical population patterns of coexisting bacterial lineages in a natural setting.

In the present work, we tested the hypothesis that a shared history in the CCB region has left common genetic signatures across phylogenetically diverse microbial populations. Using population genetics and coalescent-based approaches, we assessed the population history of lineages from two closely related genera of Firmicutes, Bacillus and Exiguobacterium, as well as lineages of Gammaproteobacteria from the genus Pseudomonas. The selected lineages are all common in the natural setting of the CCB aquatic system, and we anticipated that the genetic variation present among coexisting lineages would reveal a signature of common historical dynamics independent of phylogenetic relationships. Multilocus sequence typing (MLST) markers were used to (1) perform comparative phylogenetic analyses, (2) describe the genetic structure of bacterial populations, (3) calculate descriptive parameters of genetic diversity, (4) perform neutrality tests, and (5) conduct historical demography reconstructions.

Consistent with our hypothesis, we identified a common trend of expansion in the studied populations over a similar time frame that occurred independently of phylogenetic relationships. These results provide empirical evidence that analyzing coexisting bacterial lineages in natural environments can advance our understanding of bacterial evolution, beyond medical or health-related species.

Materials & Methods

Study sites and sampling

Surface water and sediment samples were collected between 2003 and 2009 (Rodríguez-Verdugo et al., 2012; Rebollar et al., 2012) from four aquatic systems within the CCB: Churince (C), Los Hundidos (H), Mesquites (M), and Pozas Azules (Pa). Pseudomonas isolates were collected only from C. All isolates were obtained using previously described methods (Rebollar et al., 2012). All samples were collected under the “Vida Silvestre” permit 0531 FAUT-0230 granted by the Mexican government agency: “Comisión Nacional de Áreas Protegidas” (CONANP).

Water and sediment samples were diluted at 1:10; 1:100; 1:1,000; and 1:10,000 using saline solution (1% NaCl). Subsequently, 200 µl of each dilution was plated using either marine agar (Difco 2216) for Firmicutes or GSP agar (Kielwein, 1971) for Pseudomonas. Firmicutes isolates were incubated for 24 h at 37 °C; Bacillus and Exiguobacterium colonies were pale or bright orange, respectively. All orange colonies were purified by single-colony isolation. Pseudomonas isolates were incubated for 48 h at 30 °C. Pseudomonas colonies were selected based on a change in the color of the GSP media to purple. All isolates were stored at −80 °C in 20% (w/v) glycerol.

Phylogenetic reconstructions

The complete sequences of the 16S rRNA gene and partial sequences of the MLST genes were edited and aligned using BioEdit (Hall, 1999). Representative 16S rRNA sequences for the three genera were obtained from GenBank and were included in the alignments as references (see Figs. S1A–S1C for accession numbers).

For all genera, maximum likelihood (ML) phylogenies were constructed using (i) 16S rRNA sequences and (ii) concatenated alignments of MLST sequences. The program jModelTest v.2.1.3 was used to determine the best nucleotide substitution model for each alignment (Posada, 2008; Darriba et al., 2012). For the 16S rRNA gene, TPM2uf+I+G, TIM1+I+G and GTR+I+G were the substitution models selected for Bacillus, Exiguobacterium and Pseudomonas, respectively. GTR+I+G, TIM2+I+G and GTR+I+G were the models selected for the Bacillus, Exiguobacterium, and Pseudomonas concatenated MLST loci alignments, respectively. Phylogenetic relationships were reconstructed using PhyML v.3.0 (Guindon & Gascuel, 2003), with the corresponding DNA substitution models, tree improvement was carried out by Subtree Pruning and Regrafting, and branch support was evaluated by 1,000 bootstrap pseudo-replicates.

Population structure

We defined populations as the minimal study unit in our collection. Populations have to be defined considering both geographic and genetic criteria. Given that it is possible to find populations (genetic pools) that are not site-restricted we investigated population structure by genetic differentiation, which is defined as changes in the frequency distribution of haplotype variants among subpopulations (Hartl & Clark, 1997). We estimated pairwise FST values among sampling sites within lineages. With this approach we defined a single population as all isolates from a lineage that exhibited no significant genetic differentiation as measured by pairwise FST estimates. Pairwise FST estimates were obtained for each monophyletic lineage across all sampling sites using Arlequin 3.5 (Excoffier & Lischer, 2010), with 1,000 iterations. For the Pseudomonas isolates, FST analysis was not performed since all isolates were from the same sampling site.

The FST approach implicitly incorporates geographic location and genetic criteria in the investigation of population structure. However, it is possible that genetic structure exists for each lineage when looking at individual genetic variants within each population. Thus, we investigated potential substructure within the groups defined by FST by taking a Bayesian approach implemented in BAPS 6 (Tang et al., 2009). The Bayesian analysis of population structure was performed using the option for linked loci, specifically developed for MLST data (Tang et al., 2009). A maximum number of clusters (K) was set to ten, or equal to the number of individuals if these were fewer than ten. Each analysis was replicated ten times.

Once populations were defined, standard measures of nucleotide diversity (π) and the mutation parameter Watterson’s θ were estimated, together with neutrality tests (Tajima’s D, Fu and Li’s F* and D* and Fu’s FS) using DNAsp v4.1 (Rozas et al., 2003). Details of these calculations can be found in the Supplemental Information.

Historical population dynamics

We performed an Extended Bayesian Skyline Plot (EBSP) analysis as implemented in Beast v.1.7.5 (Drummond & Rambaut, 2007; Heled & Drummond, 2008). The EBSP infers past population dynamics from a sample of contemporary sequences, taking into account the genealogical stochasticity of the coalescent (Ho & Shapiro, 2011; Drummond et al., 2005). Additionally, this method does not depend on a specific a priori demographic model, but infers the number of population size changes directly from the data. As a result, it provides a measure of statistical credibility of the inferred number of population size changes compared to the alternative of constant population size (Heled & Drummond, 2008).

For the EBSP analysis, we used all MLST genes, a strict molecular clock, and the same nucleotide substitution models used for phylogenetic reconstruction to estimate changes in population size for each genetically homogeneous population. The substitution rate used for the two Firmicutes genera (Bacillus and Exiguobacterium) was 7 × 10−9 substitutions/per site/generation, obtained from a dated phylogenetic tree (Maughan, 2007). The substitution rate used for Pseudomonas was 2.8 × 10−8 substitutions/per site/generation, based on an experimental evolution study of P. fluorescens (Barrett, MacLean & Bell, 2006). All time estimates obtained were expressed in number of generations. Changes in population size were expressed as a function of the product of Ne and the generation time (Ne∗t). All analyses were run for 50–100 million generations, until adequate mixing was achieved. Ten percent of burn-in was removed and the sampling was done every 1,000 chains. The rest of the parameters were set according to the guidelines of Heled & Drummond (2008). Results were analyzed with Tracer v.1.5 and LogCombiner v.1.7.5 (Drummond & Rambaut, 2007).

Results

To investigate the influence of a common habitat in the genetic variation of bacterial populations, we analyzed MLST data from a collection of isolates across three bacterial genera (Bacillus, Exiguobacterium and Pseudomonas) sampled from water bodies in the geographic region of CCB, Mexico.

The phylogenetic history of CCB lineages

Phylogenetic reconstructions based on MLST and 16S rRNA sequences were used to assign isolates to monophyletic lineages (Figs. 1 and S1 respectively). MLST phylogenies identified at least two lineages within each genus. We named the lineages according to genus: “B” for Bacillus, “E” for Exiguobacterium, and “P” for Pseudomonas.

Phylogenies were constructed using the maximum likelihood approach and are based on concatenated sequences of at least four MLST loci. (A) Bacillus lineages with B. marisflavi as outgroup, and citC, gltX, hsp70, recA and spo0A loci, (B) Exiguobacterium lineages with E. auranticum as outgroup, and citC, hsp70, recA and rpoB loci, and (C) Pseudomonas lineages with Escherichia coli as outgroup, and acnB, gyrB, recA and rpoD loci. The scale of the bar represents the number of substitutions per site.

Among Bacillus isolates, we identified two clusters, B1 and B2 (Fig. 1A); their closest relatives were B. aquimaris, B. vietnamensis, B. marisflavi, and B. coahuilensis (Fig. S1A). Exiguobacterium isolates clustered into three well-defined groups: E1, E2, and E3 (Fig. 1B). For clusters E1 and E2, the most closely related species was E. aurantiacum, and E. profundum was the closest relative of cluster E3. Pseudomonas isolates were divided into three groups: P1, P2, and P3 (Fig. 1C). P1 included the majority of isolates and, according to 16S rRNA sequences, was closely related to P. oryzihabitans. P2 isolates were closely related to P. otitidis and P. aeruginosa, while P3 had recently been described as a new species, P. cuatrocienegasensis (Escalante et al., 2009) (Fig. S1C).

Population structure

Populations defined by pairwise FST values corresponded with sampling-site pairs with non-significant FST values (Table S3). We named the resulting populations indicating lineage and sampling site as: B1_C, B1_HPa, B1_M, B2_C, B2_MPa, E1_CHPa, E1_M, E2_CH, E2_M, E3_CM, P1_C, P2_C, and P3_C. Further exploration of structure within lineages was performed with BAPS and showed clusters that are shared among sites within lineages. In all cases, differences in frequency distribution of haplotypes (clusters) correspond with the population structure that we identified through pairwise FST (Fig. 2). Therefore, we delimited populations based on FST estimates since it incorporates both the geographic and genetic components in the definition of populations.

The analysis was conducted for each studied lineage within each genus (Bacillus = B1, B2; Exiguobacterium = E1, E2, E3; Pseudomonas = P1, P2, P3) using the option for linked loci. The maximum number of clusters (K) was set to 10, or equal to the number of individuals if these were fewer than 10. Each analysis was replicated 10 times. The columns represent a single multilocus genotype and are identified with distinctive colors corresponding to different genetic clusters. Different populations are graphically denoted with a dark and wide line between populations with significant FST values, which are also shown.

Population history: population genetics and coalescent approaches

Results of the genetic diversity analyses and neutrality tests (including Fu’s Fs; Fu, 1997) are presented in Table S4. Neutrality tests suggested expansion events but were inconclusive. In our data set, most values for Tajima’s D and Fu & Li’s tests were negative, although non-significant, which is suggestive of expansion events of populations. However, Fu’s Fs, a test that was explicitly designed to evaluate population expansions, was not significant in most cases. Overall, these classic population genetics estimates did not allow for strong conclusions about historical population events.

To further explore the influence of historical processes in shaping the diversity of these bacterial populations, we used a coalescent approach. Extended Bayesian Skyline Plot (EBSP) analyses of populations (genetic pools) showed signals of expansion in 9 of the 13 populations, of which 4 have statistical support for at least one change in population size (Fig. 3; for statistical support values of the population size changes see Table S4). Within Bacillus, an expansion (although not statistically supported) was detected in one population (B1_HPa) that included most B1 isolates. Plots for the other two B1 populations (B1_M and B1_C) showed constant population size across generations (Fig. 3). The B2 plots showed statistical support for change in population size in at least one population (B2_MPa) and constant population size in another (B2_C). Within Exiguobacterium, there was evidence of expansion in one of two E1 populations (E1_M), in one of the two E2 populations (E2_CH) and a trend of expansion (not statistically supported) in the only E3 population (E3_CM). Finally, the plots for all Pseudomonas populations displayed signals of expansion, although only P2_C was statistically supported. Overall, a trend of expansion was observed for the studied populations independent of phylogenetic relationships or sampling sites. Moreover, the skyline plots revealed a common time frame for the start of expansions (20,000 to 30,000 generations ago, Fig. 3), regardless of lineage- or locus-specific parameters.

When analyzing bacterial populations caution should be taken in the delimitation of populations, which has to consider key aspects such as the degree of genetic and ecological diversity. It is also important to consider the molecular resolution at which populations are defined (Kopac & Cohan, 2011). The more variable loci studied, the more populations may be identified. In our case, few and highly conserved loci may result in a coarse population delimitation. Nonetheless, to include both genetic and ecological aspects in the delimitation of populations, we defined individual populations for each lineage by analyzing genetic structure across sampling sites. The analysis showed that populations defined by the FST statistic include different sampling sites with contrasting environmental conditions (Rebollar et al., 2012), which is in accordance with previous work where no evidence of geographic or environmental barriers to dispersal exists (Escalante et al., 2008; Rebollar et al., 2012). We explored the existence of subpopulations within the FST-defined populations following a Bayesian approach implemented in BAPS (Tang et al., 2009), and we found more clusters that subdivided even further the populations defined with FST estimates. However, genetic diversity within the identified populations (π = 0.00032–0.04698 and Watterson’s θ = 0.00073–0.047; Table S4) is in the range of values reported for natural bacterial populations (Roberts & Cohan, 1995; Vos & Velicer, 2006), meaning that the observed diversity is within the range of normal polymorphism in bacterial populations. Overall, FST, BAPS, and diversity estimates support our population delimitation and indicate that we are not including unusually divergent individuals into the studied populations which might confound the population genetics and expansion estimations. Given that classic population genetics estimates did not allow for strong conclusions about historical population events, we used a coalescent approach to further explore the influence of historical processes. From this approach, we found evidence showing that common historical events were population expansions that occurred in 9 out of 13 populations (Fig. 3). Even more, it is noteworthy that at least one population of each genus presented evidence of expansion that was statistically supported. In clonal and free-living populations of bacteria, evolutionary models predict that populations experience strong fluctuations, either by selective sweeps or by metapopulation dynamics (Fraser et al., 2009; Kopac & Cohan, 2011; Shapiro & Polz, 2014). The results obtained in this study can be explained by these models since the analyzed lineages come from a natural setting and present no evidence of recombination (Table S2; Roberts & Cohan, 1995; Spiers, Buckling & Rainey, 2000; Rebollar et al., 2012). Without recombination, clonal clusters emerge by random mutation, but can quickly drift to extinction unless they have a major selective advantage, in which case their presence could suggest the emergence of new ecologically distinct populations (Shapiro & Polz, 2014) or ecotypes (Kopac & Cohan, 2011). Nonetheless, given that we do not have conclusive evidence that identifies environmental conditions or geographic distance as predictors for the presence of certain bacterial groups or populations (Escalante et al., 2008; Rebollar et al., 2012), the influence of shared historical factors in population dynamics could be a plausible explanation for the observed population expansion events in CCB. Given that we used just a few loci to delimit the studied populations, it should be noted here that we cannot rule out the possibility that the observed expansions may be a reflection of the emergence of closely related populations (undetectable to our molecular markers), and not population expansions in size. This ambiguity could be resolved if more variable loci, even complete genomes, are analyzed. Despite the alternative interpretations of the results, we consider notable the fact that we identified the same type of events (either diversification or population expansions) in most populations in the same time frame.

Population expansions in our data were identified to occur during the same time frame (≅20,000 generations ago). Our estimation of time to the most recent population change is calculated in number of generations. Although generation times may vary considerably among bacterial lineages (Mason, 1934; Errington, Powell & Thompson, 1965), laboratory observations of growth rates of the lineages included in this study do not suggest major differences in generation times between them, which supports the synchrony argument. In this study, we used previously reported estimates for Firmicutes (Bacillus and Exiguobacterium) and Pseudomonas to parameterize the coalescent simulations. We used the same substitution rate for Bacillus and Exiguobacterium since little information is available for Exiguobacterium and also because Maughan (2007) did not detect a considerable rate of heterogeneity between sporulating and non-sporulating Firmicutes. In this respect, it should be acknowledged that values of substitution rates are critical for the estimation of the temporal scale for the demographic events that result from ESPB analyses (Ho et al., 2011) and ideally should be directly estimated for the studied sample. We are aware that this could impact the temporal scales of our skyline plots, and be of particular relevance when comparing lineages with different substitution rates, but we do not have any reason to doubt the substitution rates reported in the literature. Further work should try to directly estimate substitution rates for the studied lineages to minimize the possibility of different time scales for the expansion events.

Despite the fact that expansions are expected in bacterial populations, the time of such expansions may differ among lineages because they depend on specific traits such as mutation rate, life history, and population structure (Avise, 2000; Avise, 2009; Kuo & Ochman, 2009; Maughan, 2007; Woolfit & Bromham, 2003). The synchrony of expansions in coexisting populations of different lineages are not expected to be observed unless shared environmental and historical factors have a major influence on population dynamics obscuring evidence of lineage specific adaptation, as has been observed in different organisms including “human associated” pathogens (Ornelas et al., 2013; Comas et al., 2013; Falush et al., 2003). The synchrony of population expansions supports the argument that current genetic variation may be the result of a major regional event over all populations.

Currently, we do not have the necessary information or samples from the appropriate temporal and spatial scales to determine the environmental changes associated with the historical events that influenced population dynamics of the studied lineages. However, we can propose an approximate timespan in which these demographic events occurred and hypothesize on the environmental causes behind our observations. Most estimates of generation times for bacteria have been determined in laboratory conditions, which may be considerably shorter than generation times in the wild (for Escherichia coli see Pierucci, 1972; Ochman, Elwyn & Moran, 1999). Thus, we estimated the approximate absolute time of population expansions based on generation times for aquatic bacteria estimated in conditions that are similar to their natural habitat conditions (approximately between 75 and 300 generations per year; Jannasch, 1969; Hendricks , 1972). We selected a generation time at the lower limit (75 gen/yr), because it is possible that the bacteria in our study may grow slowly, given the extreme limitation of nutrients, in particular phosphorous, found in the CCB aquatic system (Souza et al., 2006; Souza et al., 2012; Souza et al., 2008). Thus, if the synchronic event occurred 200–300 years ago for most populations, it is tempting to suggest that the population expansions we observed reflect changes resulting from the recent human activity in the CCB area. In particular, an increase in anthropogenic activity during the late eighteenth and nineteenth centuries has been reported, primarily centered around agriculture and ranching (Alessio-Robles, 1946) which coincides with the time frame of expansions found here.

Conclusions

This study investigates historical population events of coexisting but phylogenetically distant bacterial lineages. To our knowledge, this is the first report of a synchronous signature of population expansion that occurred independently of phylogenetic relationships, which is in accordance with our initial hypothesis. Our findings provide strong evidence for the potential of comparative population genetics to address questions about the influence of shared historical factors in the population evolutionary processes of naturally occurring non-host associated bacteria. This information may be important for natural resource management in the context of the ecosystem services that microorganisms provide.

Supplemental Information

Supplementary information text

16S rRNA phylogenetic reconstructions of lineages studied

Phylogenies were constructed using the maximum likelihood approach and the complete sequence of the 16S rRNA gene. (a) Bacillus lineages with Geobacillus jurassicus as outgroup, (b) Exiguobacterium lineages with E. auranticum as outgroup, and (c) Pseudomonas lineages with Escherichia coli as outgroup. Bold font denotes representative sequences of the lineages studied. The scale of the bar represents number of substitutions per site.

Credibility intervals (95%) for estimates of number of population size changes

Acknowledgements

We thank Dr. Evan Carson, Dr. Michael Travisano, Dr. Daniel Piñero, Dr. Juan Pablo Jaramillo and Dr. Santiago Ramírez-Barahona for their constructive reviews of the manuscript. We also thank Laura Espinosa, Dr. Erika Aguirre and Jaime Gasca-Pineda for technical support during the development of the project.

This paper constitutes part of the doctoral research of the first author, who thanks the Doctorado en Ciencias Biomédicas, Universidad Nacional Autónoma de México (UNAM) and the Consejo Nacional de Ciencia y Tecnología (CONACyT).

Alejandra Moreno-Letelier conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Luis E. Eguiarte and Valeria Souza conceived and designed the experiments, contributed reagents/materials/analysis tools, reviewed drafts of the paper.

Field Study Permissions

The following information was supplied relating to field study approvals (i.e., approving body and any reference numbers):

All samples were collected under the “Vida Silvestre” permit 0531 FAUT-0230 granted by the Mexican government agency: “Comisión Nacional de Áreas Protegidas” (CONANP).

DNA Deposition

The following information was supplied regarding the deposition of DNA sequences:

Funding

The first author received a scholarship and financial support from the Consejo Nacional de Ciencia y Tecnología (CONACyT, grant no. 210335). This work constitutes part of the doctoral research of the frist author who received scholarship and financial support from the Consejo Nacional de Ciencia y Tecnologa (CONACyT, grant no. 210335). The project was supported by a grant from WWF-Alianza Carlos Slim. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Report a problem

Our promise
PeerJ promises to address all issues as quickly and professionally as possible. We
thank you in advance for your patience and understanding.

Type of problem

Details

500 characters remaining

Follow this publication for updates

"Following" is like subscribing to any updates related to a publication.
These updates will appear in your home dashboard each time you visit PeerJ.

You can also choose to receive updates via daily or weekly email digests.
If you are following multiple publications then we will send you
no more than one email per day or week based on your preferences.

Note: You are now also subscribed to the subject areas of this publication
and will receive updates in the daily or weekly email digests if turned on.
You can add specific subject areas through your profile settings.