This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http:// http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

Aeromonas spp. are versatile bacteria that exhibit a wide variety of lifestyles. In an attempt
to improve the understanding of human aeromonosis, we investigated whether clinical
isolates displayed specific characteristics in terms of genetic diversity, population
structure and mode of evolution among Aeromonas spp. A collection of 195 Aeromonas isolates from human, animal and environmental sources was therefore genotyped using
multilocus sequence analysis (MLSA) based on the dnaK, gltA, gyrB, radA, rpoB, tsf and zipA genes.

Results

The MLSA showed a high level of genetic diversity among the population, and multilocus-based
phylogenetic analysis (MLPA) revealed 3 major clades: the A. veronii, A. hydrophila and A. caviae clades, among the eleven clades detected. Lower genetic diversity was observed within the
A. caviae clade as well as among clinical isolates compared to environmental isolates. Clonal
complexes, each of which included a limited number of strains, mainly corresponded
to host-associated subsclusters of strains, i.e., a fish-associated subset within
A. salmonicida and 11 human-associated subsets, 9 of which included only disease-associated strains.
The population structure was shown to be clonal, with modes of evolution that involved
mutations in general and recombination events locally. Recombination was detected
in 5 genes in the MLSA scheme and concerned approximately 50% of the STs. Therefore,
these recombination events could explain the observed phylogenetic incongruities and
low robustness. However, the MLPA globally confirmed the current systematics of the
genus Aeromonas.

Conclusions

Evolution in the genus Aeromonas has resulted in exceptionally high genetic diversity. Emerging from this diversity,
subsets of strains appeared to be host adapted and/or “disease specialized” while
the A. caviae clade displayed an atypical tempo of evolution among aeromonads. Considering that
A. salmonicida has been described as a genetically uniform pathogen that has adapted to fish through
evolution from a variable ancestral population, we hypothesize that the population
structure of aeromonads described herein suggested an ongoing process of adaptation
to specialized niches associated with different degrees of advancement according to
clades and clusters.

Background

Aeromonads are ubiquitous free-living organisms found in aquatic environments with
a strong ability to quickly colonize an exceptionally wide variety of habitats and
hosts, ranging from hostile environments, such as polluted or chlorinated water, to
leeches, insects, fish, mollusks, and mammals, including man [1]. They are opportunistic pathogens involved in various types of infections in a wide
range of hosts. This versatility is supported by a large variety of genes involved
in metabolic fitness and virulence; thus, Aeromonas hydrophila is referred to as a “jack-of-all-trades” [2]. Despite the adaptability of A. hydrophila, very few mobile genetic elements, which are usually associated with rapid adaptation,
have been found in the complete genomic sequence of the pathogenic strain A. hydrophila ATCC 7966T[2].

Additionally, because some hosts may only be either colonized or infected, the concept
that only specific subsets of Aeromonas strains within species might actually be pathogenic for humans was proposed [3,4]. In this setting, the question has arisen of whether isolates causing infectious
diseases are exceptional and can be distinguished from other strains. Comparative
analyses including environmental and clinical isolates showed that clinical strains
are well differentiated from strains collected in the environment based on multilocus
enzyme electrophoresis (MLEE) [5]. Other studies employing phenotypic, genotypic and virulence analyses have failed
to distinguish isolates involved in infectious diseases from those that are not [3,6-8]. However, this situation is complex because the pathogenesis of Aeromonas infection is multifactorial and is associated with multiple sources of variability
(e.g., a wide variety of virulence factor genes and the influence of environmental
conditions); these studies are usually limited either by the sampling strategy applied
(e.g., including a low number of isolates, species or types of infection), incomplete
virulence factor analyses, or an absence of virulence gene expression analysis.

Overall, in the case of generalist opportunistic pathogens, which do not meet all
of the criteria Koch’s postulate, the link between virulence-related genes and infection
is not clearly established, and this opportunistic pathogenic behavior may instead
be considered to represent an adaptation to human ecology [9-11]. There is evidence that genetic clusters can correspond to ecologically distinct
populations and/or host-adapted populations, even when genes that are not related
to virulence are considered [9,11-14].

In this context, in an attempt to improve the understanding of human aeromonosis,
we investigated whether clinical isolates displayed specific characteristics among
a large population of Aeromonas spp. from various origins. Because the 3 main Aeromonas species recovered from human clinical infectious diseases are A. caviaeA. hydrophila and A. veronii biovar sobria, we particularly focused on isolates belonging to these 3 taxa. The aim of this work
was to determine the genetic characteristics, population structure and mode of evolution
in a large population of aeromonads using a comparative approach that examined human,
non-human animal and environmental strains. For this purpose, we developed a multilocus
sequence analysis (MLSA) scheme specific for aeromonads, representing the third MLSA
scheme to be described for this genus [15,16]. This strategy provided 4 new genes and produced new information on the mode of evolution,
recombination rates and horizontal gene transfer in these species. This study, which
was based on a large human clinical strain collection, provides interesting insight
regarding the mode of evolution of aeromonads linked with human infection.

Methods

Bacterial strains

A total of 195 strains of Aeromonas spp., including 62 type and reference strains, were analyzed. The distribution of
the origin of these strains was as follows: 115 human clinical strains, 39 non-human
animal strains and 41 environmental strains (Table 1). Of the 115 human clinical isolates, 67 and 7 isolates were sampled in 2006 during
a prospective study on aeromonosis involving 70 hospitals from mainland France and
overseas territories, respectively [17]; 3 and 6 additional strains were isolated prior to 2006 from mainland France and
overseas territories, respectively; 19 isolates were recovered from patients admitted
to the Montpellier University Hospital between 2008 and 2010; and 5 strains were isolated
in the USA, 2 in India, 2 in Taiwan, 1 in Bangladesh, 1 in New Zealand, 1 in Switzerland
and 1 in the Czech republic. Clinical data were collected with the aim of specifying
the clinical implications of the strains, i.e., whether they were involved in an infectious
process or in host colonization. Of the 80 non-human isolates, 33 were isolated in
France from a wastewater treatment lagoon (n = 11), recreational lake water (n = 13),
healthy snails (n = 7) or fish (n = 2), and 36 strains were isolated from diverse
sources among 13 countries worldwide, mainly in Europe and North America, while the
countries of origin of 11 strains were unknown. The strain collection included type
strains of 28 species and a representative strain of hybridization group (HG) 11 (Table
1).

Table 1. Typing data and origin of Aeromonas strains analyzed in the study

Genomic DNA was prepared in agarose plugs as previously described [18] starting from a fresh culture on Trypticase Soja agar medium. After Aeromonas suspensions in 2 ml of Tris-NaCl buffer (1.0 M Tris base, 1.0 M NaCl, pH 7.6) were
adjusted to an optical density of 1.5 at 650 nm, they were centrifuged (10,000 g for
1 min), 1 ml of the supernatant was then discarded, and the pellet was resuspended
(final concentration 2:1). DNA was digested at 25°C with 40 U of SwaI (New England BioLabs, Hertfordshire, United Kingdom). The SwaI fragments were separated in a 1% agarose gel via PFGE using a CHEF-DRIII apparatus
(Bio-Rad Laboratories, Hercules, CA) and 0.5X Tris-Borate-EDTA (TBE) buffer containing
50 μM thiourea at 5.5 V/cm and 10°C with pulse ramps of 100 to 5 s for 48 h. A lambda
concatemer (Biolabs) was used as the size standard. The gel was stained with ethidium
bromide and photographed under UV light. The PFGE profiles, known as pulsotypes, were
compared visually by numbering both the shared and the distinct DNA fragments.

Gene amplification and sequencing

The complete genomic sequences of A. hydrophila subsp. hydrophila ATCC 7966T and A. salmonicida subsp. salmonicida A449 [GenBank accession numbers NC_008570 and NC_009348, respectively] were used
employed as references for gene selection and primer design. The primers used in this
study are described in Table 2. Genomic DNA was obtained using the Aquapure DNA extraction kit (EpiCentre, Madison,
WI). PCR was carried out in a 50 μL reaction mixture containing 200 nM of each primer
(Sigma Genosys), 200 μM of each deoxynucleoside triphosphate (dNTP) (Euromedex, Mundolsheim,
France), 2 mM MgCl2, and 2.5 U of Taq DNA polymerase (Promega, Madison, WI) in the appropriate reaction buffer and 50 ng
of genomic DNA as the template. The amplification conditions were as follows: initial
denaturation for 4 min at 94°C, followed by 35 amplification cycles as indicated in
Table 2 and a final extension step at 72°C for 10 min. zipA amplification required specific conditions for some A. caviae and A. media isolates included in this study, such as a 4 mM MgCl2 concentration and a primer hybridization temperature of 50°C (A. caviae).

The PCR products and a molecular weight marker (phage phiX DNA digested with HaeIII, New England BioLabs) were separated in 1.5% agarose gels in 0.5X TBE buffer.
The products were then sequenced using forward amplification primers (Table 2) in an ABI 3730XL automatic sequencer (Beckman Coulter Genomics, United Kingdom).

Phylogenetic analysis

Gene sequences were codon aligned using the ClustalW application within the Bioedit
Sequence Alignment Editor [19]. The sizes of the codon-aligned sequences that were used for further analyses are
indicated in Table 2. Phylogenetic analyses were performed for each of the 7 gene sequences and for a
manually concatenated sequence. Gaps in concatenated sequences were deleted with Bioedit.
The sequences were converted in Phylip format using the ReadSeq online program (http://searchlauncher.bcm.tmc.edu/seq-util/readseq.htmlwebcite). A distance-based phylogenetic tree was reconstructed using the neighbor-joining
method implemented in Neighbor from the PHYLIP package v3.66 [20]. Distances were calculated using FastDist. The K2P substitution model was selected
for the analysis plus a gamma parameter fixed to 2. Distance bootstrap support was
calculated after 1000 reiterations. The analysis was performed online on the site
http://www.phylogeny.frwebcite. For the maximum likelihood (ML) method-based phylogeny, evolutionary distance was
analyzed with the PhyML v3.0 program [21] using GTR, with a gamma distribution parameter estimated from the dataset and invariant
sites as a substitution model. This substitution model was determined to be the most
appropriate by ModelTest [22]. ML bootstrap support was calculated after 100 reiterations.

Multilocus sequence analysis

For each locus, each allele was assigned a distinct arbitrary number using a nonredundant
database program available at http://www.pubmlst.orgwebcite. The combination of allele numbers for each isolate defined the sequence type (ST).

Allele profiles were analyzed using eBURST v3 software [23] to determine the clonal complexes (CCs) defined as sets of related strains that share
at least 5 identical alleles at the 7 loci. A complementary eBURST analysis was conducted
to determine the CCs sharing at least 4 identical alleles at the 7 loci.

The program LIAN 3.5 [24], available at http://www.pubmlst.orgwebcite, was used to calculate the standardized index of association (sIA) to test the null
hypothesis of linkage disequilibrium, the mean genetic diversity (H) and the genetic
diversity at each locus (h). The number of synonymous (dS) and non-synonymous (dN)
substitutions per site was determined from codon-aligned sequences using Sequence
Type Analysis and Recombinational Tests Version 2 (START2) software [25]. Other genetic analyses, including the determination of allele and allelic profile
frequencies, mol% G + C content and polymorphic site numbering, were also carried
out using START2 software. A distance matrix in nexus format was generated from the
set of allelic profiles and then used for decomposition analyses with SplitsTree 4.0
software [26]. Recombination events were detected from the aligned ST concatenated sequences using
the RDP v3.44 [27] software package with the following parameters: general (linear sequence, highest
P value of 0.05, Bonferroni correction), RDP (no reference, window size of 8 polymorphic
sites, 0-100% sequence identity range), GENECONV (scan triplets, G-scale of 1), Bootscan
(window size of 200 bp, step size of 20 bp, 70% cutoff, F84 model, 100 bootstrap replicates,
binomial P value), MAxChi (scan triplets, fraction of variable sites per window set to 0.1),
CHIMAERA (scan triplets, fraction of variable sites per window set to 0.1) and Siscan
(window of 200 bp, step size of 20 bp, use 1/2/3 variable positions, nearest outlier
for the 4th sequence, 1000 P value permutations, 100 scan permutations).

Other statistics

All qualitative variables with the exception of the sIA were compared using a Chi-squared
test or the Fisher's exact test where appropriate; a P value ≤0.05 was considered to reflect significance. All computations were performed
using R project software (http://www.r-project.orgwebcite).

Phylotaxonomics

The population structure was inferred from multilocus phylogenetic analysis (MLPA)
following reconstruction of the distance and ML trees from the concatenated sequences
(alignment length of 3993 nt). The ML concatenated tree is shown in Figure 1, and the differences in the relative branching order between methods are also highlighted
in Figure 1. The Aeromonas population was organized into 11 clades, which included 2 to 71 strains, with three
major clades being observed (bootstrap values ≥ 90). The largest clade was comprised
of 71 isolates, including 46 human, 5 animal and 20 environmental isolates, among
which 4 were reference strains and three were type strains: A. culicicola CIP 107763T, A. ichthiosmia CECT 4486T, A. veronii biovar sobria LMG 13067 and A. veronii biovar veronii CECT 4257T; this was designated the A. veronii clade (Figure 1, Table 1). The two other major clades included 35 and 34 strains, respectively. They were
referred to as the A. hydrophila clade (including strains A. hydrophila subsp. hydrophila CECT 839T, A. hydrophila subsp. ranae CIP 107985 and 33 other isolates) and the A. caviae clade (including A. caviae CECT 838T, A. hydrophila subsp. anaerogenes CECT 4221 and 32 other isolates), respectively. Each of these clades contained strains
from various sources, i.e., 25 human, 7 animal and 3 environmental strains in the
A. hydrophila cluster and 24 human, 9 environmental and 1 animal isolate in the A. caviae cluster (Figure 1, Table 1). The remaining strains were distributed among eight minor clades (bootstrap values ≥ 90),
and are presented in Table 1 and Figure 1. The relative branching order among clades remains uncertain for most nodes (Figure
1). The clades displayed a mean sequence divergence of 2.5%, but the A. media clade displayed higher genetic polymorphism than the other clades (5.8%). None of
the isolates included in this study grouped with the type strains A. bestiarum, A. diversa, A. encheleia, A. enteropelogenes, A. eucrenophila, A. fluvialis, A. popoffi, A. sanarellii, A. schubertii, A. taiwanensis, and A. trota, or with the representative strain of hybridization group 11. Finally, strain CCM
1271 formed an independent phylogenetic branch that was clearly differentiated from
related known species, particularly from A. bestiarum, the species name under which the strain is referenced in the Czech Collection of
Microorganisms (Figure 1). A phylogenetic tree reconstructed for all the strains included in this study using
a concatenated sequence of the 5 genes obtained for all of the strains also showed
strain CCM 1271 to be unrelated to A. bivalvium CECT 7113T, A. molluscorum CIP 108876T, A. simiae CIP 107798T and A. rivuli DSM 22539T (see Additional file 1: Figure S1).

Additional file 1.Figure S1. Unrooted maximum-likelihood tree based on concatenated sequences of five housekeeping
gene fragments (gltA, gyrB, rpoB, tsf, zipA, 2724 nt). The horizontal lines indicate genetic distance, with the scale bar indicating
the number of substitutions per nucleotide position. The numbers at the nodes are
support values estimated with 100 bootstrap replicates. Only bootstrap values > 70
are shown on the tree. The clades defined in Table 1 are indicated with brackets at
the top right of the figure. Only type strains and reference strains are represented
in the tree.

Figure 1.Unrooted maximum-likelihood tree based on concatenated sequences of the seven housekeeping
gene fragments (3993 nt). The tree shows the structure of the studied Aeromonas spp. population, and the relative placement of human (red font), non-human animal
(black font) and environmental (blue font) strains was indicated. The horizontal lines
represent genetic distance, with the scale bar indicating the number of substitutions
per nucleotide position. The numbers at the nodes are support values estimated with
100 bootstrap replicates. Only bootstrap values > 70 are indicated on the tree. The
roots of the clades defined in 1 are represented by bold lines. MLPA clusters of strains sharing identical STs or
grouped into CCs sharing at least 4 identical alleles at the 7 loci are indicated
by frames (red frames for clusters of human strains, grey frames for clusters of non-human
animal strains and uncolored frames for clusters of strains of various origins). In
these frames, the following characteristics are indicated from left to right: (i)
the strain’s clinical involvement when applicable as Inf for infection and Col for
colonization; (ii) the SwaI pulsotype of the strains, with strains of identical pulsotypes designated by the
same letter, strains with pulsotypes sharing more than 85% of their DNA fragments
by A, A1, A2, … and strains with pulsotypes sharing no more than 70% of their DNA
fragments by distinct letters, i.e., A, B, C, …; (iii) the names of STs shared by
several strains; (iv) the names of CCs sharing at least 5 identical alleles at the
7 loci; and (v) the names of CCs sharing at least 4 identical alleles at the 7 loci.
These ST and CC names are indicated to the right of the brackets grouping the strains
with identical STs or belonging to the same CC and are followed by the bootstrap value
(indicated in parentheses) supporting the corresponding MLPA cluster. (*) indicates
that the relative position of the corresponding branch varied according to the method
used. ND, not determined.

The multilocus sequence-based phylogeny supported the current taxonomy of the genus.
In addition, both the high level of concatenated sequence divergence observed in the
A. media cluster and the comparison of the subtree topology for clusters including closely
related known species, such as A. eucrenophila-A. encheleia-A. tecta, suggested that the A. media clade may constitute a polyphyletic cluster containing taxa that have yet to be described.
Strain CCM 1271, showing a clearly segregated phylogenetic position in the MLPA, also
likely represents an unknown Aeromonas taxon.

Genetic diversity

The number of different alleles for the 7 loci varied from 111 (rpoB) to 160 (dnaK) (Table 3). This significant variation (P value = 10-9) suggested distinct mutation rates among the loci. The equivalent mol% G + C content
ranged from 55.8 (tsf) to 62.6% (radA) for all loci with the exception of zipA, which exhibited a lower mol% G + C content of 52.4%. The mean genetic diversity
among strains was high for the whole genus, and of the 3 main clades, A. caviae displayed the lowest genetic diversity (h) for all genes (Table 3). The rate of polymorphic sites varied significantly between the A. caviae, A. hydrophila and A. veronii clades for all loci except for rpoB, with A. caviae being the clade that showed the lowest number of polymorphic sites for all loci (Table
3). The rates of non-synonymous versus synonymous substitutions (dN/dS ratio) were
rather low, except for zipA, indicating that among the 7 loci, only one locus had been subjected to strong positive
selective pressure (Table 3).

Multilocus sequence typing, genomic relationships and origin of the strains

The multilocus sequence dataset for the 191 strains contained 175 sequence types (STs),
164 (93.7%) of which were identified only once. ST diversity was 0.92 per strain for
the genus Aeromonas, confirming its exceptionally high level of population diversity, which was also
observed in the A. caviae, A. hydrophila and A. veronii clades, which exhibited 0.97, 0.86 and 0.87 ST per strain, respectively. The largest
ST group included 6 strains of the A. veronii clade. A total of 10 other STs were shared by a maximum of 3 strains (Table 1, Figure 1).

The clustering of STs in CCs sharing at least 5 identical alleles at the 7 loci revealed
9 CCs, which grouped a maximum of 3 strains. These CCs corresponded to MLPA clades
supported by high bootstrap values ≥ 92%, except for CC “6” (Figure 1, Table 1). Using a less stringent definition of CCs (4 identical allele at the 7 loci) did
not significantly change the population clustering, confirming that the high genetic
diversity of the population was equally expressed at each locus (Table 1, Figure 1 and 2).

Links among strains sharing the same ST and strains grouped into CCs were further
investigated by comparing their geographic origins and isolation dates and using PFGE.
The genomic macro-restriction digest with the endonuclease SwaI produced PFGE patterns that comprised of an average of 18 bands suitable for strain
comparison (data not shown). The strains grouped within each of these clusters showed
distinct pulsotypes and/or were of distinct geographic origin and, in some cases,
had been isolated over a long time period. For example, ST7 included strains BVH14
and CCM 2278, sharing more than 85% of their DNA fragments in the PFGE analysis, which
were isolated in France in 2006 and in California in 1963, respectively (Table 1, Figure 1). Of particular note, the largest ST found in this study, ST13, included 6 strains
with identical pulsotypes, despite being isolated in 2006 from distant sites (e.g.,
La Réunion Island in the Indian ocean and 2 distant regions in mainland France). Finally,
we observed that the type strains of A. salmonicida subsp. masoucida and A. salmonicida subsp. smithia purchased from the Collection of the Institut Pasteur showed identical STs and pulsotypes;
this questionable result should be considered with caution until a further control
analysis is performed in strains ordered from another collection.

Comparison of the overall diversity observed according to the origin of the strains
within the 3 main clades showed that the number of STs per strain differed significantly
between the groups of clinical and environmental isolates (0.875 and 1, respectively;
P value = 0.036). This difference also reached the level of significance among the
A. veronii group (P value = 0.049). A few robust clusters of strains were shown to group isolates from
the same host origin, which primarily grouped strains of human origin (Figure 1, Table 1). Out of the 43 geographically and/or genomically unrelated strains belonging to
the 3 main clades included in either identical ST groups or in clonal complexes, only
3 originated from the environment, and 2 were involved in animal diseases (Table 1 and Figure 1). This difference in the distribution of environmental/animal and human clinical
strains was statistically significant (P value = 5.10-4) for the 3 main clades and for the A. veronii (P value = 0.02) and A. caviae (P value = 0.05) clades.

Finally, a non-random distribution of strains was observed among the different CCs
according to their site of isolation and/or colonizing/pathogenic status. CC “C” grouped
3 out of the 5 non-pathogenic, colonizing A. caviae strains in the dataset, and this rate was significantly different from that of the
non-pathogenic A. caviae strains found outside of the CC (P value = 0.04) (Table 1, Figure 1). In contrast, some other clusters included strains involved only in infectious processes
(Table 1, Figure 1). Finally, the A. veronii ST13 cluster appeared to be associated with a particular type of disease, i.e., wound
infection. Indeed, 5 out of the 12 A. veronii strains in the dataset involved in wound infection were grouped into this cluster,
representing a frequency that was significantly different from the rest of the A. veronii population (P value = 0.0001).

Recombination events in the aeromonad population

The sIA value was 0.30 at the genus level, ranged from 0.15 to 0.42 at the clade level
and was significantly different from 0, indicating the existence of significant linkage
disequilibrium, showing that the studied Aeromonas population was not panmictic but clonal. Events of recombination among the clonal
population were then analyzed via RDP, decomposition analysis and phylogenetic incongruence.

Considering the recombination events detected using at least 4 methods of the RDP
software, 14 types of recombination events leading to 166 recombinant sequences were
detected among the population and are detailed in an additional table (Additional
file 2: Table S2). All but two loci (radA and rpoB) were affected by recombination events that occurred in 89 STs (50.9%). dnaK and gyrB were the most affected loci (4 events each, 75 and 13 recombinant sequences, respectively),
followed by tsf and zipA (3 events each, 73 and 5 recombinant sequences, respectively) and gltA (1 event and 3 recombinant sequences) (data not shown). One to four types of significant
recombination events occurred in most clades, except for the A. hydrophila, A. piscicola and A. tecta clades and the A. fluvialis type strain and strain CCM 1271. Five events could not be significantly linked to
parental sequences, suggesting the occurrence of transfer from strains that are not
represented in our collection.

Recombination was also investigated for the 3 main clades via split decomposition
in the concatenated sequences (Additional file 3: Figure S3 a-c). Most of the STs were not affected by recombination, and the trees
showed a limited parallelogram formation, notably including A. hydrophila STs (Additional file 3: Figure S3 b). Split decomposition detected more recombination events within the
CCs, particularly for STs in the A. caviae clade (Additional file 3: Figure S3 a).

Additional file 3.Figure S3. SplitsTree decomposition analyses of the MLSA data for strains belonging to theA. caviae(a), A. hydrophila(b) andA. veronii(c) clades. The distance matrix was obtained from the allelic profiles of the sequence
types (ST). A network-like graph indicates recombination events. Star-like radiation
from the central point indicates an absence of recombination. The names of eBURST
clonal complexes (CCs), as defined in the text and in Table 1, are indicated near
the corresponding STs. The number of strains sharing an identical ST is indicated
below the ST number in brackets. Type strain STs are indicated by dots.

Distance and ML trees were reconstructed for each of the 7 genes and compared to the
concatenated sequence-based trees. For all genes and phylogenetic methods, single
locus phylogenies (SLPA) displayed lower bootstrap values than MLPA trees (data not
shown). Moreover, differences in branching order were observed in SLPA, suggesting
the occurrence of recombination events (data not shown). In detail, phylogenetic discordance
was observed for 11 strains based on single-gene phylogenetic analysis: all of these
strains grouped in a robust cluster that was different from the cluster defined based
on the 6 other genes or the concatenated sequence (shown in bold text in Table 1). Identical alleles were observed in strains belonging to different MLPA clusters,
i.e., gyrB allele 83, common to the two environmental strains A. veronii strain AK250 and A. hydrophila strain AK218; zipA allele 97, common to the A. media and A. enteropelogenes type strains; and zipA allele 94, which was identical in the A. caviae type strain and A. salmonicida strain CIP104001 (Table 1). In addition, strain BVH53 belonged to the A. veronii clade in the MLPA, while it was robustly grouped with the A. jandaei type strain in the gyrB-based phylogeny (bootstrap value of 100% in both the ML and distance-based trees)
(data not shown). Similarly, among the isolated strains, the A. fluvialis type strain showed a divergent phylogenetic position between the gltA-based tree, where it robustly grouped with the A. schubertii type strain, and other gene-based phylogenies or the MLPA. Finally, strain BVH39
grouped within the A. salmonicida clade in the multilocus tree, while it was excluded from the corresponding clade
defined in the dnaK-based tree. These phylogenetic incongruities revealed a total of 12 recombination
events (0.9% of the sequences), which occurred in 11 strains (4, 3 and 4 strains of
human, animal and environmental origin, respectively) (5.8% of the total strains)
and concerned 5 out of the 7 genes addressed in our MLSA scheme, i.e., dnaK (1 strain), gltA (1 strain), gyrB (4 strains), tsf (3 strains), and zipA (3 strains) (Table 1). Multilocus phylogenetic trees reconstructed excluding the strains subjected to
recombination showed increased bootstrap values for the A. veronii clade (90 to 100%) as well as for most interclade nodes, confirming that recombination
distorted the MLPA (data not shown). Despite its relatively low frequency of occurrence
in the genus Aeromonas, recombination may account, at least in part, for some controversial taxonomic issues.
For example, strain CCM 1271 is closely related to A. bestiarum in the gyrB-based phylogenetic tree (data not shown), whereas it is clearly individualized from
this species in the MLPA.

Discussion

In this study, we investigated the genetic diversity and population structure linked
with strain origin using MLSA. This work, based on a large collection of clinical
isolates, i) estimated high genetic diversity, recombination rates and horizontal
gene transfer in Aeromonas species, which together highlighted the mode of evolution in this group; ii) showed
small clusters of strains associated with human infection; and iii) provided phylotaxonomic
data that helped clarify the confusing taxonomy of the genus Aeromonas in relation to other works [15,16,28]. These results are further discussed below.

A MLSA scheme for studying Aeromonas spp. population structure

This was the 3rd multilocus scheme proposed for studying Aeromonas spp. in 2011 [15,16]. These three studies analyzed different populations of aeromonads with different
set of genes and different objectives. The 1st MLSA scheme was developed for analyzing Aeromonas phylogeny and attempting to resolve the taxonomic controversies within this genus
[16]. The 2nd was developed to achieve precise strain genotyping and phylogenetic analysis of outbreak
traceability and genetic diversity and was based on strains isolated from fish, crustaceans
and mollusks [15]. The MLSA that we have presented here improved the understanding of human aeromonosis
by addressing a large population that included both clinical and environmental strains
from diverse geographic sources. The overall collection represented different lifestyles
encountered in the genus: free living or associated with humans or cold-blooded animals.
The clinical strain collection was representative of the French epidemiology because
it resulted from a systematic prospective nationwide record and was associated with
well-documented clinical reports [17]. The size of the collection was increased by including strains from various collections,
most of which came from animal and environmental sources, so that the overall collection
studied herein totaled 195 strains, which is a greater number compared to the two
other MLSA studies on Aeromonas[15,16]. Our MLSA scheme was suitable for analysis of the whole genus Aeromonas, with the exception of four species: A. bivalviumA. molluscorumA. simiae and A. rivuli, for which only 6 genes could be analyzed. This MLPA allowed structuring the population
into 3 main clades, designated A. veroniiA. hydrophila and A. caviae, because they contained the type strains of these species. Despite the fact that
the number of isolates in the main clades was high compared to the study by Martino
et al. [15] and similar to other studies [e.g., [29], the number of strains in some clades remained rather limited (e.g., A. caviae: 34 strains), and our results should be confirmed in a larger population. For this
purpose, the population results and MLSA scheme have been deposited in a public database
(PubMLST: http://pubmlst.org/software/database/bigdb/) [30]. Nevertheless, our results provided interesting insight into the genetic diversity
and structure of the Aeromonas population encountered in clinical infections as well as the mode of evolution of
this population.

The MLSA scheme presented here included 7 housekeeping genes, which were sporadically
distributed among the genome and, thus, can reasonably be assumed to not be associated
with mobile genetic elements. The 7 genes encoded components of different metabolic
pathways and were characterized by different mutation rates and low positive selective
pressure, indicating predominantly neutral evolution of the loci. In our MLSA scheme,
we propose the use of 4 genes not previously included in the 2 other MLSA schemes
(dnaKradAtsf and zipA) for inferring the Aeromonas population structure. Despite its markedly lower mol% G + C content compared to other
loci and to the mean value for the genus Aeromonas inferred from the A. hydrophila ATCC 7966TA. salmonicida A449, and A. caviae Ae398 genomes (approximately 61.5%) and the A. veronii B565 genome (58.8%) [2,31-33], zipA was included in the MLSA scheme because the zipA-based phylogenetic tree was mostly congruent with those obtained for other genes
(data not shown). This suggested that this gene likely originated from a distant genus
through lateral genetic transfer, though it was likely acquired by a common ancestor
of the population studied. Altogether, the 3 available multilocus schemes for Aeromonas indicated 16 distinct genes, i.e., atpD, dnaJ, dnaK, dnaX, gltA, groL, gyrA, gyrB, metG, ppsA, radA, recA, rpoB, rpoD,
tsf, and zipA, that will offer the possibility of performing accurate analyses in Aeromonas.

Mode of evolution

Both the genetic and ST diversity per strain were observed to be exceptionally high
in the genus Aeromonas and were much higher than observed for many other environmental bacteria [9,11,34]. Although strains from countries other than France only represented approximately
25% of the total strains of our dataset, the high level of genetic diversity observed
validated the non-redundancy and representativeness of our population. Given that
some geographically distant strains were very closely related (e.g., BVH 14 and CCM
2278, (Figure 1 and Table 1)), the global genetic diversity of this group may be reflected in that of a rather
small sampling population, as observed in other reports on water-living species with
high genetic diversities, such as Pseudomonas aeruginosa[35]. However, further analysis will be required to confirm this hypothesis. High diversity
was observed in the 3 main A. caviae, A. hydrophila and A. veronii clades; however, the genetic characteristics and population structure of the A. caviae clade were outstanding. Compared to the other two mains groups, the A. caviae clade showed a lower genetic diversity, indicated both by its genetic diversity (h)
and lower number of polymorphic sites. However, A. caviae exhibited the highest dN/dS ratio for 4 genes. These results suggested that A. caviae strains have experienced less genetic variation, but when such variations have occurred,
the mutations have more often been non-synonymous. Positive selective pressure or
genetic hitchhiking is unlikely to explain this phenomenon. In this context, the occurrence
of deleterious mutations linked to demographic effects experienced by the population
represents a hypothesis that can explain the genetic particularities of A. caviae. The high genetic diversity in the genus, as observed by other researchers as well
[15,36] reflects the behavior of aeromonads as water-living bacteria. In fact, Aeromonas represent an outstanding example of generalist bacteria displaying genetic and genomic
traits associated with this lifestyle and their ability to adapt to diverse niches,
i.e., a relatively large genome (4.7 Mb) [2], high genetic diversity, significant rate of horizontal gene transfer of housekeeping
genes (5.8% in our population), a significant number of ribosomal operons that are
sometimes heterogeneous and submitted to cross-over events [37-39], genomic and phenotypic plasticity [2] and a great ability to adapt to new niches. All of this diversity corresponded to
structuring in terms of complexes of species rather than species sensu stricto[40]. The wide range of genetic repertoires included in these complexes of species may
constitute a potential reservoir for the emergence of future specialists via a speciation
process related to selective pressure within a narrow niche. The complex and confusing
systematics of the genus Aeromonas may result, at least in part, from the structure in species complexes in which speciation
is progressing locally. For example, the species status of A. allosaccharophila, a clade closely related to A. veronii, has long been controversial, and evidence indicating whether this represents a definitive
species has varied according to the methods used and the housekeeping genes analyzed
[16,28,41-45]. If speciation is currently in progress for A. allosaccharophila, it could explain these controversial data, as highlighted by Silver [28] or as observed in other genera (e.g., the Burkholderia cepacia complex [46]). In contrast, A. salmonicida could represent an example of a fish-adapted species subjected to some costs of specialization
(e.g., being non-motile, having the ability to growth at 25°C but not at 37°C) [33]. In this study, A. caviae appeared to have exceptional genetics compared to A. veronii or A. hydrophila. The hypothesis of a population bottleneck related to adaptation to a specialized
niche, such as the gut, which is a more frequent niche for A. caviae compared to other aeromonads, should be emphasized. In fact, compared to A. veronii and A. hydrophila, A. caviae is preferentially found in the gut, as highlighted by the higher frequencies of gastroenteritis
and bacteremia infections originating from the gut [17] and the higher density of A. caviae in wastewater inflows than outflows [47,48].

The diversity was significantly lower among the clinical strains than the environmental
strains, which is compatible with the hypothesis that some clinical strains may correspond
to niche-adapted subpopulations. Robust MLPA clusters of strains with identical STs
or belonging to CCs were identified among the population, mainly among the 3 main
clades this study. Each of these clusters included a limited number of strains (2
to 6 strains) that were further shown to be unrelated based on epidemiological data
and/or PFGE results, and 52 out of the 191 fully analyzed strains (27.2%) were involved
in these clusters. Twelve clusters grouped strains from a unique host, i.e., a fish-associated
subset within A. salmonicida and 11 human-associated subsets within the A. veronii (n = 6), A. caviae (n = 3) and A. hydrophila (n = 2) clades. Nine of these subsets included only disease-associated strains. Notably,
all of the A. veronii human-associated clusters were disease associated. Among these clusters, ST13, which
was shared by 6 strains of human origin and was mainly recovered during wound infections,
may reflect a host (niche)-adapted pathogenic cluster among the A. veronii clade, which was otherwise characterized by high genetic diversity. The striking,
unique PFGE pattern and ST may reflect the adaptation of this cluster to a niche in
which genetic and genomic variability is not permitted due to strong constraints.
However, because of the small number of strains included in these clusters, an increased
number of strains should be studied to confirm whether specific lineages or CCs are
more likely to contain clinical isolates or be associated with a specific illness.

The present study showed a relatively low frequency of recombination events compared
to previous studies [15,28]. This result may originate in the differences between these studies in the genes
evaluated and the population sampling strategies employed. The population sample studied
by Martino et al. differed significantly from ours, as most of their isolates were
obtained from fish, crustaceans or mollusks [15]. Silver et al. deliberately included a very small number of isolates (n = 12) of
host-associated strains (e.g., only strains isolated from leeches, human wounds or
human feces), which may constitute a recruitment bias because these strains may be
host adapted [28]. Interestingly, the recombination events encountered in our study were predominantly
observed within clonal complexes (e.g., CC “D”, grouping A. veronii strains recovered during human diarrhea episodes), which supported the previous hypothesis
of the study by Silver et al. [28].

Taxonomic considerations

MLPA may be helpful for identification purposes. Indeed, strains that have previously
rarely been reported in the literature were recognized among the study population,
among which an A. jandaei isolate from a human urinary tract infection and an A. allosaccharophila strain recovered during human bacteremia were particularly remarkable. Moreover,
MLPA may allow the correct identification of strains deposited in strain collections
under erroneous or incomplete nomenclature, as observed for A. sobria CECT 4333 and Aeromonas sp. CECT 5177, which most likely belong to the A. piscicola species.

Multilocus sequence-based phylogeny supported recent taxonomic changes in the genus
Aeromonas. First, several recently characterized species were clearly individualized in the
7 gene-based phylogenetic trees, such as A. taiwanensisA. sanarellii and A. fluvialis[49,50]. The proposal of A. diversa, including Aeromonas sp. HG13, referred to as Aeromonas group 501, as a distinct species from A. schubertii was supported in the MLPA by the clearly individualized phylogenetic positions observed
for these two species [51]. Moreover, several taxonomic reappraisals were confirmed by our approach, as observed
and discussed in the MLPA study by Martinez-Murcia et al. [16,52]. In addition, evidence previously suggesting that A. hydrophila subsp. anaerogenes and A. caviae are conspecific was confirmed here by the A. hydrophila subsp. anaerogenes strain CECT 4221 that was found to belong to the A. caviae clade [53]. All of these observations showed that the MLSA scheme appeared to be a strongly
informative tool that should be included within the methods used for polyphasic taxonomic
analysis in the genus Aeromonas. Thus, this MLSA scheme may provide additional arguments regarding controversial
issues recently reviewed by Janda & Abbott [1]. A. ichthiosmia, which is considered to be a later synonym of A. veronii[42], clearly grouped in the A. veronii clade. A. encheleia showed a low level of genetic divergence at the 7 loci and grouped in a tight and
robust clade with HG11, providing additional arguments for their unification. A. allosaccharophila, whose existence is still controversial, occupies a robust position that is closely
related, but external to the A. veronii clade, in favor of the separation of the two taxa. However, the taxonomic level of
the new taxon, if proposed, still has to be defined due to conflicting DNA-DNA hybridization
values compared to the A. veronii type strain according to the study considered [42,54]. Finally, the A. caviae type strain occupies a position external to those of other members of the A. caviae clade in the MLPA-based tree. This observation warrants further investigation due
to the taxonomic value of the MLSA scheme demonstrated here. Of note, only 2 genes
(gyrB and rpoB) from A. sharmana, a species that was shown not to belong to the genus Aeromonas and is awaiting reassignment, could be amplified using the primers employed in this
study [55,56].

Conclusions

Evolution in the genus Aeromonas has involved the combined effects of mutations and recombination events, resulting
in an exceptionally high genetic diversity. We propose a hypothetical mode of evolution
in aeromonads based on global organization into a complex of species, with local emergence
of more specialized clones. This specialization in process is suggested by the co-existence
of i) specialized species sensu stricto, such as A. salmonicida, ii) clades showing genetic traits associated with speciation and adaptation to a
specific niche, such as A. caviae, and iii) diverse subsets of strains that may be host adapted and/or “disease specialized”.
The MLSA scheme developed herein in a large and diverse population of strains helped
shed light on the unclear relationships among Aeromonas strains and aeromonosis. However, certain clades and the host- and/or disease-associated
subsets of strains detected in this study included a limited number of strains. As
a consequence, additional studies are required to increase the size of the analyzed
population and to confirm these results. Further work including a virulence analysis
focusing on human clinical clusters is also needed. Finally, the MLSA scheme proposed
here appeared to be useful for taxonomic studies in the genus Aeromonas.

Authors’ contributions

Conceived and designed the study: EJB, HM, BL. Designed and performed the acquisition
of clinical data and isolate collection: colBVH, AK, BL. Performed the microbial and
molecular genetic analyses: FR (primer design, MLSA and MLPA, PFGE), AK (curator of
the clinical isolates collection, rpoB analysis). Analyzed and interpreted the data: FR, BL (all data), HM (PFGE and MLPA),
EJB (MLSA), BL (statistics). Drafted the paper: HM, BL. Helped to draft the manuscript:
FR. Critically revised the manuscript: EJB. All authors read and approved the final
manuscript.

Acknowledgments and funding

We are particularly indebted to the microbiology laboratory team of the Montpellier,
France, academic hospital for providing some clinical isolates. This work was supported
by the Association des Biologistes de l’Ouest, by the Laboratoire de Diagnostic Bactériologique
de l’Ecole Nationale Vétérinaire de Lyon and by ADEREMPHA (Association pour la Recherche
et le Développement en Microbiologique & Pharmacie).

Saavedra MJ, Perea V, Fontes MC, Martins C, Martínez-Murcia A: Phylogenetic identification of Aeromonas strains isolated from carcasses of pig as
new members of the species Aeromonas allosaccharophila.