Monday, April 2, 2012

Our group recently addressed the effect of within-colony genetic diversity on the microbial community associated with the honey bee Apis mellifera [1]. We obtained more than 70,000 pyrosequences from samples of whole worker bees, worker guts, and bee bread taken from 22 colonies (n=12 genetically diverse colonies; n=10 genetically uniform colonies). We found that honey bee colonies benefit from the promiscuous mating of queens: diverse colonies were characterized by a reduction in potential pathogens and an enrichment for possible probiotic species. We used well-established approaches for clustering (97% sequence identity, average neighbor) and for de novo classification of short pyrosequences [2-5], as these sequences are arguably too short for robust phylogenetic analyses [6]. Our approach used the Naïve Bayesian Classifier trained on the Arb-Silva dataset and targeted diversity in the V1-V2 region. The utility of this approach is that, through alignment of the 16S rRNA gene, we could form hypotheses about what these organisms may be doing in the bee gut and how they might be interacting, without a priori expectations of community composition.

In an important earlier study in 2007, Cox-Foster et al. published a phylogenetic framework for classifying the bacteria that are associated with honey bees [7]. The framework includes 8 phylotypes named Bifidobacterium, γ-1, γ-2, β-1, α-1, α-2, firm-4 and firm-5. Unlike our de novo approach, these phylotypes were defined by their groupings on a phylogenetic tree. The method of analysis employed by Cox-Foster et al. differs from ours in that the diversity within each clade is not predetermined: no % identity threshold is used in generating these groupings, and therefore taxa grouped into a single clade may be highly divergent (that is, below the traditional 97% identity threshold used in the field). We analyzed the sequences generated in the earlier study and computationally formed % identity clusters to explore the amount of diversity within each clade, progressively clustering the sequences at higher divergence levels using complete linkage clustering (as implemented in RDP Classifier; Table 1) or nearest neighbor clustering (as implemented in blastclust -b T -L 0.9). Indeed, each clade holds a relatively large amount of diversity; sequences within each clade are between 3% and 10% divergent (Table 1). According to the methods used in Mattila et al. (2012), and using the 97% identity threshold that is more typical for the field, these clades would be considered to harbor numerous species and/or genera.

Table 1. The number of clusters generated by complete linkage clustering (nearest neighbor clustering in parentheses) of the 8 clades characterized in Cox-Foster et al. [7], as a function of percent identity. Subclusters within clades suggest that these groupings are quite diverse and likely contain several different species/genera.

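| Phylotype | 90% | 93% | 95% | 97% |
|-----------|-------|-------|-------|-------|
| Alpha-1 | 1 (3) | 1 (3) | 1 (3) | 2 (3) |
| Alpha-2.1 | 1 (2) | 1 (2) | 1 (2) | 2 (4) |
| Alpha-2.2 | 1 (1) | 1 (1) | 3 (1) | 5 (3) |
| Beta | 2 (7) | 2 (7) | 3 (8) | 5 (8) |
| Bifido | 1 (2) | 1 (2) | 1 (2) | 1 (2) |
| Firm-4 | 1 (2) | 2 (2) | 3 (4) | 5 (5) |
| Firm-5 | 1 (2) | 2 (2) | 2 (2) | 4 (2) |
| Gamma-1 | 2 (6) | 2 (6) | 4 (6) | 5 (7) |
| Gamma-2 | 1 (3) | 1 (3) | 1 (3) | 1 (3) |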
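To make the two linkage rules concrete, here is a minimal sketch that counts clusters at several identity thresholds. It is not the RDP or blastclust implementation and will not reproduce Table 1: the sequences are made-up toy strings, identity is computed naively over pre-aligned, equal-length strings, and blastclust's length-coverage filtering is ignored. The function names `identity` and `count_clusters` are hypothetical.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of matching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def count_clusters(seqs, threshold, linkage="complete"):
    """Agglomerative clustering; returns the number of clusters at `threshold`.

    Two clusters merge when their linkage identity is >= threshold:
    - "single" (nearest neighbor): identity of the most similar cross-cluster pair
    - "complete" (furthest neighbor): identity of the least similar cross-cluster pair
    """
    pick = max if linkage == "single" else min
    clusters = [[i] for i in range(len(seqs))]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            score = pick(
                identity(seqs[a], seqs[b])
                for a in clusters[i]
                for b in clusters[j]
            )
            if best is None or score > best[0]:
                best = (score, i, j)
        if best[0] < threshold:
            break  # no remaining pair of clusters is similar enough to merge
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]  # safe: j > i, so index i is unaffected
    return len(clusters)


# Toy data (hypothetical): a "chain" where neighbors are 95% identical
# but the two ends are only 90% identical.
toy = [
    "ACGTACGTACGTACGTACGT",
    "TCGTACGTACGTACGTACGT",
    "TCGTACGTACGTACGTACGA",
]

for cutoff in (0.90, 0.95, 0.97):
    print(
        cutoff,
        count_clusters(toy, cutoff, "complete"),
        count_clusters(toy, cutoff, "single"),
    )
```

On this toy chain, the two rules disagree at the 95% cutoff because complete linkage requires every pair in a cluster to meet the threshold, while nearest neighbor only requires one sufficiently similar pair. The counts in Table 1 come from two different tools with their own alignment and filtering steps, so they need not follow the same pattern.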

Below, we compare our data and our approach to that developed by Cox-Foster et al. [7] to contrast the level of diversity estimated by each approach. We used blastn to identify which of our 13 most prevalent OTUs (sequences that cluster at 97% identity) corresponded to the 8 clades mentioned above. If one of our top OTUs was 100% identical to a representative sequence considered to be part of one of the 8 clades, we noted this in Table 2 below.
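The matching step can be sketched as a filter over blastn results. The sketch below assumes the search was saved in BLAST's standard 12-column tabular format (-outfmt 6) and keeps, for each query, the highest-scoring hit only if it is 100% identical. The function name `exact_top_hits` and the query and subject IDs in the example are hypothetical and are not taken from our analysis; the sketch also does not check how much of the query the alignment covers.

```python
def exact_top_hits(blast_tsv, min_identity=100.0):
    """Return {query_id: subject_id} for queries whose best hit meets min_identity.

    `blast_tsv` is text in BLAST tabular format (-outfmt 6) with columns:
    qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    The best hit for a query is the row with the highest bit score.
    """
    best = {}
    for line in blast_tsv.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        pident, bitscore = float(fields[2]), float(fields[11])
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, pident, bitscore)
    return {q: s for q, (s, p, _) in best.items() if p >= min_identity}


# Made-up rows for illustration only.
example = "\n".join([
    "OTU_A\tref_1\t100.00\t250\t0\t0\t1\t250\t1\t250\t1e-130\t462",
    "OTU_A\tref_2\t98.40\t250\t4\t0\t1\t250\t1\t250\t1e-120\t440",
    "OTU_B\tref_3\t96.80\t250\t8\t0\t1\t250\t1\t250\t1e-110\t418",
])

print(exact_top_hits(example))
```

An OTU whose best hit falls below the cutoff is simply absent from the returned mapping, which corresponds to an N/A entry in a table like Table 2.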

Table 2. Representative OTUs generated by Mattila et al., 2012, their % prevalence in the bee gut (ranked by abundance), and top BLAST hit accession numbers from the Cox-Foster et al., 2007 phylotypes. Where a particular OTU does not find homologs within the 8 phylotype framework proposed by Cox-Foster et al., we indicate that result with N/A.

| Classification (Mattila et al., 2012) | Prevalence in the bee gut (Mattila et al., 2012) | Top BLAST hit (nr/nt) | Phylotype |
|---|---|---|---|
| Succinivibrionaceae | 38.8% | HE613303 | Gamma-1 |
| Bowmanella | 14.3% | DQ837611 | Gamma-2 |
| Oenococcus | 14.1% | HE613310 | Firm-5 |
| Paralactobacillus | 10.2% | HM113331 | Firm-4 |
| Unclass. Colwelliaceae | 6.4% | HM111973 | Gamma-1 |
| Bifidobacterium | 4.7% | HM113282 | Bifido |
| Shimazuella | 3.2% | HE613282 | Firm-5 |
| Enterobacter | 1.2% | JF208675 | N/A |
| Laribacter | 1.0% | JQ437500 | Beta |
| Saccharibacter | 0.92% | JQ437507 | Alpha-2.1 |
| Rummeliibacillus | 0.52% | HM111947 | Firm-5 |
| Atopobacter | 0.28% | HM113352 | Firm-4 |
| Escherichia/Shigella | 0.17% | HE582599 | N/A |
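One way to read Table 2 is to group the OTUs by phylotype and add up their prevalence, which shows how several 97% OTUs collapse into a single clade. The short sketch below does this with the values copied from Table 2; the function name `prevalence_by_phylotype` is hypothetical, and the tally is an illustration rather than an analysis from Mattila et al., 2012.

```python
from collections import defaultdict

# (classification, prevalence in %, phylotype), copied from Table 2
TABLE_2 = [
    ("Succinivibrionaceae", 38.8, "Gamma-1"),
    ("Bowmanella", 14.3, "Gamma-2"),
    ("Oenococcus", 14.1, "Firm-5"),
    ("Paralactobacillus", 10.2, "Firm-4"),
    ("Unclass. Colwelliaceae", 6.4, "Gamma-1"),
    ("Bifidobacterium", 4.7, "Bifido"),
    ("Shimazuella", 3.2, "Firm-5"),
    ("Enterobacter", 1.2, "N/A"),
    ("Laribacter", 1.0, "Beta"),
    ("Saccharibacter", 0.92, "Alpha-2.1"),
    ("Rummeliibacillus", 0.52, "Firm-5"),
    ("Atopobacter", 0.28, "Firm-4"),
    ("Escherichia/Shigella", 0.17, "N/A"),
]


def prevalence_by_phylotype(rows):
    """Return {phylotype: (number of OTUs, summed prevalence in %)}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _, prevalence, phylotype in rows:
        totals[phylotype] += prevalence
        counts[phylotype] += 1
    # Round to two decimals to hide floating-point noise in the sums.
    return {p: (counts[p], round(totals[p], 2)) for p in totals}


summary = prevalence_by_phylotype(TABLE_2)
for phylotype, (n_otus, total) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
    print(f"{phylotype:10s} {n_otus} OTU(s) {total:6.2f}%")
```

In this tally, Gamma-1, Firm-5, and Firm-4 each absorb more than one of the 13 OTUs, and the two N/A rows are OTUs with no match in the 8 phylotype framework.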

It is important to note that the 13 OTU representatives in Table 2 do not represent our complete dataset; of the 1,019 OTUs we reported in Mattila et al., 2012, only 358 find homologs (>98% ID) in previously published honey bee datasets. Furthermore, Mattila et al., 2012 takes the step of classifying and analyzing these sequences at a finer taxonomic scale. Two fundamental questions remain to be addressed: 1) is this diversity relevant to honey bee health? and 2) is the level of divergence revealed by our study for the honey bee microbiome important for the function and stability of this community? Our study suggests that fine-scale diversity within the bacterial community (at 97% ID) may be important to the health of a colony; community diversity using this metric correlated with host genetic diversity and with the prevalence of the low-frequency pathogens Melissococcus and Paenibacillus.