Abstract

Organisms show great variation in ploidy level. For example, chromosome copy number varies among cells, individuals and species. One particularly widespread example of ploidy variation is found in haplodiploid taxa, wherein males are typically haploid and females are typically diploid. Despite the prevalence of haplodiploidy, the regulatory consequences of having separate haploid and diploid genomes are poorly understood. In particular, it remains unknown whether epigenetic mechanisms contribute to regulatory compensation for genome dosage. To gain greater insights into the importance of epigenetic information to ploidy compensation, we examined DNA methylation differences among diploid queen, diploid worker, haploid male and diploid male Solenopsis invicta fire ants. Surprisingly, we found that morphologically dissimilar diploid males, queens and workers were more similar to one another in terms of DNA methylation than were morphologically similar haploid and diploid males. Moreover, methylation level was positively associated with gene expression for genes that were differentially methylated in haploid and diploid castes. These data demonstrate that intragenic DNA methylation levels differ among individuals of distinct ploidy and are positively associated with levels of gene expression. Thus, these results suggest that epigenetic information may be linked to ploidy compensation in haplodiploid insects. Overall, this study suggests that epigenetic mechanisms may be important to maintaining appropriate patterns of gene regulation in biological systems that differ in genome copy number.

1. Introduction

Organisms display a remarkable diversity in ploidy level [1–5]. For example, all sexual organisms show variation in ploidy during their life cycle. In addition, members of different species sometimes vary in ploidy number. Such ploidy variation shapes molecular evolution, genetic interactions and gene function [5–9]. Thus, variation in ploidy fundamentally affects evolutionary and developmental processes.

A prime example of variation in ploidy is embodied by the haplodiploid genetic system. Haplodiploid species are typically characterized by having unfertilized eggs develop into haploid males and fertilized eggs develop into diploid females [3,4]. The haplodiploid genetic system has arisen at least 17 independent times during the course of animal evolution [4], and is the ancestral genetic system of the order Hymenoptera (ants, bees and wasps) [3]. Consequently, as many as 20% of all animal species may be haplodiploid [10]. Despite the taxonomic prevalence of haplodiploidy, the regulatory consequences of ploidy differences between sexes remain largely unknown (but see [6,9,11]). This lack of information represents a gap in our understanding of how biological systems respond to ploidy variation.

Epigenetic modifications to chromatin are prime candidates for regulating gene function in haplodiploid taxa. Epigenetic marks are heritable and make fundamental contributions to gene regulation [12]. One of the most important types of epigenetic marks is the methylation of DNA. DNA methylation is found in all three domains of life, suggesting a role in the common ancestor of all metazoa [13,14].

Recently, DNA methylation and histone modifications have been implicated in the regulation of social insect caste differences [15–19]. In addition, global sex chromosome dosage compensation is achieved in Drosophila and mammals by epigenetic mechanisms [20,21], demonstrating that distinct epigenetic states can achieve transcriptional compensation associated with ploidy variation. However, the contributions of epigenetic inheritance to regulatory mechanisms that compensate for ploidy differences in haplodiploids have not been investigated. In this study, we attempted to gain insights into whether epigenetic information was associated with gene regulation in haplodiploid taxa.

In order to assess the epigenetic states of haploid and diploid genomes, we compared single nucleotide resolution DNA methylation profiles (DNA methylomes) of haploid and diploid individuals of the red imported fire ant, Solenopsis invicta. Sex in S. invicta and many other hymenopteran insects is determined by complementary sex determination [3]. Under single-locus complementary sex determination, sex is controlled by zygosity at a single genetic locus. In this case, heterozygous individuals develop into females and hemizygous (haploid) individuals develop into males.

Interestingly, diploid individuals that are homozygous at the sex-determining locus develop into diploid males. Diploid males are generally rare in hymenopteran populations. However, diploid males are produced at high frequency in invasive S. invicta owing to loss of variation at the sex-determining locus [22]. S. invicta diploid males are larger than haploid males, but otherwise have highly similar morphologies and behaviours to haploid males. Moreover, haploid and diploid males differ substantially in phenotype from diploid queens and workers [23,24]. Importantly, the common production of haploid and diploid males makes S. invicta well suited to investigating epigenetic gene regulation in the context of ploidy differences while simultaneously controlling for sex differences.

Our analyses uncovered striking differences in DNA methylation between haploid and diploid individuals in S. invicta. The link between DNA methylation and ploidy variation suggests that haploid and diploid genomes in S. invicta exhibit distinct epigenetic states. These results provide support for the hypothesis that epigenetic mechanisms are associated with genomic dosage compensation of haplodiploid organisms. More broadly, our results suggest that epigenetic information may influence the evolution of ploidy differences among cells, organisms and species.

2. Material and methods

(a) Whole-genome bisulfite-sequencing

Sample collection, DNA extraction, bisulfite conversion, sequencing, quality control and read mapping were performed as described elsewhere [25]. Briefly, all samples were taken from a single S. invicta colony. Male ploidy was confirmed by DNA microsatellite analysis at three to four highly variable loci. Genomic DNA was separately pooled from whole bodies of haploid males, diploid males, alate queens and workers, comprising one sample per caste. We obtained between seven and nine times mean coverage of genomic CpG sites per sample [25]. Solenopsis invicta SI2.2.3 gene models were used for analysis of genes, exons and introns [26]. Solenopsis invicta whole-genome bisulfite-sequencing data are available online from Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo; GSE39959).

(b) DNA methylation targets and levels

Significantly methylated CpG sites were assessed using a binomial test, implemented using the Math::CDF module in Perl, which incorporated deamination rate (from our unmethylated control) as the probability of success, and assigned a significance value to each CpG site related to the number of unconverted reads (putatively methylated Cs) compared with the expected number from control [15]. Resulting p-values were then adjusted for multiple testing [27]. Only sites with false discovery rate (FDR)-corrected binomial p-values <0.01 and ≥3 reads were considered ‘methylated’. Fractional methylation values were calculated as described previously [25] for each CpG site or for each genomic feature (exons and introns).
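
The site-calling procedure described above can be sketched as follows. This is a minimal Python illustration, not the Perl implementation used in the study. It assumes that the multiple-testing adjustment of [27] is the Benjamini-Hochberg procedure and that the ‘≥3 reads’ threshold refers to total read coverage at a site. The names binomial_tail, benjamini_hochberg, call_methylated and error_rate are hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """Upper-tail probability P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def benjamini_hochberg(pvalues):
    """Return FDR-adjusted p-values in the original order (assumed to match [27])."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_methylated(sites, error_rate, alpha=0.01, min_reads=3):
    """sites: list of (unconverted_reads, total_reads) per CpG.

    error_rate is the rate estimated from the unmethylated control and is
    used as the binomial probability of success. Coverage is checked against
    total reads, which is an assumption of this sketch.
    """
    pvals = [binomial_tail(unconv, total, error_rate) for unconv, total in sites]
    qvals = benjamini_hochberg(pvals)
    return [q < alpha and total >= min_reads
            for q, (_, total) in zip(qvals, sites)]
```

For instance, call_methylated([(8, 10), (0, 10), (2, 2)], 0.005) marks only the first site as methylated: the second has no unconverted reads, and the third falls below the coverage threshold.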

(c) Hierarchical clustering and dendrogram generation

The pvclust package in the R statistical computing environment was used to generate clustering and dendrogram diagrams of fractional methylation values of exons and introns [28]. We used the ‘average’ linkage agglomeration method, the ‘correlation’ distance measure and 1000 bootstrap replications. Only those genomic features (exons and introns) targeted by DNA methylation in at least one caste, according to FDR-corrected binomial tests, were included in hierarchical clustering analysis. Fractional DNA methylation values of a given exon or intron in castes that did not exhibit significant DNA methylation were set to zero prior to hierarchical clustering in order to minimize noise contributed by unconverted, putatively unmethylated cytosines.
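
The clustering settings named above (correlation distance, average linkage, zeroing of non-significant features) can be illustrated with a small pure-Python sketch. It is not a substitute for pvclust and omits the bootstrap step entirely. The function names and the toy profiles in the usage note are invented for illustration, and distance is taken to be one minus the Pearson correlation, which is an assumption about how the ‘correlation’ measure is defined.

```python
from itertools import combinations
from statistics import mean


def zero_nonsignificant(levels, is_methylated):
    """Set fractional methylation to zero where a feature was not called methylated."""
    return [v if m else 0.0 for v, m in zip(levels, is_methylated)]


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant inputs assumed)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def average_linkage(profiles):
    """Agglomerative clustering with average linkage on 1 - Pearson r.

    profiles: dict mapping sample name -> list of feature values.
    Returns a list of merges (cluster_1, cluster_2, distance), in merge order.
    """
    leaf_dist = {frozenset((a, b)): 1 - pearson(profiles[a], profiles[b])
                 for a, b in combinations(profiles, 2)}

    def dist(c1, c2):
        # Average linkage: mean of all pairwise leaf distances between clusters.
        return mean(leaf_dist[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((c1, c2, dist(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges
```

With invented profiles such as {"A": [0.9, 0.1, 0.8, 0.2, 0.7], "B": [0.2, 0.8, 0.3, 0.9, 0.1], "C": [0.25, 0.85, 0.3, 0.8, 0.15], "D": [0.2, 0.9, 0.35, 0.85, 0.1]}, sample A joins the tree last because its profile is negatively correlated with the other three.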

(d) Differential DNA methylation

Significantly differentially methylated features (exons and introns) were assessed for each pairwise comparison between castes using generalized linear models (GLMs), implemented in the R statistical computing environment [28], where methylation levels for features were modelled as functions of ‘caste’ and ‘CpG position’. If caste contributed significantly (chi-square test of GLM terms) to the methylation status of a feature (after adjustment for multiple testing using the method of [27]), then it was considered differentially methylated between castes [15]. Only CpG sites that were methylated in one or both castes and covered by at least four reads in both libraries were used in these comparisons, and only features with at least three such CpG sites were considered in further analyses.
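
The inclusion criteria in the preceding paragraph (CpG sites methylated in at least one caste, at least four reads in both libraries, at least three such sites per feature) can be expressed as a simple filter. The sketch below covers only this filtering step, not the GLM itself. The CpGSite record and the function names are hypothetical.

```python
from typing import List, NamedTuple


class CpGSite(NamedTuple):
    """Per-site summary for one pairwise caste comparison (hypothetical layout)."""
    reads_a: int          # read coverage in the first library
    reads_b: int          # read coverage in the second library
    methylated_a: bool    # called methylated in the first caste
    methylated_b: bool    # called methylated in the second caste


def usable_sites(sites: List[CpGSite], min_reads: int = 4) -> List[CpGSite]:
    """Keep sites methylated in one or both castes and covered in both libraries."""
    return [s for s in sites
            if (s.methylated_a or s.methylated_b)
            and s.reads_a >= min_reads
            and s.reads_b >= min_reads]


def feature_is_testable(sites: List[CpGSite], min_sites: int = 3) -> bool:
    """A feature (exon or intron) is analysed only if enough usable sites remain."""
    return len(usable_sites(sites)) >= min_sites
```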

Once exons and introns were assigned differential methylation status using the above GLM, each significantly differentially methylated exon or intron was called as elevated in the caste with higher fractional methylation status of that feature. These features were then combined by gene, and each gene was called as a unidirectional differentially methylated gene (DMG) if more than two-thirds of the gene's differentially methylated features were elevated in the same direction.
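
The gene-level call described above can be sketched as a small function. It is illustrative only, and the function name and input layout are hypothetical. The sketch uses the strict ‘more than two-thirds’ threshold stated in this section; the Results phrase the same rule as ‘at least two-thirds’, which differs only when exactly two-thirds of features agree.

```python
from collections import Counter


def call_directional_dmg(elevated_castes):
    """Return the caste with elevated methylation for a gene, or None.

    elevated_castes: one label per differentially methylated feature
    (exon or intron) of the gene, naming the caste in which that feature
    has the higher fractional methylation.
    """
    if not elevated_castes:
        return None
    caste, count = Counter(elevated_castes).most_common(1)[0]
    # Integer comparison for "count / total > 2/3" avoids floating-point ties.
    if 3 * count > 2 * len(elevated_castes):
        return caste
    return None
```

A gene with three of four differentially methylated features elevated in haploid males is called as elevated in haploid males, whereas a gene with a two-versus-two split returns None.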

(f) Gene expression

Solenopsis invicta whole-body cDNA microarray data [30,31] were mapped to S. invicta gene models as described previously [25]. Expression ratios between queen, worker and haploid male castes [30] were calculated as the ratio of C1 to C2, where C1 is the expression value estimated by BAGEL [32] for the first caste, and C2 is the estimated expression value for the second caste.
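
As an illustration of how such expression ratios could be prepared for analysis, the sketch below forms the quotient of the two caste estimates, applies a log2-transformation and standardizes the result to mean zero and unit variance, as stated in the legend of figure 2. It assumes that the ratio is the simple quotient C1/C2 and that ‘unit variance’ refers to the sample standard deviation. The function name is hypothetical.

```python
from math import log2
from statistics import mean, stdev


def standardized_log2_ratios(c1_values, c2_values):
    """Return z-scored log2(C1/C2) values for paired expression estimates.

    c1_values, c2_values: positive expression estimates for the first and
    second caste, one pair per gene.
    """
    logs = [log2(c1 / c2) for c1, c2 in zip(c1_values, c2_values)]
    mu = mean(logs)
    sd = stdev(logs)  # sample standard deviation (assumption of this sketch)
    return [(x - mu) / sd for x in logs]
```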

For each gene, we assessed the coefficient of variation (standard deviation/mean; CV) of expression values as the mean of CV values calculated separately for whole body S. invicta adult and pupal workers, queens and haploid males (median of five biological replicates per morph) [30].
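
The per-gene variability measure described above, a mean of coefficients of variation computed separately within each group of replicates, can be sketched as follows. The function names and the dictionary layout are hypothetical, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean for one group of replicates."""
    return stdev(values) / mean(values)


def mean_cv(groups):
    """Average the within-group CVs for a single gene.

    groups: dict mapping a group label (for example a morph and life stage)
    to that group's replicate expression values for the gene.
    """
    return mean(coefficient_of_variation(v) for v in groups.values())
```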

(g) Coding sequence evolution

We used OrthoDB [35] 12-insect orthology data to assign single-copy orthologues between the ants S. invicta, Pogonomyrmex barbatus and Linepithema humile. Non-synonymous substitutions per non-synonymous site and synonymous substitutions per synonymous site were determined for the S. invicta lineage using codeml in PAML as described previously [36]. Genes with aligned sequence length ≤100, dS ≥ 4 or dN/dS ≥ 4 were filtered out prior to analysis.
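
The filtering thresholds stated above can be written as a single predicate. This sketch only restates those cut-offs; the function name is hypothetical, and treating genes with dS equal to zero as failing the filter (because dN/dS is then undefined) is an assumption not stated in the text.

```python
def passes_substitution_filters(aligned_length, dn, ds,
                                min_length=100, max_ds=4.0, max_ratio=4.0):
    """Return True if a gene is retained for coding sequence analysis.

    Genes are removed when aligned length <= min_length, dS >= max_ds,
    or dN/dS >= max_ratio.
    """
    if aligned_length <= min_length or ds >= max_ds:
        return False
    if ds == 0:
        # dN/dS is undefined here; excluding the gene is an assumption.
        return False
    return dn / ds < max_ratio
```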

3. Results

(a) DNA methylation is associated with ploidy in Solenopsis invicta

We observed significant differences in methylation level in one or more pairwise comparisons between castes for 3478 exons (32.7% of 10 628 exons methylated in one or more caste) and 577 introns (23.3% of 2479 introns methylated in one or more caste) in S. invicta. Ultimately, we classified any gene with a significant difference in the methylation level of at least one exon or intron in at least one pairwise comparison between castes as a DMG.

We found that DNA methylation levels in all libraries derived from diploid individuals were more similar to one another than to the library derived from haploid males (figure 1a). Diploid males, queens and workers all showed methylation profiles that were highly diverged from haploid males. In particular, the majority of significantly differential methylation occurred between the haploid and diploid castes (figure 1b; electronic supplementary material, figure S1). The pairwise comparison with the greatest number of DMGs was that between haploid and diploid males. This is particularly noteworthy given the high degree of morphological and behavioural similarity between haploid and diploid males in S. invicta [23,24]. The pairwise comparison with the fewest differences was that between queens and workers, both of which are diploid females (figure 1b; electronic supplementary material, figure S1). We note that these findings are unlikely to be the result of bisulfite conversion efficiency, as the queen library exhibited the highest unmethylated cytosine non-conversion rate, and haploid and diploid males had the most similar unmethylated cytosine non-conversion rate among all libraries (electronic supplementary material, table S1).

Figure 1. DNA methylation differs between haploid and diploid castes in S. invicta. (a) Dendrogram produced by hierarchical clustering of fractional methylation levels representing all introns and exons targeted by DNA methylation in at least one library (n = 10 560 genetic features); bootstrap probability values are shown. (b) Number of differentially methylated genes (DMGs) detected between castes. (c) Number of directional DMGs from panel (b) that exhibit pairwise elevated methylation in haploid and diploid castes. (Online version in colour.)

We next defined directional DMGs as those wherein at least two-thirds of differentially methylated features (exons and introns) were more highly methylated in one caste of a given pairwise comparison. For example, if three of four differentially methylated features were more highly methylated in haploid males, then the gene would be categorized as having elevated methylation in haploid males relative to diploid males. By contrast, if two of four differentially methylated features were more highly methylated in haploid males (with the other two more highly methylated in diploid males), then the gene would not be characterized as a directional DMG. Analysis of directional DMGs provided insights into the castes that most frequently exhibited elevated DNA methylation levels. In each comparison between haploid and diploid castes, we observed considerably more DMGs with elevated methylation levels biased to the haploid caste (figure 1c; electronic supplementary material, figure S1).

We conducted enrichment analysis of GO annotations for DMGs relative to methylated non-DMGs. We found that DMGs in S. invicta were enriched for annotations including ‘nucleotide binding’ and ‘developmental process’ (table 1; electronic supplementary material, table S2). By contrast, non-DMGs were enriched for terms related to core cellular functions such as ‘translation’ (table 1; electronic supplementary material, table S3), as is typical of methylated genes in general in S. invicta and other insects [25,37].

We further tested whether there were significant differences between DMGs and non-DMGs in a number of gene characteristics in order to better understand which types of genes are variably methylated. Specifically, we determined whether DMGs and non-DMGs differed in overall DNA methylation level (all castes combined), gene length, gene expression variability among samples as measured by the coefficient of variation [30] and rates of protein coding sequence evolution.

We found that DMGs exhibited substantially lower DNA methylation levels, and were substantially longer in terms of both coding sequence and gene body, than non-DMGs (table 2; p < 0.0001 in each case). DMGs were also modestly, but significantly, more variable in expression, and more highly conserved at the sequence level, than non-DMGs (table 2; p < 0.01 in each case).

We next investigated whether variation in DNA methylation was associated with variation in gene expression among castes. In order to investigate the regulatory significance of differential DNA methylation, we integrated available microarray gene expression data from S. invicta haploid males, diploid queens and diploid workers [30], as well as from a separate comparison of haploid and diploid males [33].

Our analyses revealed that directional DMGs with elevated methylation in haploid castes versus diploid castes were significantly more highly expressed in haploid castes than in diploid castes (figure 2; electronic supplementary material, figure S1). This finding is consistent with the observed association between DNA methylation and active gene expression in insects [25,38,39]. Intriguingly, however, we found no significant association between differential methylation and gene expression bias when examining genes differentially methylated between worker and queen castes (both diploid; figure 2d).

Figure 2. Gene expression bias is associated with directional differentially methylated genes (DMGs) in S. invicta. DMGs that exhibit elevated methylation in haploid males are more highly expressed in haploid males, whereas DMGs that exhibit elevated methylation in diploid (a) males, (b) queens or (c) workers are more highly expressed in diploid males, queens or workers, respectively. By contrast, there is no significant difference between the ratio of expression for DMGs that exhibit elevated methylation in (d) a pairwise comparison of queens and workers. Expression ratio data were standardized (mean zero, unit variance) following log2-transformation; p-values denote the results of Mann–Whitney U-tests. (Online version in colour.)

Finally, we determined whether directional DMGs between males of different ploidy were enriched for distinct GO annotations. Our goal with this analysis was to determine whether elevated methylation in haploid males, which may reflect an epigenetic state associated with haploid gene upregulation (figure 2a–c), was targeted to genes associated with distinct functions, when compared with other DMGs.

We found that genes with elevated methylation in haploid males relative to diploid males were enriched for several metabolic process terms, as well as the terms ‘nucleotide binding’ and ‘chromosome’ (electronic supplementary material, table S4). By contrast, there were no significantly enriched terms below the FDR cut-off (p < 0.05) for genes with elevated methylation in diploid males relative to haploid males. Nevertheless, several terms related to growth and development, including ‘developmental process’, were enriched among genes with elevated methylation in diploid males prior to FDR correction (p < 0.05; electronic supplementary material, table S5). Together, these data suggest a marked difference between the gene classes that exhibit elevated methylation in haploid and diploid males. Elevated methylation in haploid males appears to preferentially target genes associated with basal cellular processes, whereas elevated methylation in diploid males may be associated with a larger number of genes implicated in development.

We assessed patterns of DNA methylation and gene expression for S. invicta orthologues of genes associated with dosage compensation in Drosophila. Our goal was to provide initial insights into whether common molecular machinery may underlie dosage compensation for sex chromosomes in Drosophila and regulatory compensation for haploidy versus diploidy in S. invicta. Interestingly, we found that orthologues of four of eight genes (with data) related to dosage compensation in D. melanogaster were differentially methylated between haploid and diploid males in S. invicta (electronic supplementary material, table S6 and figure S2). Moreover, three of four of these genes (with data) were differentially expressed between haploids and diploids (electronic supplementary material, table S6). Thus, several genes involved in Drosophila dosage compensation are differentially methylated and differentially expressed between haploids and diploids in S. invicta.

4. Discussion

The purpose of this investigation was to gain a greater understanding of the molecular mechanisms that regulate gene function among individuals that differ in ploidy. Our analysis of DNA methylation patterns among S. invicta castes uncovered strong associations between levels of DNA methylation and ploidy. These methylation differences were further found to be related to gene expression differences among castes.

We found most DMGs arose between haploid and diploid castes, and therefore that the number of DMGs was not related to the overall morphological similarity of the castes being compared (figure 1). In particular, our comparison of haploid and diploid males produced more DMGs than our comparison of haploid males and diploid queens, which are sexually dimorphic, and produced many more DMGs than were observed between diploid queens and diploid workers, which are a classical example of insect polyphenism. Thus, in S. invicta, differences in DNA methylation more closely track differences in ploidy than differences in morphology, behaviour or physiology associated with distinct queen and worker castes [40].

The DNA methylomes of haploid males and diploid females were sequenced previously in the ants Camponotus floridanus and Harpegnathos saltator [18]. When we assessed directional DMGs between adult castes of C. floridanus and H. saltator, we found that, in four of six comparisons between haploid and diploid castes, more DMGs were elevated in haploids than in diploids (three of three comparisons in H. saltator and one of three comparisons in C. floridanus; electronic supplementary material, figure S3). Thus, the data of Bonasio et al. [18] further suggest that haploids may be prone to elevated DNA methylation relative to diploids.

Intriguingly, we found that DMGs, as a whole, exhibited several distinguishing characteristics in S. invicta. DMGs were enriched relative to non-DMGs for the GO annotations ‘nucleotide binding’ and ‘developmental process’ (table 1), consistent with important regulatory roles for differential DNA methylation in S. invicta, as in the honeybee [15,16,19]. Furthermore, DMGs differed significantly from other methylated genes in methylation level, gene length, expression variability and substitution rate (table 2), suggesting key architectural and regulatory differences between DMGs and non-DMGs.

In S. invicta, differential methylation events were also associated with ploidy-specific gene expression bias (figure 2), suggesting that DMGs are associated with regulatory differences between haploid and diploid genomes. Interestingly, the association of intragenic DNA methylation with active gene expression in insects suggests DNA methylation may be a useful marker of active chromatin states [25,37,38]. In support of this idea, the presence of DNA methylation has recently been linked to the presence of several active histone modifications in insect genomes [25,41]. We speculate that elevated haploid DNA methylation may be indicative of regulatory pressures associated with the single-copy state of haploid loci.

Notably, our data cannot directly address whether changes in DNA methylation are the cause or consequence of changes in gene expression. However, experimental investigations in model systems indicate that DNA methylation can cause changes in gene function through interactions with other components of chromatin. For example, DNA methylation has been shown to affect alternative splicing through its interaction with RNA polymerase II [42]. In addition, DNA methylation has been shown to alter the positioning of certain histone variants, which ultimately influence gene expression [43]. Experimental changes in levels of DNA methylation have also been found to lead to changes in levels of gene expression in Arabidopsis [43,44], suggesting that intragenic methylation has functional effects.

The suggestion that epigenetic gene regulation plays a role in genome-wide chromosomal dosage compensation is consistent with the observation that epigenetic marks play key roles in sex chromosome dosage compensation [20,21,45]. Intriguingly, we found that S. invicta orthologues of several genes implicated in D. melanogaster dosage compensation were differentially regulated between haploid and diploid castes (electronic supplementary material, table S6 and figure S2), raising the prospect of some degree of molecular convergence. Although the genome of D. melanogaster is not substantially methylated, previous studies have revealed that, in species that harbour functional DNA methylation systems, DNA methylation interacts with histone modifications associated with dosage compensation in D. melanogaster [25,41]. Regardless, we note that the mechanisms by which intragenic methylation affects gene function remain poorly understood [39], and direct connections between mechanisms of sex chromosome dosage compensation and ploidy compensation remain speculative at present.

Given the evidence for different epigenetic states in haploid and diploid S. invicta, it is important to consider why one may expect different regulatory requirements for genes in haploid genomes when compared with diploid genomes. For example, there may be increased metabolic requirements placed on loci in haploid, relative to diploid, genomes [1]. Our results agree with this notion, as several metabolic process GO annotations were enriched among genes with elevated DNA methylation in haploid males (electronic supplementary material, table S4). One additional reason for epigenetic states to differ between haploid and diploid genomes may be related to the amelioration of haploid gene expression noise, particularly at genes essential to cellular function. Indeed, gene expression variability is negatively associated with dosage in yeast, where diploid cells exhibit less expression variability than haploid cells [46], and where overall gene expression variability can lower organismal fitness [47]. We previously found that DNA methylation is negatively associated with the coefficient of variation of gene expression among replicate S. invicta samples [25], potentially implying a role for DNA methylation in the stabilization of gene expression [48]. We speculate that, if DNA methylation plays a role in reducing gene expression stochasticity [48], the variable expression of haploid loci may itself provide an impetus for elevated levels of DNA methylation in haploid males.

Overall, our results suggest that epigenetic mechanisms are associated with regulatory response to global differences in dosage in haplodiploid hymenopterans. However, we must emphasize that these results are preliminary in nature, requiring additional study to resolve whether epigenetic information is functionally implicated in ploidy-associated regulatory compensation. One important consideration is that haploid males in Hymenoptera are known to compensate for lower genomic content relative to diploid females through endoreplication [6,9], wherein cells increase their genomic content without dividing [1]. Our results raise the possibility that epigenetic information similarly contributes to haploid regulatory compensation, particularly given that endoreplication is not ubiquitous among tissues [6]. An alternative, but presently unexplored, possibility is that endoreplication itself is associated with epigenetic changes.

We have shown that differential DNA methylation is more closely linked to ploidy variation than to queen and worker castes in the fire ant S. invicta. We observed elevated DNA methylation in haploids, and a positive association between ploidy-biased DNA methylation and gene expression, which together demonstrate the existence of distinct epigenetic states for haploid and diploid genomes. Overall, our results highlight the prospect that epigenetic mechanisms may be involved in achieving ploidy compensation in haplodiploid taxa.

Funding statement

This work was supported by the US National Science Foundation (grant nos. DEB-1011349, DEB-0640690, IOS-0821130 and MCB-0950896) and the Georgia Tech–Elizabeth Smithgall Watts endowment.