Figures

Abstract

In plants, relationships between resistance to herbivorous insect pests and growth are typically controlled by complex interactions between genetically correlated traits. These relationships often result in tradeoffs in phenotypic expression. In this study we used genetical genomics to elucidate genetic relationships between tree growth and resistance to white pine terminal weevil (Pissodes strobi Peck.) in a pedigree population of interior spruce (Picea glauca, P. engelmannii and their hybrids) that was growing at Vernon, B.C. and segregating for weevil resistance. Genetical genomics uses genetic perturbations caused by allelic segregation in pedigrees to co-locate quantitative trait loci (QTLs) for gene expression and quantitative traits. Bark tissue of apical leaders from 188 trees was assayed for gene expression using a 21.8K spruce EST-spotted microarray; the same individuals were genotyped for 384 SNP markers for the genetic map. Many of the expression QTLs (eQTL) co-localized with resistance trait QTLs. For a composite resistance phenotype of six attack and oviposition traits, 149 positional candidate genes were identified. Resistance and growth QTLs also overlapped with eQTL hotspots along the genome suggesting that: 1) genetic pleiotropy of resistance and growth traits in interior spruce was substantial, and 2) master regulatory genes were important for weevil resistance in spruce. These results will enable future work on functional genetic studies of insect resistance in spruce, and provide valuable information about candidate genes for genetic improvement of spruce.

Funding: This work was carried out with financial support from Genome British Columbia and Genome Canada. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Plants are sessile organisms that have evolved many resistance mechanisms to defend against insect pests. These resistance mechanisms are genetically complex and involve interactions between both host and pests [1], [2]. Recently, a “cost-benefit” paradigm for resistance has emerged to enhance our understanding of these interactions [3]–[7]. This paradigm suggests that tradeoffs in the cost-benefit paradigm may be due to correlated selection (favored trait combinations) and spatio-temporal heterogeneity of the environment [8]. Theoretical approaches have also described the importance of resource allocation within biosynthetic pathways for the evolution of resistance [9].

Meta-analyses of plant-herbivore defenses suggest that trade-offs exist between constitutive and induced defenses. More competitive species tend to exhibit lower constitutive and higher induced resistance than less competitive species ([10], [11]). It has also been hypothesized that both constitutive and induced resistance are influenced by selection on traits that alter plant growth rates [12]. In spruce (Picea spp.), constitutive and induced defenses are thought to follow sequentially [13]. Strength and rapidity of traumatic resinosis have often been related to resistance. Nevertheless, Alfaro [13] suggested that in response to wounding, some resistant trees failed to produce the traumatic response and some susceptible trees responded with an unexpectedly intensified response.

Phenotypic and genetic relationships between growth and resistance to white pine terminal weevil (Pissodes strobi Peck.) have been intensively studied in interior spruce (Picea glauca [Moench] Voss, P. engelmannii Parry and their hybrids); however, results have been inconsistent and seemingly contradictory. Kiss and Yanchuk [14] found a negative genetic correlation between mean family height and weevil damage in interior spruce, while King et al. [15] reported a positive phenotypic but a strong negative genetic correlation between attack level and leader height. Alfaro et al. [16] reported better developed bark resin canals in fast-growing trees, while Lieutier et al. [17] concluded that there is no relationship between tree growth and resistance in Norway spruce (Picea abies. ((L.) Karst.). Vandersar and Borden [18] suggested that weevils prefer faster growing trees, and more recently He and Alfaro [19] found a higher survival time for shorter trees. In Sitka spruce (Picea sitchensis (Bong.) Carr.), genetic resistance was most pronounced in families with average height growth [20]. This finding is interesting, since improved growth rate in spruce trees has lead to a higher predisposition to weevil attacks [21].

Trade-offs between correlated traits may be due to genetic and/or phenotypic variation. At the least, genetically correlated traits share quantitative trait loci (QTLs). However, pleiotropic genes, which control the hubs in such trade-offs, are difficult to distinguish from the confounding physically linked loci within a shared QTL. Moreover, biometric correlations and QTLs do not always concur because of the presence of obscuring antagonistic QTLs [22]. This failure to detect significant correlations may indicate the extent of independent variation between two traits and not necessarily the absence of a tradeoff. Other interacting factors can remain undetected [8]. In other species, pleiotropy and genetic correlations may be present. Examples include dehydration avoidance and flowering time in Arabidopsis thaliana[23], resistance and tolerance to herbivory in the common morning glory [24], and growth rate and flowering in A. thaliana[25]. While A. thaliana has facilitated research on tradeoffs for life-history traits in annual plants with short life cycles, research on long-lived forest trees promises new perspectives on molecular mechanisms of life-history control in non-model species [26].

In this paper, we present results on the use of genetical genomics to investigate a question of fundamental importance in plant genomics: How do genes underlying a pathway to pest resistance concertedly function? We investigated growth and insect resistance as a trait pair that defines the life history of interior spruce, a commercially valuable and ecologically important coniferous tree species. We show how genetical genomics can provide a fine-scale analysis of the genetic architecture in the study of pest resistance cost-benefit tradeoffs. Genetical genomics assays thousands of traits (gene expression levels) [27] and these “expression phenotypes” are subjected to standard QTL analysis. We use genetical genomics to infer the nature of resistance of interior spruce to the white pine weevil. We infer expression QTLs (eQTL) in segregating crosses of interior spruce, with variable resistance to white pine weevil. In this analysis, the positions of eQTLs indicate regions that harbor regulatory elements that control expression of genes in the same pathway. In the case of cis-regulation, the genomic location of the eQTL coincides with the physical location of the regulated gene, while trans-acting eQTLs identify regulatory elements for the gene elsewhere in the genome. The distribution of eQTLs may spread evenly on the genome or may appear in clusters or “hotspots”, depending on the genetic architecture of these gene interactions [28]. At the QTL level, the action of a gene might suggest pleiotropy because multiple traits are affected. We search for pleiotropy based on common candidate genes between resistance and growth. Furthermore, a positive correlation between growth rate, and attack and oviposition rate (our resistance measure) might indicate a tradeoff between growth and defenses. This tradeoff could be due to the increased carbon cost required for higher defense chemical levels. We also search for master regulons that underlie trans-eQTL hotspots (“hubs”) which tend to be at the center of gene expression networks (network eQTLs).

Results

Correlations between Phenotypic Estimates

The phenotypic response data consisted of tree height measurements and weevil attack and oviposition counts. Tree height measures were taken at the time of planting in 1995 and at three and five growing seasons thereafter. Leader length was measured in year five preceding the artificial augmentation of a local weevil population in October of the same year (hgt_1995, hgt_1997, hgt_1999, and ldr_99, respectively). Weevil attack rates counted in 2000 and 2001 were classified as successful top kills, failure to kill the leader, and no attack (atk_2000, atk_2001). In the same years, oviposition was assessed (egg_2000, egg_2001) and egg counts along the leaders were summarized into five discrete classes. The sum of weevil attacks and the sum of oviposition for 2000 and 2001 were also used as response traits (sum_atk, and sum_egg).

Forty-five pairwise phenotypic correlations were estimated for individual and sum traits (Table 1). In general, correlations between egg counts and attack classification were strong, between initial height (hgt 1995) and later growth correlation was weak, and between hgt_1999, ldr_99 and hgt_1997 correlations were strong. Correlations between growth (ldr 99 and hgt 1999) and attack (resistance) traits (atk_2000, egg_2000, sum_atk and sum egg) were generally positive (Table 1). Collectively, these results suggest a negative relationship between growth and the actual pest resistance (fewer attacks).

QTL Mapping

The spruce mapping population was genotyped for 384 SNP loci. These SNPs had been detected within expressed sequence tag (EST) contigs that represented assemblies of short expressed sequences with predicted open reading frames. These ESTs originated from the spruce Treenomix EST database (K. Ritland, personal communication). The genotypic information was used to estimate pairwise recombination rates between SNP loci and construct a framework genetic linkage map to localize quantitative trait loci (QTLs). Details about the genetic linkage map and the annotation of the mapped contigs can be found in the supplement material of [29]. The phenotypic variations that were obtained from four tree height, three weevil attack, three oviposition (see above), and extensive quantitative gene expression (21,840 transcripts) measures were mapped to the established genetic linkage map of the factorial progeny. The QTLs were mapped by using a likelihood function to assess the phenotype effect conditioned on the genotypic variation. A (e)QTL was significant with a LOD ≥3.84. In total, we identified 132,100 significant eQTLs (see File S1 and legends in Table S1).

For each SNP locus along the genetic linkage map, we superimposed the mapped phenotypic trait QTLs (pQTLs) with the counts of significant eQTLs (Figure 1), and identified hubs of trait variation at several SNP loci that comprised of multiple pQTLs and an extensive accumulation of eQTLs. A goodness-of-fit test that assumed a uniform distribution was performed to test whether the observed frequencies of eQTLs along the linkage map differed significantly from the expected value. Then we used all detected eQTLs and all marker loci (see above) in a randomization procedure to assess the maximum number of eQTLs within eQTL clusters. According to this randomly generated data set, “eQTL hotspots” would be declared if the number of eQTLs at a given locus exceeded 630. However, we arbitrarily used a cutoff value of 786. This number was simply the value where the expected average value was exceeded by 50%. In this way, we focused on fewer hubs of trans-eQTLs that are associated with important regulators of quantitative trait variation. Fourteen loci coincided with eQTL hubs and with at least three pQTLs (Figure 1).

Hotspots of Phenotypic Trait Variation

From these 14 loci, seven loci were associated exclusively with resistance pQTLs, three loci with growth pQTLs, while four loci with pQTLs from individual traits of both growth and resistance, respectively (Figure 1). The composition of two trans-eQTL hotspots with extensive resistance pQTL overlap was analyzed in more detail (see below and Figure 2, and Figure 3; Table S2, and Table S3).

The two resistance trait associated eQTL hotspots on LG6 were localized at two annotated bona fide CCoAOMT genes that are separated by only 1.5 centiMorgan map distance, Figure 1. In a previous study that focused on spruce gene families from the phenylpropanoid pathway, we identified the same region enriched for trans-eQTLs [29]. Both CCoAOMT genes also have cis-eQTLs that might represent promoter polymorphisms regulating the differential gene expression in those two loci. At the two different CCoAOMT loci a common set of pQTLs as well as eQTLs clustered (53% and 36% of their accumulated trans-eQTLs were generated by the same transcripts, respectively). We compared gene annotations to the 8366 unique Arabidopsis annotations from the entire microarray. For both loci the categories ‘structural molecule activity’, ‘secondary metabolic process’, ‘ribosome’, ‘response to biotic stimulus’, and ‘cytosol’ were overrepresented. A Fisher’s exact test was employed to assess significance. Details about the overrepresented ‘GOSlim Plant’ categories within the two eQTL hotspots can be found in Figure 2 & Figure 3. Five jasmonate-ZIM-domain protein (JAZ) genes (WS0105_K14, WS00918_B02, WS0063_E19, WS00918_P17, and WS00919_H21) contributed with eQTLs to this hotspot region, and expression variation for three JAZ genes mapped to both CCoAOMT loci (WS00918_B02, WS00918_P17, and WS00919_H21). Three JAZ genes (WS00919_H21, WS0063_E19, WS00918_P17) were candidate genes directly associated with phenotypic variation for ht_1999, atk_2000 and atk_2001 traits, respectively (Table S2, Table S3, and Table S4). Transcripts from three putative carbonic anhydrase genes (WS00110_A15, WS00928_K21, and WS00936_A24) mapped trans-eQTLs to both hotspot locations.

The cluster of three phenotype associated eQTL hotspots on LG13 deserves further attention (Figure 1), yet we were unable to relate this group of markers to any of the 12 major linkage groups. This might be due to a limited number of segregating markers. At two SNP markers on LG13, SNP250 and SNP251 (WS0021_L13_301 and Contig_2062_390, [29]), related to sequences of glutamate decarboxylase (GAD) genes, significant growth variation and extensive expression variation co-localized.

On four spruce linkage groups (LG4, LG6, LG11 and LG13) hotspots of expression variation associated with QTLs from both growth and resistance traits (Figure 1). The pQTLs from resistance traits explained a higher % variation of the trait variation than pQTLs from growth traits. The largest pQTL at any of these eQTL hotspots was identified for sum_egg at SNP44 (on LG 3) and explained 10.3% of the trait variation. Also, for traits atk_2000, sum_atk, egg_2000 as well as for egg_2001 pQTLs explaining a higher portion of phenotypic variance (4.6–7.5%) mapped to this locus. Along the entire linkage map, most eQTLs mapped to this locus (2040 transcripts). At two loci, SNP74 (LG4) and SNP252 (LG13), the pQTLs from the same five traits (hgt_1997, hgt_1999, ldr_99, atk_2000 and egg_2000) were associated with the individual eQTL hotspots. On LG6, pQTLs from ldr_99, atk_2000 and sum_egg, while on LG11, pQTLs from hgt_1999, ldr_99, atk_2000, sum_atk and egg_2000 associated with the eQTL hotspot. At SNP210 related to ubiquitin conjugating enzyme 2 (LG11), the identified eQTL hotspot was associated with three resistance and two growth traits. In all four cases of eQTL hotspots where several pQTLs for growth and resistance traits collocated, the allelic effects of the pQTls for growth and resistance traits had the same sign. This indicates a positive correlation between the growth trait variation (as assessed by height data) and the resistance trait variation (from attack rates and oviposition data).

Positional Candidate Genes for Resistance and Growth Trait Variation

The combination of phenotype and gene expression datasets facilitates studying the genetic control of phenotypic traits of interest [30]. “Positional” candidate genes can be identified as genes for which transcript abundance correlates extensively with the quantitative phenotype.

Here, the positional candidate genes were identified by collocation of at least 40% of their eQTLs with pQTLs based on the criteria for identifying significant QTLs and running 10,000 randomizations (p ≤ 0.05), see Material and Methods. Thus, extensive co-segregation of transcript variation with the phenotypic trait of interest identifies positional candidate genes that are directly underlying the trait. We arbitrarily defined map intervals of 10 cM to measure collocation (on average 2 SNPs were binned into 10 cM). Thus, within a resolution window of 10 cM at a given SNP locus we determined the significance of co-localizations between the expression variation that was detected at a certain EST that was spotted on the microarray (gene spot) and the phenotypic trait variation (p ≤ 0.05). We screened 21,840 transcripts that represented distinct ESTs spotted on the microarray for co-localization with growth traits and for co-localization with resistance traits, respectively, and identified 1621 and 2002 distinct ESTs, respectively. These numbers comprised of the following trait associations: ldr_99 (217 gene spots), hgt_1995 (254), hgt_1997 (385), hgt_1999 (878); atk_2000 (346), atk_2001 (311), sum_atk (546), egg_2000 (361), egg_2001 (335), sum_egg (584), respectively (Table S4). Not unusual for conifers, many of those spruce genes had no Arabidopsis homologues. In total, 1191 gene spots from co-localizations with resistance traits gave hits with unique Arabidopsis entries (59%); 1000 gene spots from co-localizations with growth traits had unique annotations (62%).

We compared annotations independently for resistance and growth with the unique Arabidopsis annotations from the array. Twelve, and eight GOSlim Plants categories, respectively, were overrepresented among genes with expression variation significantly associated with individual phenotypic traits. The categories for resistance trait associations involved: ‘response to stress’ (64 compared to a total of 329 genes on the array), ‘response to abiotic stimulus’ (54/272), ‘cellular component organization and biogenesis’ (94/479) and ‘cytosol’ (32/145) as well as ‘binding’ (298/1908); Figure 4 and Table S5. Growth trait categories were related to ‘extracellular region’ (8/32), ‘signal transducer activity’ (15/78), ‘cell communication’ (38/204), ‘response to stress’ (51/329) categories and the ‘cell wall’ category (19/63); Figure 5 and Table S5.

In the gene lists a large number of kinases, phosphatases as well as transcription factors were identified; Table S4. Phosphorylation and dephosphorylation are important steps in various biosynthetic processes, and in signal transduction cascades within the organism. For resistance traits we counted 104, for growth traits 91 transcription (-related) factors/proteins. We identified 49 kinases associated with the growth traits, whereas 63 with the resistance traits. Phosphatases totaled to 22 candidate genes both for growth and for resistance traits, respectively.

A number of auxin-related genes (including ABC transporters) co-localized with phenotypic trait variation: for resistance we found 19, for growth 15 co-localizing. A number of genes involved in embryo arrest/deficiency also co-localized with the phenotypes: 19 for resistance and 22 for growth traits. Jasmonic acid (JA)-forming and ethylene-forming/−responsive genes were also identified candidate genes based on collocations with the respective phenotypic traits.

The isoprenoid biosynthesis pathway generates many compounds relevant to plant defenses (terpenoids, tocopherol, e.g.) but also the precursors of ‘plant hormones’ like gibberellins (GA) and abscisic acid (ABA). Eight biosynthetic genes co-localized with resistance traits and six exclusively with the hgt_1999 trait. In addition, both for growth and resistance traits, four GA-regulated/GA signaling-regulating, and two ABA-related proteins were identified as potential candidates.

The phenylpropanoid metabolic pathway provides various specialized metabolites important in plant development, polymeric lignin for structural support, anthocyanins for pigmentation, flavonoids with various protective functions, and antimicrobial phytoalexins [31]. Thirty-seven putative gene family members associated with individual resistance traits, while 25 with growth traits. For the traits related to egg counts (26 transcripts) and to height in 1999 (15 transcripts) the highest number of candidates from this pathway was identified.

Genes that are significant for both resistance and growth traits (p ≤ 0.05; 10,000 randomizations, see Materials and Methods) are summarized in Table S6 (see Table S4 for comprehensive results). In sum, 244 genes were identified. The majority of these genes co-localized with Ht 1999, and also co-localized with atk_2000, egg_2000, respectively, resistance traits assessed in the next growth season. Since many identified genes have no angiosperm counterparts, they likely represent novel conifer-specific genes at the pivotal points of growth variation and defense. For one DNAJ heat shock family protein its eQTLs significantly co-localized with QTLs from as much as seven individual traits (Table S6). The functions of these chaperones are related to environmental challenges (involving stress tolerance) but are manifold due to the complexity of the whole gene family. Among the candidates in common between resistance and growth traits were genes involved in normal/optimal plant growth, stress signaling, defense, stress tolerance, glycine betaine synthesis, DNA repair, transcription regulation, post-transcriptional regulations, protein degradation as well as expansins with a proposed function related to cell wall architecture during rapid tissue expansion (Table S6).

Significant associations with all four individual growth and six individual resistance traits, respectively, allowed us to robustly identify 149, and 99 candidate genes for the composite ‘resistance’ and ‘growth’ phenotype, respectively (Table S7 and Table S8). Gene identities with annotations are provided in Table 2 and Table 3. For about half of the gene spots identified as candidates no putative functions were unraveled by BLAST searches. Candidate genes for ‘resistance’ and ‘growth’ differ markedly in their functions. While collocations of expression with growth variation were predominantly found for gene products involved in (post-) transcription and post translational regulation (in total 18), collocations with resistance variation identified a larger number of biosynthetic proteins (17), signaling (20) and transporter/transport related molecules (13). We found that 19 genes are positional candidates for the composite ‘resistance’ phenotype, but are additionally associated with other growth trait(s); these genes are typically involved in signaling, transport and biosynthesis related processes. The 29 genes that are positional candidates for the composite ‘growth’ phenotype and at the same time associated with individual resistance traits have proposed functions in transcriptional or (post-) translational control, growth and cell wall remodeling.

Correlations between Co-localization Estimates

We determined genetic (QTL) correlations based on co-localization estimates between gene expression and trait variation, Table 1. The correlation between the general ‘growth’ and ’resistance’ trait based on associated expression variation of transcripts was significant (R = 0.251). However, the positional candidates for the general ‘growth’ and ‘resistance’ phenotypes were distinct. Co-localization estimates for hgt_1997, ldr_99 and hgt_1999, respectively, correlated with co-localization estimates for atk_2000, egg_2000, sum_atk as well as sum_egg traits. This means that a significant fraction of their eQTLs co-localize with both growth and resistance QTLs. Overall, 12% of the genes that were positional candidates for individual height growth traits were also positional candidates for individual resistance traits, and 15% of the genes that were positional candidates for resistance traits were also positional candidates for height growth traits.

Discussion

Our work on spruce weevil resistance follows similar work on eucalyptus [32] and poplar ([33], [34]). Ours is the first study of expression QTLs for resistance in a conifer. Despite the enormous genome size of conifers (ca. 20 billion base pairs) studies on the transcriptome of conifers are just as manageable as those with angiosperms with much smaller genome sizes. Here we utilized a third generation microarray spotted with 21,840 spruce ESTs in combination with a multiplexed genotyping approach to examine expression QTLs involved with weevil resistance and height growth. This work represents an extended study of [29] that previously focused in detail on the phenylpropanoid pathway and related genes with respect to pest resistance in spruce.

In our experiment, we harvested plant material in spring at the optimal time point at which early seasonal growth and natural onset of weevil attacks coincided. Our microarray consisted of 70% cDNA from untreated tissue of many tissue types; no overrepresentation of specific metabolic pathways was attempted. The issue of cross-hybridizations in microarray experiments due to high nucleotide similarity [35], is reduced in genetical genomics because of the randomized genetic background, high sample size and the statistical procedures. Cross-species comparisons of QTL are also possible based on the white spruce/loblolly pine comparative mapping project (K. Ritland, pers. comm.). Linkage groups one to twelve (LG1-12) were assigned following [36] to facilitate comparisons within the family of Pinaceae.

The main focus of the present work was scanning the genome for transcripts whose abundance correlated with the quantitative phenotype in order to identify transcripts associated with the phenotype ([37]). We assigned groups of genes that significantly associated with individual resistance and growth traits, respectively, into functional, cellular component, and biological process GO categories. These genes were co-regulated and likely have combined functions in the studied phenotypes. Thus, the present work elucidates functional associations among genes and provides a comprehensive study to the evolution of transcription regulation in spruce. Overall we found that a significant fraction of eQTLs were in common between the general ‘growth’ and ‘resistance’ phenotypes. This result was based on expression variation from all studied transcripts. This provides evidence for genetic pleiotropy of resistance and growth traits in interior spruce. In terms of directionality of gene expression with phenotypic trait variation, we found that the mean of correlations between transcript expression and quantitative traits was zero, however overall we found a wide distribution of correlations where some genes showed a clear positive, whereas others showed a clear negative correlation with traits (K. Ritland, personal communication).

Contribution from Single Gene QTL

From the myriad of candidate genes that were identified for the individual resistance traits (Table S4; Figure 4), our study also identified an array of single genes that were associated with both resistance and growth phenotype (Table S6 and Table S4). The identification of shared candidates suggests that several of the general functionalities (notably, the signaling systems, [38]) important to normal plant development are also adopted for defense mechanisms. Among those ‘pleiotropic’ genes many had functionalities that prevalently involve signaling, transcription factor activity, functions in transcription/translation (including RNA editing), stress/stimulus response, as well as transport and cell wall functions. Transcription regulators have been suggested to be key targets for plant evolution ([39], [40]). Their high representation in our gene lists reinforces their broad importance to the sessile organism’s potential to optimize growth under the given environmental conditions.

Multigene Family QTL Contributions

Large multi-gene families were represented in the associations with both growth and resistance traits. These involved GDSL-motif lipase/hydrolase, GHs, LRR proteins, oxidoreductases, PPR proteins, disease resistance proteins, and DNAJ heat shock family proteins. While GDSL, LRR, PPR and DNAJ proteins were equally represented among growth and resistance candidate genes, respectively, GH, oxidoreductases, and disease resistance proteins contributed twice as many resistance candidates than growth candidates. From this diverse group of disease resistance genes, two spruce sequences with similarity to f-family dirigent proteins that are implicated in constitutive resistance [41] were identified as candidates for weevil resistance. Several individual members were directly associated with both phenotypes and hence pleiotropic (one member in each case for GDSL-motif lipase/hydrolase, disease resistance proteins, and DNAJ heat shock family proteins; two each for GHs, and oxidoreductases; four each for LRR proteins, and PPR proteins). The spruce DNAJ heat shock protein co-localized with expression variation of seven individual traits (both resistance and growth), Table S6. Most of these candidates have previously been reported to be involved in defense reactions, some with proposed antifungal properties [42]–[47].

Dominant Themes among Gene Functions Associated with the Resistance Phenotype

The statistically overrepresented ontology categories among genes that were associated with resistance phenotypic traits revealed dominant themes in gene expression that involved response to all sorts of stimuli (biotic, abiotic, external, and endogenous), epigenetic gene expression, and translation (ribosome) for the de novo generation of gene products. Additionally, remodeling processes that involve the (re-) organization of cellular components, and anatomical structures by proteins that provide binding functions and other structural molecule activities are important components of defense (Figure 4). Interestingly, signaling is the most common molecular function and cellular process among growth phenotype associated genes (Figure 5). For genes associated with the composite weevil resistance phenotype, functions in signaling were also predominant (Table 2). There is evidence that signaling pathways in defense reactions are co-opted from normal developmental processes, see also above. Genes that have signaling functions and are negative regulators of ABA responsiveness [48] were identified (phosphatase 2C protein for resistance, while ABI-1-LIKE 1 for growth, Table 2 and Table 3). It is assumed that these gene functions allow for the fine-tuning of stress responses. We could also identify an ATP synthase (Table 2). Recently, it was shown that the initiation of multiple defense elicitors in the host is triggered by herbivore proteolysis of a plant ATP synthase [49].

Biosynthesis is also an important function for genes associated with the composite resistance phenotype (Table 2). Two genes annotated as mannitol dehydrogenase were identified. Mannitol dehydrogenase counteracts the fungal suppression of the reactive oxygen species that are generated during host defenses [50]. Furthermore, several genes linked to the general phenylpropanoid pathway, ([32], [31], [51]) and to flavonoid and isoflavonoid biosynthesis [52] were positional candidates for the general resistance trait.

The observed eQTLs were the result of constitutive differences in gene expression. The importance of polyphenolics for constitutive defenses is reflected in a higher number of gene family members of the phenylpropanoid pathway (by homology to A. thaliana genes) whose eQTL co-localized with resistance QTLs. In this work, seven genes related to phenolics or flavonoid biosynthesis were positional candidates for the general resistance trait (Table 2). In contrast, no candidate gene for the resistance trait per se could be identified from the terpenoid pathway. Hence, we feel that this might reflect the higher importance of the phenolics over the terpenoid pathway in established resistance against this herbivore. We have previously suggested that monolignol formation may play an important role in defense reactions against the stem borer Pissodes strobi[29]. Based on in-depth analysis of genes involved in the shikimate pathway, monolignol biosynthesis and downstream condensation reactions as well as lignan formation with respect to weevil resistance, we further conclude that gene family members that were duplicated in spruce may have acquired temporally and spatially diverse functions in defense [29].

Trans-eQTL Hotspots and their Significance

Certain phenotypes may be affected by gene expression regulators located within eQTL hotspots [30]. Although several previously conducted eQTL studies suggest that cis-eQTLs might have a greater effect on the phenotype than trans-eQTLs [53], trans-eQTLs are important for our understanding of the complexity of phenotypes [54]. For example, by comparing the levels of trans-eQTLs for each gene the global regulatory hierarchy can be assessed [53]. While cis-eQTLs are physically linked to the causative locus of the phenotype, trans-eQTLs can identify many downstream genes and reveal unknown pathways. In our study, we were mainly limited to the detection of trans-eQTLs, since the majority of SNP loci on our genetic linkage map could not be annotated [29]. This limitation is due to the fact that we were working with a non-model species and in particular with a conifer of immense genome size for which the genome sequencing has yet to be completed (http://www.congenie.org/).

Several trait-associated SNPs that were enriched for trans-eQTLs were identified in our study. At seven map positions, hotspots of expression variation coincided with QTLs from multiple resistance traits (LG3, LG4, LG6 and LG8). At eight SNP positions at least four pQTLs overlapped with eQTL hotspots. On four spruce linkage groups (LG4, LG6, LG11 and LG13) hotspots of expression variation associated with QTLs from both growth and resistance traits (Figure 1). This indicates gene expression regulators[30].

For example, on LG13 the two SNP loci underlying extensive expression and growth variation (i.e. large number of mapped QTLs) are derived from GAD enzymes whose activity regulation is vital for normal plant development. This allows response to external stimuli [55]. In addition, the enzyme may also function in a host deterrence reaction towards herbivore attack [56]. The eQTL hotspot on LG11 was associated with three resistance traits and two growth traits. The SNP that is located within the eQTL hotspot region is a gene that plays an important role within the ubiquitin/proteasome system, regulating developmental processes in plants, but also involved in biotic defense responses [57]. SNP markers derived from two different contigs Contig_4096_434 and CCoAOMT_1_320, respectively, clustered on LG6 and represent two of the three lignin-forming CCoAOMT genes in spruce [29]. We identified cis-eQTLs for these genes as well as trans-eQTLs generated from a multitude of genes that mapped to the two loci. Both loci are also hotspots of weevil resistance QTLs. A GO analysis of the transcripts associated with the two trans-eQTL hotspots revealed significant over-representation of several molecular function, cellular component, and biological process GO categories. The fact that 36% and 53%, respectively, of the mapped trans-eQTLs were in common between CCoAOMT-1 and CCoAOMT-2 suggests extensive interactions between both CCoAOMT loci via eQTLs from a multitude of genes. This was also reflected by common GO categories that were overrepresented such as ‘secondary metabolic process’ and ‘response to biotic stimulus’ in both gene expression networks centered at these two SNPs (Figure 2 & Figure 3). Thus, this demonstrates how epistasis between gene loci works at the transcriptional level by linking cis-eQTLs via trans-regulatory interactions. In the case of CCoAOMT-1 and CCoAOMT-2, three resistance pQTLs also contributed to this epistatic interaction. CCoAOMT-1 and CCoAOMT-2 represent both metabolic pathway–specific trans-eQTL hotspots [29] and based on the present study, they represent important global trans-eQTL hotspots that are of interest for pest resistance in spruce.

Jasmonate Signaling and its Central Role in Defense

We also identified JAZ genes that may play a role in the gene-gene interaction network between both CCoAOMT loci. JAZ genes were identified as central regulators of JA-mediated anti-insect defense [58]. Three JAZ genes were candidate genes directly associated with phenotypic trait variation. JA signaling is activated by repressor (i. e. JAZ) removal from the ubiquitin ligase complex [59]. Carbonic anhydrase genes are other important genes that have roles in jasmonate signaling and also in ethylene signaling [60], and in our study these genes mapped trans-eQTLs to both CCoAOMT hotspot locations. A previous study found that carbonic anhydrase genes were induced in spruce under stress treatments (budworm, weevil feeding, and mechanical wounding) [35]. At the second CCoAOMT locus on LG6, eQTLs from ERFs and specifically those from group IX [61] that represent known transcription repressors (ERF3, ERF4, ERF7) [62] were found.

In our study, transcriptional activators related to ethylene response (ERF-1, ERF-2, EIN5, [63]) as well as regulators for ethylene biosynthesis per se (RUB1, RUB1-conjugating enzyme, [64]) were found to co-localize exclusively with resistance traits. The hormonal cross-talks with JA (involving salicylic acid, ethylene, abscisic acid, and auxin) during growth and development as well as during adaptation to stress are highly complex [59], [65]. In Arabidopsis, the major players in JA-mediated plant defense are tightly linked [66]. The differential regulation of certain components/steps in the pathway is expected to generate distinct responses to different stimuli (reproductive development, growth or defenses, [59], [66]). Thus, establishing and maintaining defenses involves signaling systems that are co-opted from developmental processes [38].

Conclusions

Although genomics studies on forest trees have traditionally focused on wood attributes [67]–[70], the genomics of environmental challenges has recently gained importance [34], [71]. Biotic stressors, herbivores and their accompanying pathogens pose an increased threat to tree populations, and knowledge of the genomic architecture can inform the management and breeding practices of conifers, as well as increase our general understanding of the evolution and adaptation of conifer species. Our result will add to this second vary important layer of genomics in forestry.

We have utilized expression QTL mapping to identify candidate genes. This will facilitate targeted association studies to further understand the genetic basis of host resistance to pests, the genomic basis of pest resistance. These results will enable both further functional studies to the nature of insect resistance in spruce, and provide valuable information about candidate genes for genetic improvement of spruce.

We identified several master regulators that underlie the genetic pleiotropy of pest resistance and developmental processes. Several candidate genes from the JA signaling pathway were identified for which we could show that central regulators of this pathway are contributing to extensive gene-gene interaction networks. Plant JA signaling provides a rapid response to various external stimuli [72] and is central to all biotic stress responses that directly influence the performance of the pest or contribute indirect defense responses to attract predators or herbivore parasitoids [73]. Importantly, this signaling pathway is not defense specific, but co-opted from normal developmental processes such as reproductive development [72]. In this way, the induction of defenses against herbivores or pathogens remains highly cost effective [38].

This work identified several pleiotropic genes as candidate genes whose proposed functions are important in stress response or disease resistance. In addition, our study revealed the presence of master genes which influence the global transcriptome. These genes are in “hotspots”, sometimes linked to annotated loci which were in turn further annotated to developmental and defense associated processes. Since resistance and growth QTLs overlapped with eQTL hotspots along the genome, this suggests that: 1) genetic pleiotropy of resistance and growth traits in interior spruce was substantial, and 2) master regulatory genes were important for weevil resistance in spruce. Knowledge about the exact function of these master regulons in the conifer genome needs further investigation; however knock-out mutants for largely pleiotropic genes were shown to be largely lethal or exhibit highly deleterious phenotypes [53].

Materials and Methods

Interior Spruce Pedigree

Experimental interior spruce populations originated from a controlled-cross progeny trial established in 1995 at Kalamalka Research Station in Vernon, BC, Canada [74]. The parental trees were selected from individuals previously ranked for weevil-resistance in open-pollinated progeny tests [14]. Out of twenty crosses segregating for weevil-resistance [74], four families with wide segregation for weevil resistance arranged as 2x2 factorial were harvested in May 2006 for gene expression profiling: cross 26 (♀PG87*♂PG165), cross 27 (♀PG87*♂PG117), cross 29 (♀PG21*♂PG165) and cross 32 (♀PG21*♂PG117). From 417 offspring of a 3x2 factorial (including the additional crosses 22 and 25), genomic DNA was isolated from flushing bud/needle tissue according to the cetyltrimethylammonium bromide (CTAB) method [75]. The studied trees represented individual genotypes that were planted in randomized plots within three replicate blocks in the field [74]. A detailed layout of the study site that shows the randomized location of plots for the QTL mapping families PG87*PG165 (cross 26), PG87*PG117 (cross 27), PG21*PG165 (cross29) and PG21*PG117 (cross32) within the replicate blocks can be found in [29].

A set of 384 SNPs were identified in silico from a collection of ESTs from the Treenomix EST database (K. Ritland pers.comm.) that were all derived from a single tree (PG 29). The genomic DNA was then genotyped for these SNPs using the multiplexed Illumina platform at the CMMT Genotyping and Gene Expression Core Facility, Centre for Molecular Medicine and Therapeutics, Vancouver, BC. Recombination rates were determined by joint likelihood [76] for each pair of loci and a consensus genetic map of 252 SNPs was constructed using JoinMap 3.0 [77], see [29] for details. Of all putative SNP loci, 73.4–76.0% were confirmed and included in the analysis; 394 individuals were true full-sibs. Those that could not be confirmed as full-sibs in the respective crosses (cross 26: 7%, cross 27: 10%, cross 29: 4%, and cross 32: 1% of the trees alive in 2006) were removed before phenotyping. The majority of spruce gene markers (i.e. the SNPs) that were used to build the framework map could not be annotated. This involved 67% of the ESTs when we used the TAIR7 database, while 54% of the ESTs when we used Viridiplantae databases [29].

Measures for Tree Height, Weevil Attack and Oviposition

The trial was screened for resistance to terminal leader weevil following the method described by [20]. In short, a population of weevils was raised in summer 1999 at the Canadian Forest Service (Pacific Forestry Centre), Victoria and released onto all test trees in fall 1999. Attack rates, egg counts and top kills were recorded in 2000–2004. Growth measurements included the initial tree height in 1995 (year one), and heights in years three, and five as well as leader length in year five preceding the artificial augmentation of the local weevil population in October of the same year (hgt_1995, hgt_1997, hgt_1999, and ldr_99, respectively). Attack rates in 2000 and 2001 (atk_2000, atk_2001) were classified as successful ‘top kills’, ‘failure’ to kill the leader and ‘no attack’ [74]. In addition, for the same years oviposition on the leaders was recorded (egg_2000 and egg_2001) by counting egg punctures into five discrete classes: 1 = 1–25, 2 = 26–50, 3 = 51–75, 4 = 76–100, 5 = 101 and more. Egg punctures contain egg covering fecal plugs and are easily distinguished from feeding punctures, which are not covered. The sums of weevil attacks and oviposition for 2000 and 2001 were also used as ‘resistance’ traits (sum_atk and sum_egg).

Tree material within a replicate block was sampled in a randomized fashion among the plots (i.e. crosses). Terminal leaders from trees in a block were collected in the mornings of May 16, 17 and 18, 2006, respectively. The weather was consistent among these days. Bark/phloem tissue was immediately harvested on site from cut leaders as described previously ([35]; [78]), flash frozen in liquid nitrogen, and stored at −80°C until processed. Total RNA from unattacked individuals was isolated following the protocol of [79] and quantified using NanoDrop® ND-1000 Spectrophotometer; RNA integrity was evaluated using the Agilent 2100 Bioanalyzer. The 21,840 spruce ESTs on the array involved elements from 12 different cDNA libraries, built from different tissues (bark, phloem, xylem), which were under different developmental stages, as well as wound/methyljasmonate treated (ca. 6,500 elements) and untreated (ca. 15,400). Complete details of cDNA microarray fabrication and quality control are described elsewhere (S. Ralph and co-workers, Gene Expression Omnibus database GEO: GPL5423 and http://www.treenomix.ca/). Labeling reactions, hybridizations, slide washes as well as scanning of slides were carried out as described in [35]. Fluorescent intensity data were extracted by using the ImaGene 6.0 software (Biodiscovery, El Segundo, USA). Signal intensity measurements were deposited in the Gene Expression Omnibus database under the accession number GSE22116.

Microarray Experimental Design and Pre-processing of Expression Data

Our experimental design is based on a priori known genotypes. Testing six genotyped crosses and using the previously collected phenotypic data (see above), we determined that genotype differences between most and least resistant progeny were highest in crosses 26, 27, 29 and 32. Since we used two-color microarrays, direct comparisons between Cy3-Cy5 labeled samples were required. A distant pair design for microarray analysis that maximized direct comparisons between different alleles at each locus was originally introduced by [80] and was modified in our study for outbred individuals. We estimated the genetic distance for possible probe-pairs genome-wide by using all segregating SNP loci (122 on average) and such we maximized the number of distant pairs in a given cross. A 25% improvement over random pairing was achieved. We also balanced the two dyes across the three replicate blocks (i.e. sampling on three different days), the different batches of microarray fabrication and different experimenters (see below). Our design resulted in 94 hybridizations profiling 48 individuals in cross 26, 36 in cross 27, and 50 in cross 29 as well as 54 individuals in cross 32 [29].

After quantification of the signal intensities in each array the local background was subtracted for each subgrid. Data were normalized to compensate for non-linearity of intensity distributions using the variance stabilizing normalization method [81]. We performed a single normalization of 188 columns of data. In this way each channel had a similar and array-independent overall expression level and variance. Signal intensities are deposited under the GEO accession number GSE22116. The linear model

with μ as the overall mean was then fit to the normalized intensities of each gene i (hi) in the Cy3 and Cy5 channels to account for technical effects within the experiment (gene-specific ‘dye’ effect, replicate ‘block’, microarray fabrication ‘batch’, experimenter ‘person’ are all fixed effects). The residuals were used in the subsequent QTL analysis. All of the above statistics was carried out using the R statistical package (www.r-project.org).

QTL Detection

A program was written in FORTRAN by K. Ritland for QTL mapping in the 3×2 factorial (for resistance and growth traits) and 2x2 factorial design (for gene expression traits). This program inferred QTL maps for each of the parents of the factorial. QTLs were mapped in the progeny by employing a likelihood function of the trait level (gene expression, other traits) conditioned on genotype of progeny, and compared to the likelihood of unconditioned genotype of progeny (no association of traits with progeny genotype) to give a log-odds (LOD) ratio. Due to the large number of gene expression traits, a single-marker model instead of an interval mapping approach was used, and QTLs were binned into 10 cM marker intervals, thus avoiding having two QTLs assigned to adjacent markers due to linkage of two markers to one QTL. We used R (www.r-project.org) to display QTL density maps. A QTL was significant at LOD ≥3.84 and had to be detected for a minimum of one parent in the factorial ([29], and Table S1). A goodness-of-fit test assuming a uniform distribution was performed to test whether the observed frequencies of eQTLs along the linkage map differed significantly from the expected value. Following the rejection of this null hypothesis ( χ2 = 96678, df = 251, p-value <2.2e-16), we declared “eQTL hotspots” if the number of eQTLs at a given locus exceeded the expected average by 50%. These numbers are significantly above the maximum number within eQTL clusters (i.e. 630) from a randomly generated data set using all 132,100 detected eQTLs, 252 markers, and running 1,000 replicates. The positional candidate genes were identified by collocation of at least 40% of their eQTLs with phenotypic trait QTLs based on the criteria for identifying significant QTLs (see above) and running 10,000 randomizations (p ≤ 0.05).

Other Statistical Analyses

Phenotypic trait correlations were determined using SAS/STAT software, version 9.1.3 of the SAS system for Windows® (SAS Institute Inc., Cary, NC, USA). The cytoscape 2.5.1 plug-in BINGO [82] was used and a hypergeometric test was performed to determine statistically overrepresented Gene Ontology (GO) terms within the GOSlim Plants ontology for spruce genes with Arabidopsis homologs. In our case, this involved comparing the nearest Arabidopsis homologs for all genes that showed significant association of their expression variation with the previously assessed phenotypic trait variation (tree height, weevil attack, and oviposition) to all Arabidopsis homologs on the microarray.

Display of genes that are candidates for different resistance and growth traits (p≤0.05), for at least three phenotypic traits (ldr_99, hgt_1995, hgt_1997, hgt_1999, atk_2000, atk_2001, sum_atk, egg_2000, egg_2001, and sum_egg, respectively).

33.
Drost DR, Benedict CI, Berg A, Novaes E, Novaes CRDB, et al. (2010) Diversification in the genetic architecture of gene expression and transcriptional networks in organ differentiation of Populus. Proceedings of the National Academy of Sciences of the United States of America 107: 8492–8497.

50.
Jennings DB, Ehrenshaft M, Pharr DM, Williamson JD (1998) Roles for mannitol and mannitol dehydrogenase in active oxygen-mediated plant defense. Proceedings of the National Academy of Sciences of the United States of America 95: 15129–15133.