Background

Mitogen-activated protein kinase kinase kinases (MAPKKKs) are the important components of MAPK cascades, which play the crucial role in plant growth and development as well as in response to diverse stresses. Although this family has been systematically studied in many plant species, little is known about MAPKKK genes in wheat (Triticum aestivum L.), especially those involved in the regulatory network of stress processes.

Results

In this study, we identified 155 wheat MAPKKK genes through a genome-wide search method based on the latest available wheat genome information, of which 29 belonged to MEKK, 11 to ZIK and 115 to Raf subfamily, respectively. Then, chromosome localization, gene structure and conserved protein motifs and phylogenetic relationship as well as regulatory network of these TaMAPKKKs were systematically investigated and results supported the prediction. Furthermore, a total of 11 homologous groups between A, B and D sub-genome and 24 duplication pairs among them were detected, which contributed to the expansion of wheat MAPKKK gene family. Finally, the expression profiles of these MAPKKKs during development and under different abiotic stresses were investigated using the RNA-seq data. Additionally, 10 tissue-specific and 4 salt-responsive TaMAPKKK genes were selected to validate their expression level through qRT-PCR analysis.

Conclusions

This study for the first time reported the genome organization, evolutionary features and expression profiles of the wheat MAPKKK gene family, which laid the foundation for further functional analysis of wheat MAPKKK genes, and contributed to better understanding the roles and regulatory mechanism of MAPKKKs in wheat.

Mitogen-activated protein kinase (MAPK) cascades play the crucial role in plant growth and development as well as in response to stresses, which are highly conserved in the signal transduction pathway in eukaryote [1]. The MAPK pathway included three main protein kinase members, namely MAPK kinase kinases (MAPKKK or MEKK), MAPK kinases (MKK or MEK) and MAPKs (MPK). They achieved the function through sequentially being phosphorylated. Upstream signals firstly activated the MAPKKKs, which in turn the MAPKKKs activated the MAPKKs and then specific MAPKs were activated by the MAPKKs. Eventually, the activated MAPKs phosphorylated transcription factors, enzymes or other signaling components to modulate the expression of downstream genes to complete signal amplification [2, 3]. It has been demonstrated that MAPK cascades played a vital role in cell division, growth and differentiation [4, 5], hormone response [6], plant immunity [7, 8], biotic and abiotic stress response and so on [9–11]. To date, extensive studies have been conduct to systematically investigate the MAPKKK gene family in many plant species and it is reported that there were 74 putative MAPKKK genes in maize (Zea mays), 75 in rice (O. sativa), 78 in cotton (G. raimondii) and 80 in Arabidopsis (A. thalianna), respectively [12–15].

Wheat is one of the most important crops worldwide, occupying 17 % of cultivated lands and serving as the staple food source for 30 % of the human population all over the world [16, 17]. Genetically, wheat is an allohexaploid species (2n = 6x = 42), which has a complex original and evolutionary history, derived from three diploid donor species through two naturally interspecific hybridization events. The initial hybridization event was occurred between A genome donor (T. urartu, AA; 2n = 14) and B geome donor (Aegilops speltoides, SS; 2n = 14) to produce the allotetraploid (AABB, T. turgidum L) about 0.2 MYa ago, and then the AABB donor crossed with the D genome donor (A. Tauschii Coss) to form the allohexaploid wheat (AABBDD) about 9000 years ago [18]. As a result, wheat possesses a large and complex genome with three homologous genomes (A, B and D) and the size more than 17 Gb, which makes it a huge challenge to conduct genomic study in wheat. But, as the newly formed polyploidy, wheat is considered as an ideal model for chromosome interaction and polyploidization studies in plants [19, 20]. Recently, the draft genome sequencing of hexaploid wheat Chinese Spring (CS) was completed using the chromosome-based strategy, which laid the foundation to identify wheat gene family at the genome-level and also to discern the homologous copies in these three sub-genomes [17]. The retention and dispersion of homologous gene will provide the indispensable information about chromosome interaction during polyploidization [21, 22].

At present, no systematical investigation of MAPKKK gene family has been performed in wheat. In light of the functional significance of this family, an in silico genome-wide search was conducted to identify wheat MAPKKK gene family in this study. Then, the chromosome localization, gene structure, conserved protein domain, phylogenetic relationship as well as expression profiles and regulatory network were systematically analyzed in the putative wheat MAPKKK genes to reveal the evolutionary and functional features of these genes. Our study will provide a basis for further functional analysis of the wheat MAPKKK genes, and will contribute to better understanding the molecular mechanism of MAPKKKs involving in regulating growth and development as well as stress processes in wheat.

Identification of MAPKKK gene family in wheat

The wheat MAPKKK gene family was identified following the method as described by Rao et al with some modifications [13]. First, all the wheat protein sequences available were downloaded from the Ensemble database (http://plants.ensembl.org/index.html) to construct a local protein database. Then, this database were searched with 304 known MAPKKK gene sequences collected from A.thaliana (80), O. sativa (75), Z. mays (74) and B.distachyon (75) using the local BLASTP program with an e-value of 1e-5 and identity of 50 % as the threshold. Furthermore, all the MAPKKK sequences were aligned and the obtained alignments were used to construct a HMM profile using the hmmbuild tool embedded in HMMER3.0 (http://hmmer.org/download.html), and then the HMM profile were used to search the local protein database using the hmmsearch tool. HMMER and BLAST hits were compared and parsed by manual editing. Furthermore, a self-blast of these sequences was performed to remove the redundancy and the remaining sequences were considered as the putative TaMAPKKK proteins, which then were submitted to the NCBI Batch CD-search database (http://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi) and PFAM databases (http://pfam.xfam.org/) to confirm the presence and integrity of the kinase domain. Finally, all the obtained sequences were verified the existence by BLASTN similarity search against the wheat ESTs deposited in NCBI database. The theoretical pI (isoelectric point) and Mw (molecular weight) of the putative TaMAPKKK were calculated using compute pI/Mw tool online (http://web.expasy.org/compute_pi/). Subcellular localization of each TaMAPKKK cascade kinases were predicted using the TargetP software of the CBS database [23].

Multiple sequence alignments and phylogenetic analysis

Multiple sequence alignments were generated using ClustalW tool [24]. To investigate the evolutionary relationship among MAPKKK proteins, a neighbor-joining (NJ) tree was constructed by MEGA 6.0 software based on the full-length of MAPKKK protein sequences [25]. Bootstrap test method was adopted and the replicate was set to 1000.

Gene structure construction, protein domain and motif analysis

The gene structure information were got from Ensemble plants database (http://plants.ensembl.org/index.html) and displayed by Gene Structure Display Server program (GSDS: http:/gsds.cbi.pku.edu.cn/). The protein domains and motifs in the MAPKKKs were predicted using InterProScan against protein databases (http://www.ebi.ac.uk/interpro/). The schematic representing the structure of all members of TaMAPKKKs was based on the InterProScan analysis.

Chromosomal locations and gene duplication

Genes were mapped on chromosomes by identifying their chromosomal position provided in the wheat genome database. Gene duplication events of MAPKKK genes in wheat were investigated based on the following three criteria: (a) the alignment covered >80 % of the longer gene; (b) the aligned region had an identity >80 %; and (c) only one duplication event was counted for the tightly linked genes [12, 26]. In order to visualize the duplicated regions in the T. aestivum genome, lines were drawn between matching genes using Circos-0.67 program (http://circos.ca/).

Identification of cis-regulatory elements

To investigate the cis-regulatory elements, the upstream regions (2 kbp) of all wheat MAPKKK genes were extracted, which were considered as the proximal promoter regions for the individual wheat MPKKK genes. Then, all the sequences were submitted to PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/Plantcare/html/) to identify the putative cis-acting regulatory elements.

Network interaction analysis

The interaction network which the TaMAPKKK genes involved were investigated based on the orthologous genes between Wheat and Arabidopsis using the AraNet V2 tool (http//www.inetbio.org/aranet/). Then, enrichment analysis was implemented by BiNGO, a cytoscape plugin, for gene ontology analysis and identifying processes and pathways of specific gene sets. Over-represented GO full categories were identified with a significance threshold of 0.01.

The MAPKKK gene expression analysis by RNA-seq data

To study the expression of TaMAPKKK genes in different organs and response to stress, transcriptome sequencing data obtained from WHEAT URGI (https://urgi.versailles.inra.fr/files/RNASeqWheat/) and NCBI Sequence Read Archive (SRA) database were used to investigate the differential expression of TaMAPKKKs. The accession numbers and sample information of the used data were listed in Additional file 1. TopHat and Cufflinks were used to analyze the genes’ expression based on the RNA-seq data [27]. The FPKM value (fragments per kilobase of transcript per million fragments mapped) was calculated for each MAPKKK gene, the log10-transformed (FPKM + 1) values of the 155 TaMAPKKK genes were used for heat map generation. And fold change cutoff of two and p-value < 0.05, q-value < 0.05 were taken as statistically significant threshold [28, 29].

Plant materials, growth conditions, and treatments

The plants of wheat cultivar ‘CS’ were reared in growth chambers at 23 ± 1 °C with a photoperiod of 16 h light/8 h dark. The roots, stems, leaves, spikes (1 d before flowering), and grains (10d after pollination) were collected from flowering plants for tissue expression analysis. One-week-old seedlings which consisted with RNA-seq data were treated by 150 mM NaCl which represented salt treatment, and the seedlings grown under normal condition were used as control. The leaves of seedlings under salt and also control conditions were collected at 0, 6, 12, 24 and 48 h after treatment. All the plant samples from two biological replicates were frozen in liquid nitrogen immediately and stored at −80 °C for RNA isolation.

RNA isolation and qRT-PCR analysis

The total RNA was extracted using Plant RNA Kit reagent (Omega Bio-Tek, USA) according to the manufacturer’s instructions. The RNA integrity was checked by electrophoresis on 1.0 % agarosegels stained with ethidium bromide (EB). The first strand cDNAs were synthesized using a Vazyme Reverse Transcription System (Beijing, China) following the manufacturer’s protocol. Real-time PCR analyses were performed using the primer pairs listed in Additional file 2. Two biological and three technical replicates for each sample were obtained using the real-time PCR system (BIO-RAD CFX96, USA). The β-actin gene was used as internal reference for all the qRT–PCR analysis. Each treatment was repeated three times independently. The expression profile was calculated from the 2–△△CT value [ΔΔCT = (CTtarget/salt – CTactin/salt) – (CTtarget/control – CTactin/control)] [30].

Genome-wide Identification of MAPKKK Family in Wheat

Availability of the genome sequence made it possible for the first time to identify all the MAPKKK family members in wheat. Using the method as described above, a total of 155 genes with the complete kinase domain were identified as the MAPKKK members in the wheat genome. Since there is no standard nomenclature, the predicted wheat MAPKKK genes were then designated as TaMAPKKK1 to TaMAPKKK155 based on the blast scores. It was notable that wheat possessed the largest MAPKKK gene family among the reported species (Table 1), which may be the result of its allohexaploid genome and complex evolutionary process.

Table 1

Comparison of the gene abundance in three subfamilies of MAPKKK genes in different plant species

Species

Raf

MEKK

ZIK

Total

Wheat

115

29

11

155

Arabidopsis

48

21

11

80

Rice

43

22

10

75

Maize

46

22

6

74

Brachypodium

45

24

6

75

Tomato

40

33

16

89

soybean

92

34

24

150

Grapevine

27

9

9

45

Cucumber

31

18

10

59

Canola

39

18

9

66

As reported in Arabidopsis and other plant species [12–15], the MAPKKK gene family could be subdivided into Raf, MEKK and ZIK subfamily according to the specific conserved signature motifs contained by these subfamilies, of which Raf had the signature of GTXX (W/Y) MAPE, ZIK of GTPEFMAPE (L/V) Y, and MEKK of G (T/S) PX (W/Y/F) MAPEV [15, 31]. To validate our prediction and subcategorize the identified wheat MAPKKKs, we further investigated the conserved signature motif in these TaMAPKKKs. Results showed that all the putative wheat MAPKKKs possessed at least one of the three conserved signature motifs (Fig. 1). Among them, 29 genes shared the conserved motif G (T/S) PX (W/Y/F) MAPEV, which were categorized into MEKK subfamily, and 11 had the motif GTPEFMAPE (L/V)Y, belonging to ZIK subfamily as well as the remaining 115 genes shared the motif GTXX (W/Y) MAPE, belonging to Raf subfamily. Then, we further named these gene based on the subfamily categories (Table 2). Moreover, the Raf subfamily is found to be the largest subfamily while the ZIK subfamily had the least members in wheat, which was consistent with the composition of MAPKKK genes in other species.

To support the actual existence of these wheat MAPKKKs, we further performed a BLASTN search against the wheat expressed sequence tag (EST) and unigene database using the MAPKKKs as query. Results showed that most of the TaMAPKKKs’ existences were supported by EST hits except 6 MAPKKKs (TaMEKK4, TaMEKK13, TaMEKK25, TaRaf7, TaRaf53 and TaRaf98). We speculated these 6 not-support TaMAPKKKs might not express under any the used conditions or express with very low level that cannot be detected experimentally. Among the supported TaMAPKKK genes, TaRaf62 has the largest hits of ESTs, with the number of 119, followed by TaMEKK5 and TaRaf87 with the number of 95 and 55 ESTs, respectively.

Chromosome localization analysis found that the 155 TaMAPKKK genes were unevenly distributed on all the 21 wheat chromosomes, of which chromosome 3A contained the most MAPKKK genes with the number of 15, followed by 2A with the number of 14, then 5B, 5D as well as 7D all with the number of 11, while the chromosome 7B had the least MAPKKK gene, with the number of only 1. Furthermore, the length of putative TaMAPKKK proteins ranged from 149 to 1335 amino acids, with the putative molecular weight (Mw) ranging from 16.5 to 146.1 kDa and theoretical isoelectric point (pI) ranging from 4.55 to 9.33, respectively. The subcellular localization analysis found that a total of 51 TaMAPKKKs localized in nuclear, 42 localized in cytoplasmic and 32 localized in plasma membrane, while the remaining were predicted to be located in chloroplast, mitochondrial and extra-cellular (Table 2).

Phylogenetic and conserved domains analysis of TaMAPKKKs

To further evaluate the phylogenetic relationships of the wheat MAPKKK cascade genes, the full-length protein sequences of the 155 TaMAPKKKs were aligned using ClustalW software and then the phylogenetic tree were constructed using the neighbor joining (NJ) method integrated into MEGA6.0 (Fig. 2a). On the basis of phylogenetic analysis, MAPKKKs in wheat were clustered into three major groups, of which MEKK, Raf and ZIK subfamily members clustered together into one category, respectively. It is found that the bootstrap value of the phylogenetic tree is low, which may due to the low similarity of the full-length protein sequences, suggesting that there are high sequence differentiation in these MAPKKK genes although the conserved motifs were included, which was consistent with the MAPKKKs in maize [12], rice [13] and Brachypodium [15, 32]. The conserved domains and phylogenetic relationship suggested that MAPKKK genes showing the closer phylogenetic relationship may have the similar biological function. To date, there is no report regarding MAPKKK genes in T. aestivum, so searching for MAPKKK family genes and understanding their phylogenetic relationship in T. aestivum is necessary and helpful for their further functional study.

Furthermore, the protein domains of these wheat MAPKKK genes were identified by searching against InterProScan databases (Fig. 2c). Results found that each cluster of the MAPKKKs classified by phylogenetic analysis shared the similar protein structure and domain composition, demonstrating that the protein architecture is remarkably conserved within a specific subfamily of MAPKKKs. Protein kinases have been demonstrated to play the crucial role in mediating process of protein phosphorylation, which widely occurred in most cellular activities [32]. In this study, we found all the TaMAPKKK proteins contained a kinase domain (IPR000719), and most of them had the serine/threonine protein kinase active site (IPR008271) in the central part of the catalytic domain. These features were also found in the MAPKKK proteins of rice and cucumber [13, 33], suggesting the conserved function of MAPKKK genes in plants. Moreover, the ATP-binding site, which is located on the catalytic domain, is the most conserved sequences in the kinase family [33]. We found that most of TaMAPKKKs also contained an ATP-binding site (IPR017441), suggesting that these wheat MAPK cascade kinases use ATP as the ligand in signal transduction pathway. In addition, the TaMAPKKKs also had some other conserved domains, such as concanavalin A-like lectin/glucanase domain (IPR013320), armadillo-like helical (IPR011989), and EF-hand domain (IPR011992). Interestingly, these TaMAPKKKs containing the same protein domains were generally clustered into the same clade in phylogenetic analysis, and showed similar expression patterns in response to multiple stresses, which was consistent with the result of BdMAPKKK genes as reported previously [32]. For example, most TaMAPKKK genes containing concanavalin A-like lectin/glucanase domain were up-regulated by drought stress, while those genes containing armadillo-like helical domain showed to be down-regulated under salt stress. These results indicated that the various protein domains could regulate the TaMAPKKK gene to exhibit specific biological functions. The conserved domains identification and analysis may facilitate the identification of functional units in these kinase genes and accelerate to understand their crucial roles in plant growth and development as well as stresses response [34, 35].

Analyses of gene structures and promoter regions of TaMAPKKKs

Gene structure analysis can provide important information about the gene function, organization and evolution [36]. Thus, the exon/intron structures of TaMAPKKK genes were further analyzed using the available wheat genome annotation information and then were displayed by the Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/) (Fig. 2b). We found the exon/intron structures in the TaMAPKKK genes were relatively conserved within the subfamily but some divergent between different subfamily. The Raf and MEKK subfamily have more sophisticated structure than ZIK subfamily due to the various number of intron. In detail, all the ZIK genes had introns, with the number ranging from 1 to 7. In the MEKK subfamily, 3 gene had no intron, and others had 1 to 22 introns, which was the most highly variable in the number of introns in TaMAPKKKs. In the Raf subfamily, 7 out 115 genes had no intron, and other Raf genes had the intron number ranging from 1 to 14. Interestingly, most gene pairs clustered together by phylogenetic analysis shared the similar exon/intron structure and intron phases in these TaMAPKKK genes, suggesting the evolutionary event may impact not only on the gene function but also on gene structure. It has been revealed that intron gain or loss is the results of selection pressures during evolution in plants, and the genes tend to evolve into diverse exon-intron structures and perform differential functions [37, 38]. Accordingly, the wheat MAPKKK genes were found to have the similar exon-intron structure within same subfamily, while the numbers of introns were varied, even within subfamily, which indicated that gene differentiation have occurred in the wheat MAPKKK to accomplish different biological functions under the selection pressure during the wheat genome formation and evolution.

Promoter is the region of the transcription factors (TF) binding site to initiate transcription, which plays a key role in regulating gene spatial and temporal expressions [39]. To further detect the possible biological function and transcription regulation of these TaMAPKKKs, the 2 kb-upstream region of the transcriptional start site of all these genes were extracted and then used to screen for cis-regulatory elements. Results showed that a large number of stress-related and hormone-related cis-elements were found in promoter regions of the wheat MAPKKK genes (Additional file 3), which were similar with the result in Brachypodium, tomato and cucumber [32, 33, 36]. In addition, the abiotic stress-related (a total of 9 drought-stress, 1 salt-stress, 1 heat-stress, 1 cold-stress, 2 wound-stress and 2 disease resistance-related) and hormones signaling transduction-related (6 gibberellins, 4 abscisic acid and 3 ethylene-related) cis-regulatory elements were also found, suggesting that the wheat MAPKKKs may involve in regulating varieties of stress responses and hormone signaling transduction processes.

Genomic distribution and gene duplication of TaMAPKKK gene family

Based on the available wheat genome annotation information, the chromosomal location of the TaMAPKKK genes were further investigated (Fig. 3). A total of 58, 45, and 52 TaMAPKKK genes are distributed in the A, B and D sub-genome, respectively (A > D > B). Initial gene loss may occurred in B genomes following tetraploidy to decrease functional redundancy and define the core wheat genes, with subsequent loss from all three genomes following the formation of the hexaploid around 9000 years ago. The distribution of MAPKKK genes was not random in wheat chromosomes. There were 13, 31, 32, 16, 32, 15 and 16 genes in the group 1 to 7 chromosomes, which show two obvious gradients between group 2, 3, 5 and other four groups. And chromosome 3A had the highest number of MAPKKK genes with the value of 15 genes, whereas chromosome 7B had only one MAPKKK gene. These results indicates that duplication events of MAPKKK gene have likely occurred in wheat 2, 3 and 5 group chromosomes during wheat formation and the evolution of gene families within the different sub-genome is independent, which may associate with gene functions.

Fig. 3

Chromosomal localization and the homologous TaMAPKKK genes in wheat A, B and D sub-genomes. The genes followed by * represent that the gene only anchor to scaffold. Seven homologous groups of wheat chromosomes are displayed in different colors. Duplicated genes of each homo-group are displayed in corresponding color and linked using lines with corresponding color

Gene duplication is frequently observed in plant genomes, arising from polyploidization or through tandem and segmental duplication associated with replication [40]. In our study, a total of 11 homologous gene groups with a copy on each of A, B and D homologous chromosome were found in wheat MAPKKK gene family, and 24 gene pairs with a copy on only 2 of the 3 homologous chromosomes were also identified (Fig. 3 and Additional file 4), while the remaining 74 genes were not found homologs in wheat genome. Previous studies have demonstrated that the fractionation from ploidy caused the loss of some homologous sequences because of some combination of deletion [41]. Our results indicated gene loss may also occur in wheat MAPKKK gene family, resulting in the loss of some homologous copies. The specific retention and dispersion of MAPKKKs in homologous chromosomes provide the invaluable information to better understand the wheat chromosome interaction and polyploidization. Furthermore, these homologous genes are clustered in group 2, 3 and 5 chromosomes, which was consistent with the above chromosome localization analysis, suggesting that group 2, 3 and 5 chromosomes suffered less sequence loss and interaction impact compared to other homologous chromosome groups.

Additionally, 25 pairs of duplication genes from different sub-genomes were also identified (Fig. 4 and Additional file 4), including 3 duplication events within the same chromosome and 22 segmental duplication events between different chromosomes, suggesting that the duplication events could play vital roles in the expansion of the MAPK cascade kinase genes in wheat genome. Interestingly, most duplication events occurred between A and D genomes, except the pair of Raf92 and Raf57 occurred on 5B as well as that of Raf13 and Raf88 from 1B. We postulated that the gene family size of the A and B sub-genome have arrived to balance after first hybridization with the long evolutionary process, but the D sub-genome, which was added to form hexaploid wheat recently, appeared to have more interaction with other two sub-genomes. More interestingly, all the 25 pairs of duplication genes belonging to Raf subfamily, which indicates that gene duplication is a main processes responsible for expanding family size and protein functional diversity [42].

Fig. 4

Duplicated MAPKKK genes pairs identified in wheat. Seven homologous groups of wheat chromosomes are displayed in different colors. Duplicated gene pairs are displayed in corresponding color and linked using lines with the corresponding color

Regulatory network between TaMAPKKK genes with other wheat genes

MAPKKKs, as the first step of MAPK cascade, function as the pivotal component linking upstream signaling steps to the core MAPK cascade and then promote the corresponding cellular responses, which are activated by a diversity of external stimuli and interact with other genes to form the signaling regulatory network in plants [2, 31]. To understand the interactions between TaMAPKKKs and other wheat genes, the regulatory network of them (Fig. 5) was predicted using the orthology-based method [43]. Results showed 18 MAPKKKs (6 TaMEKKs, 8 TaRafs and 4 TaZIKs) were found to have homology with Arabidopsis genes, and corresponding 509 gene pairs of network interactions were detected with the average of 28.3 gene/TaMAPKKK, suggesting the MAPKKKs were widely involved in the regulatory network and metabolic processes in wheat (Additional files 5 and 6). Among them, 149 genes were interacted by TaZIKs, and 212 genes were interacted by TaRafs, as well as 148 genes interacted by TaMEKKs, respectively. TaMEKK27 showed orthologous to Arabidopsis Fused (FU) gene, with an active kinase domain and the C-terminal ARM/HEAT repeat domain. Previously study has revealed that Arabidopsis Fused kinase termed TIO is essential for cytokinesis in both sporophytic and gametophytic cell types [44]. In this study, TaMEKK27 was found to interact with 38 wheat genes, including SOS6, NACK1 and FZR3, suggesting it was also mainly involved in cell proliferation and cytokinesis. TaRaf1 is found to interact with 10 wheat genes, which is homology with Arabidopsis HT1 gene reported to encode an important protein kinase for regulation of stomatal movements and corresponding to CO2, ABA and light [45]. The predicted upstream target genes of TaRaf1 included SLAC1, FMA and CHX20 as well as MYB and NAC transcription factor, which indicated TaRaf1 might play a vital role in ion homeostasis and stress response in wheat. Furthermore, Gene Ontology (GO) functional enrichment of those genes was performed to understand their potential functions. GO descriptions of those interacted genes were involved in diverse biological process, molecular function and stress response. TaMEKK interacted genes were significantly enriched for cellular process and metabolic process, and TaRaf interacted genes were significantly enriched for cellular process and pathways for stress response, while TaZIK interacted genes were functionally enriched in cellular process and protein modification process pathway (Fig. 6a–c), which indicated that TaMAPKKK genes played the vital role in cellular response to external stimuli, especially TaRaf subfamily genes might be the main adaptors to transduce the stress-related signal.

Fig. 5

The interaction network of TaMAPKKK genes in Wheat according to the orthologs in Arabidopsis

Fig. 6

Functional categories of genes in MEKK (a), Raf (b), and ZIK (c) subfamily. FDR-adjusted P values, **P < 0.01, respectively. Observed, numbers of genes observed in this study; Expected, numbers of genes in this same category in the GO enrichment analysis program

Tissue-specific expression patterns of TaMAPKKK genes

Different members of gene families exhibit great disparities in abundance among different tissues to accommodate different physiological processes [46, 47]. To gain insight into the temporal and spatial expression patterns and putative functions of MAPKKK genes in wheat growth and development, the tissue specificity of the 155 TaMAPKKK genes was investigated using available RNA-seq data for five different tissues [48]. Based on the log10-transformed (FPKM + 1) values, we found that the expression levels of the TaMAPKKKs varied significantly in different tissues (Fig. 7). Most MAPKKK genes were found to be expressed in at least one detected organ. All the members in ZIK subfamily were expressed in all of the 5 organs, while a total of 16 Raf genes had too weak expression abundances to be detected in any tissues, which indicated that these genes have undergone functional differentiation and redundancy. Most of MAPKKK genes were much more highly expressed in the root and leaf compared to grain, stem and spike. Furthermore, the tissue-specific expressed MAPKKK genes were identified. A total of 1, 6, 1, 6 and 3 genes were found to be specifically expressed in grain, root, stem, leaf and spike, respectively. Among them, TaRaf112 was predominantly expressed in grain and spike, TaMEKK25 showed preferential expression in stem and leave, and TaRaf12, TaRaf33 as well as TaRaf73 showed preferential expression in root and leave. As shown in Fig. 7 and Additional file 7, most homologous and duplication genes showed similar expression pattern during development. However, it also should be noted that many clustering of expression profiles does not reflect gene similarities, including the copies of one MAPKKK gene from sub-genomes and duplication genes from different sub-genomes. Some of them even show converse expression patterns. For instance, TaRaf71 which located in 3A showed preferential expression patterns in the root, stem, leaf and spike, whereas its homology gene TaRaf113 from 3B was only expressed in the grain. TaMAPKKK23 in 5A was expressed in all tested organs with relatively higher abundance, while its homology TaMAPKKK25 from 5B only slightly expressed in stem and leaf. The divergences in expression profiles between homologous genes revealed that some of them may lose function or acquire new function after polyploidy and duplication in the wheat evolutionary process.

Fig. 7

Hierarchical clustering of the expression profiles of all TaMAPKKK genes in five different organs or tissues (grain, root, stem, leaf and spike). Log10-transformed (FPKM + 1) expression values were used to create the heat map. The red or green colors represent the higher or lower relative abundance of each transcript in each sample

Expression patterns of TaMAPKKK genes under abiotic stresses

Extensive studies have revealed that the MAPKKK genes played a crucial role in response to abiotic stresses in plant [10, 49, 50]. In the present study, expression patterns of all TaMAPKKK genes in response to four abiotic (salt, heat, drought, cold) stresses were investigated using RNA-seq data to study the roles of TaMAPKKK genes in the response to abiotic stresses. Overall, all the 155 wheat MAPKKK genes showed differential expression patterns under these conditions and most of them were up-regulated in response to more than one stress (Figs. 8, 9 and 10). Among them, TaMEKK14, TaRaf10, TaRaf34 and TaRaf53 showed specific-expression under salt stress, while TaRaf87 and TaRaf105 specifically expressed under drought stress. Meanwhile, TaRaf36 and TaRaf49 were specifically expressed under cold stress while TaRaf112 were specifically expressed under heat stress. In addition, some down-regulated TaMAPKKKs were also observed. TaMEKK29, TaRaf22, TaRaf41, and TaRaf73 was down-regulated under salt stress (Fig. 8), TaMEKK29 showing down-regulated under heat stress, while TaRaf44, TaRaf72 and TaRaf80 showing down-regulated under heat and drought stress (Fig. 9), as well as TaMEKK13, TaRaf1 and TaZIK10 were down-regulated under cold stress (Fig. 10), respectively. These stress-induced MAPKKK genes provided the valuable information to further reveal the roles of TaMAPKKKs playing in regulating wheat diverse stress processes. Finally, the most of the homologous and duplication gene pairs such as TaRaf110/TaRaf32/TaRaf15, and TaMEKK18/ TaMEKK19/ TaMEKK20 showed the similar expression pattern under these stress treatments, suggesting that these had similar physiological functions. On the other hand, several gene pairs such as TaRaf83/TaRaf42 and TaRaf17/TaRaf74, exhibited different expression patterns under the same stress treatments, suggesting functional differentiation has been occurred in these genes and they involved in regulating different stress signaling pathways.

Fig. 8

Hierarchical clustering of the expression profiles of all 155 TaMAPKKK genes under salt stress treatments. Log10-transformed (FPKM + 1) expression values were used to create the heat map. The red or green colors represent the higher or lower relative abundance of each transcript in each sample. Fold change cutoff of two and p-value < 0.05, q-value < 0.05 were taken as statistically significant

Fig. 9

Hierarchical clustering of the expression profiles of all TaMAPKKK genes under drought and heat stress treatments. Log10-transformed (FPKM + 1) expression values were used to create the heat map. The red or green colors represent the higher or lower relative abundance of each transcript in each sample. Fold change cutoff of two and p-value < 0.05, q-value < 0.05 were taken as statistically significant

Fig. 10

Hierarchical clustering of the expression profiles of all TaMAPKKK genes under cold stress treatments. Log10-transformed (FPKM + 1) expression values were used to create the heat map. The red or green colors represent the higher or lower relative abundance of each transcript in each sample. Fold change cutoff of two and p-value < 0.05, q-value < 0.05 were taken as statistically significant

Validation of the expression of TaMAPKKKs by qRT-PCR analysis

Gene expression patterns usually provide the important clue for its function. Though expression profiles analysis based on RNA-seq data, the differentially expressed TaMAPKKKs among different tissues and stresses were obtained. To further verify the expression levels of these TaMAPKKKs, 10 differentially expressed genes in tissues and 4 salt-responsive genes were randomly selected to detect their expression levels through qRT-PCR analysis (Fig. 11). Among five tissues, TaMEKK5 was found to be expressed in all tested materials with relatively higher abundance. TaMEKK14, TaMEKK21 and TaMEKK23 were found to show a relatively high expression level in the spike comparing with other four tissues, whereas TaRaf80 exhibited the high abundance in the leaf and TaRaf87 showed high expression levels in root and leaf (Fig. 11a). Under salt stress, TaRaf34 was found to be significantly up-regulated while TaRaf22, TaRaf4 and TaMEKK29 were down-regulated under salt stress condition (Fig. 11b). The qRT-PCR results were highly consistent with that of RNA-seq data, suggesting it is reasonable to use RNA-seq data to assess the expression level of transcripts in wheat and the validated tissues-specific and salt-responsive TaMAKKK provided the candidates for further study of their function in wheat development and stress response.

Fig. 11

Validation of the expression level of TaMAPKKKs by qRT-PCR analysis. a The relative expression levels of the 10 selected TaMAPKKKs in different tissues; b The relative expression levels of the 4 TaMAPKKKs under salt treatment

This study for the first time identified and characterized the wheat MAPKKK gene family. Through a genome-wide search using the latest available wheat genome information, a total of 155 putative TaMAPKKKs were obtained, which classified into MEKK, ZIK and Raf 3 subfamilies based on the conserved motif signatures. The gene structure, conserved protein domain as well as phylogenetic relationship of these TaMAPKKKs were systematically analyzed and strongly supported the classification. The homologous genes between wheat A, B and D sub-genome and gene duplication were also investigated, which was found to be the main factors contributing to the expansion of wheat MAPKKK gene families. Furthermore, the expression profiles of wheat MAPKKKs during development and under abiotic stresses were investigated and the tissue-specific or stress-responsive TaMAPKKK genes were identified. Finally, 6 tissue-specific and 4 salt-responsive TaMAPKKK genes were selected to validate their expression level through qRT-PCR analysis, which provided the important candidates for further functional analysis of MAPKKK genes in wheat development and stress response. Our current study systematically investigated the genome organization, evolutionary features, regulatory network and expression profiles of the wheat MAPKKK gene family, which not only lay the foundation for investigating the function of these MAPKKKs, but also facilitate to reveal the regulatory and evolutionary mechanism of MAPK cascade involving in growth and development as well as in response to stresses in wheat.

Acknowledgment

This research was mainly funded by the National Natural Science Foundation of China (Grant NO: 31561143005 and 31401373), and partially supported by the 863 program (2012AA10A308) from the Chinese of Ministry of Science & Technology.

Authors’ contributions

NXJ and SWN designed the study and supervised the experiment. WM performed the bioinformatic analysis and prepared the manuscript. YH collected experimental materials. FKW and DPC conducted QPCR analysis. NXJ revised and improved the draft. All the authors read and approved the final manuscript.

Competent interest

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Ethics approval and consent to participate

Not applicable.

Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Additional files

Additional file 1:Accession number and samples information of RNA-seq data using in this study. (XLSX 10 kb)