Abstract

Plant hormones modulate plant growth, development, and defense. However, many aspects of the origin and evolution of plant hormone signaling pathways remain obscure. Here, we use a comparative genomic and phylogenetic approach to investigate the origin and evolution of nine major plant hormone (abscisic acid, auxin, brassinosteroid, cytokinin, ethylene, gibberellin, jasmonate, salicylic acid, and strigolactone) signaling pathways. Our multispecies genome-wide analysis reveals that: (1) auxin, cytokinin, and strigolactone signaling pathways originated in charophyte lineages; (2) abscisic acid, jasmonate, and salicylic acid signaling pathways arose in the last common ancestor of land plants; (3) gibberellin signaling evolved after the divergence of bryophytes from land plants; (4) the canonical brassinosteroid signaling originated before the emergence of angiosperms but likely after the split of gymnosperms and angiosperms; and (5) the origin of the canonical ethylene signaling pathway shortly postdates the emergence of angiosperms. Our findings might have important implications for understanding the molecular mechanisms underlying the emergence of land plants.

ABA plays an important role in the responses of plants to environmental stresses, especially drought (Fujii et al., 2009). The PYRABACTIN RESISTANCE (PYR)/PYRABACTIN RESISTANCE-LIKE (PYL)/REGULATORY COMPONENT OF ABSCISIC ACID RECEPTOR (RCAR) family of START proteins (PYRs for short) has been identified as ABA receptors (Fujii et al., 2009). In the absence of ABA, the positive regulator SUBCLASS III SUCROSE NONFERMENTING1-RELATED PROTEIN KINASE2 (SnRK2) is inactivated by group A PROTEIN PHOSPHATASE2C (PP2C) through dephosphorylation. The binding of ABA to receptors breaks the interaction of group A PP2C and SnRK2, which activates the function of SnRK2 by autophosphorylation. The activated SnRK2 phosphorylates the ABA-responsive transcription factors, such as ABSCISIC ACID-RESPONSIVE ELEMENT-BINDING FACTORs (ABFs) and ABSCISIC ACID-INSENSITIVE5 (ABI5). Moreover, ABSCISIC ACID-INSENSITIVE5-BINDING PROTEINs (AFPs) facilitate the degradation of ABI5 (Kelley and Estelle, 2012). There are three different types of ABA transporters: ABSCISIC ACID-IMPORTING TRANSPORTER (AIT), ATP-BINDING CASSETTE G25 (ABCG25), and PLEIOTROPIC DRUG RESISTANCE (PDR; Boursiac et al., 2013).

ETH regulates both development and defense processes of plants, such as fruit ripening, senescence, abscission, and the responses to biotic and abiotic stresses (Qiao et al., 2012). ETH is perceived by its receptor ETHYLENE RESPONSE1 (ETR1; Chang et al., 1993). In the absence of ETH, the receptors promote the kinase activity of CONSTITUTIVE TRIPLE RESPONSE1 (CTR1). CTR1 phosphorylates the positive regulator ETHYLENE INSENSITIVE2 (EIN2), repressing the ETH responses. EIN3 is degraded via the SCF E3 ligase complex with ETHYLENE INSENSITIVE3-BINDING F-BOX PROTEIN1/2 (EBFs), whereas EIN2 is degraded via the SCF complex with ETHYLENE INSENSITIVE2-TARGETING PROTEIN1/2 (ETP1/2). ETH promotes the accumulation of EIN2 by down-regulating ETP1/2. In the presence of ETH, the receptors and CTR1 are inactivated, triggering the dephosphorylation and cleavage of EIN2. The C-terminal domain of EIN2 is translocated to the nucleus, where it stabilizes EIN3. EIN3 triggers the ETH responses by binding to ETHYLENE RESPONSE FACTOR1/2 (Kendrick and Chang, 2008). MITOGEN-ACTIVATED PROTEIN KINASE KINASE KINASE (MAP3K) is a negative regulator in the ethylene response (Kumar and Sharma, 2014).

GAs regulate the germination, elongation growth, and sex determination of plants (Murase et al., 2008). When GAs are absent, their receptor GIBBERELLIN-INSENSITIVE DWARF1 (GID1) is in the passive state and the repressor DELLA-CONTAINED PROTEIN (DELLA) inhibits the activity of the transcription factor PHYTOCHROME-INTERACTING FACTOR (PIF) and thus represses the GA responses. The bioactive GAs can change the conformation of their receptor GID1, which increases the affinity between GID1 and DELLA and forms the GA-GID1-DELLA complex (Murase et al., 2008; Shimada et al., 2008). The complex increases the interaction between DELLA and SLEEPY1/2 (SLY1/2). Consequently, DELLA is degraded by the ubiquitin-proteasome pathway via the SCFSLY1/2 complex. The degradation of DELLA releases PIFs, permitting the GA responses (Santner and Estelle, 2009). SPINDLY negatively regulates the GA responses, potentially by stabilizing the DELLA protein (Sun, 2008).

JAs control plant defense against wounding, herbivores, and certain pathogens and are also crucial for plant fertility and reproduction (Sheard et al., 2010). JASMONATE RESISTANT1 (JAR1; also known as GH3-11) conjugates JAs to produce their biologically active form. The active JAs are perceived by their receptor CORONATINE INSENSITIVE1 (COI1; Sheard et al., 2010). In the absence of active JAs, JASMONATE ZIM-DOMAIN PROTEINs (JAZs), NOVEL INTERACTOR OF JASMONATE ZIM-DOMAIN PROTEIN (NINJA), and TPL work together to bind the MYC-RELATED TRANSCRIPTIONAL ACTIVATOR (MYC) transcription factors and inhibit their function (Pauwels et al., 2010). Upon binding the active JAs, COI1 interacts with JAZs, which leads to the degradation of JAZs by the SCFCOI1 complex and the release of MYCs (Pauwels et al., 2010). Consequently, the JA-responsive genes are transcribed by MYCs.

SA regulates the immunity of plants and induces systemic acquired resistance when plants are facing pathogen challenge (Fu et al., 2012). The SA response mechanism was characterized in the model plant Arabidopsis (Arabidopsis thaliana) recently (Fu et al., 2012). NONEXPRESSER OF PATHOGENESIS-RELATED GENE1 (NPR1) is a master regulator of systemic acquired resistance. NPR3 and NPR4 have been identified as the SA receptors (Fu et al., 2012). They perceive SA with different affinities: NPR4 has a higher affinity for SA than NPR3. When the SA concentration is low, NPR4, rather than NPR3, binds SA, which in turn inhibits the degradation of NPR1 and allows the basal resistance. The pathogen challenge increases the SA level, forming an SA concentration gradient (Fu et al., 2012). In the infection sites with the highest SA level, SA binds to NPR3, which promotes NPR1 degradation to allow programmed cell death. In the cells with lower SA level, NPR1 accumulates and activates the transcription factors, such as WRKYs and TGACG SEQUENCE-SPECIFIC BINDING PROTEINs (TGAs), to turn on the defense-related genes (Ülker and Somssich, 2004; Fu et al., 2012).

The accumulating plant genome-scale data and the understanding of plant hormone signaling biology offer a great opportunity to investigate the origin and evolution of the plant hormone signaling machinery. We focus on nine major plant hormone signaling pathways (i.e. ABA, AUX, BR, CK, ETH, GA, JA, SA, and SL signaling pathways) because these signaling pathways have been well characterized (Santner and Estelle, 2009; Fig. 1). We performed multispecies comparative genomic and phylogenetic analyses of the important components of each plant hormone signaling pathway using genome-scale data of plant species that represent all the major plant lineages (Supplemental Table S1). Our study provides a comprehensive view of the origin and evolutionary mechanisms of these major plant hormone signaling pathways.

RESULTS

Plant Species and Ortholog Identification

Our data set (Supplemental Table S1) includes the complete genome sequences of nine species that represent all the major plant lineages (Chlamydomonas reinhardtii, Volvox carteri, Ostreococcus tauri, Cyanidioschyzon merolae, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Arabidopsis, and rice). We also used the transcriptome data of one liverwort (Marchantia polymorpha; Sharma et al., 2013) and four charophytes (Nitella hyalina, Nitella mirabilis, Penium margaritaceum, and Spirogyra pratensis; Timme et al., 2012). Charophytes are a group of freshwater green algae from which land plants descended (Qiu et al., 2006; Finet et al., 2010). Previous studies of plant hormone signaling evolution (Rensing et al., 2008; Pils and Heyl, 2009; De Smet et al., 2011) used P. patens as the most basal lineage of land plants. Here, we used the liverwort M. polymorpha as the most basal plant lineage, as liverworts are the sister group of all the other land plants (including P. patens) and represent the earliest branching land plant lineage (Qiu et al., 2006).

To determine the origin of plant hormone signaling machinery, we identified the orthologs of plant hormone signaling components in the 14 plant species (Supplemental Table S1), because orthologs are more likely to retain similar functions in the course of evolution (Tatusov et al., 1997; Chen and Zhang, 2012). Homologs are genes/proteins that share a certain degree of similarity with each other, whereas orthologs are genes/proteins of different species that originated from a single gene of the last common ancestor (Chen and Zhang, 2012). Homologs are not necessarily orthologs, because homologs might share sequence similarity only in certain regions (e.g. conserved domains). Thus, high similarity between the proteins of different species does not guarantee that they are orthologs and/or share similar functions (Tatusov et al., 1997; Chervitz et al., 1998). Here, we employed a combination of similarity search, phylogenetic reconstruction, and conserved domain analysis approaches (see “Materials and Methods”) to systematically identify the orthologs of plant hormone signaling components. First, we used the BLAST algorithms with the Arabidopsis (occasionally P. patens) proteins as queries to identify their homologs in the genomes of the other plant species. Next, we performed phylogenetic analysis to identify potential orthologs. Third, because many protein domains have a degree of evolutionary independence and can combine with each other in different forms (Chervitz et al., 1998), we compared their conserved domain architecture with that of Arabidopsis proteins. Only the potential orthologs containing the same conserved domain architecture as the Arabidopsis proteins are defined as orthologs in this study. The orthologs identified using this approach are likely to have similar functionality, given that homologs sharing the same domain architecture tend to have similar functions (Hegyi and Gerstein, 2001).
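The third step of this procedure, keeping only candidates whose conserved domain architecture matches the Arabidopsis reference, can be sketched in a few lines of Python. This is an illustrative sketch, not the pipeline used in the study: the `Candidate` class, the `filter_orthologs` function, and the protein identifiers are hypothetical, and treating "same architecture" as an identical ordered list of domain names is an assumption. The toy input mirrors the ETP1/2 case reported later in the text, where the angiosperm proteins carry both F-box and FBA_1 domains but the P. patens homologs lack FBA_1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A BLAST hit that also grouped with the query in the phylogeny (hypothetical record)."""
    species: str
    protein_id: str
    domains: tuple  # conserved-domain names ordered from N- to C-terminus


def filter_orthologs(reference_domains, candidates):
    """Keep candidates whose ordered domain list equals the reference architecture.

    Assumption: "same architecture" means the same domains in the same order.
    """
    reference = tuple(reference_domains)
    return [c for c in candidates if c.domains == reference]


# Toy data: identifiers are invented; domain content follows the ETP1/2 description.
reference = ("F-box", "FBA_1")
candidates = [
    Candidate("rice", "hypothetical_rice_1", ("F-box", "FBA_1")),
    Candidate("P. patens", "hypothetical_moss_1", ("F-box",)),
]

orthologs = filter_orthologs(reference, candidates)
print([c.species for c in orthologs])
```

Under this rule the moss protein remains a homolog but is not counted as an ortholog, which is the distinction drawn between rings and solid circles in the distribution figure.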
Because gene expression is usually tissue specific, developmental stage specific, and/or stress induced, transcriptome data (especially when its coverage is low) sometimes cannot recover all the gene information of an organism’s genome (Wang et al., 2014). Therefore, regarding the ortholog identification with transcriptome data of M. polymorpha and four charophytes (N. hyalina, N. mirabilis, P. margaritaceum, and S. pratensis), detecting an ortholog in the transcriptome indicates that it is present in the genome, but failing to detect one does not necessarily indicate that it is absent from the genome (Wang et al., 2014).
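The asymmetry between genome and transcriptome evidence amounts to a three-valued call. The sketch below is only an illustration of that reasoning; the `Call` enum and `call_presence` function are hypothetical names, and treating a miss in a complete genome sequence as a genuine absence is a simplifying assumption (assembly or annotation gaps are ignored).

```python
from enum import Enum


class Call(Enum):
    PRESENT = "present"
    ABSENT = "absent"
    UNDETERMINED = "undetermined"


def call_presence(ortholog_found: bool, data_type: str) -> Call:
    """Interpret a search result according to the kind of sequence data searched."""
    if data_type not in ("genome", "transcriptome"):
        raise ValueError(f"unknown data type: {data_type!r}")
    if ortholog_found:
        # A hit is informative regardless of the data source.
        return Call.PRESENT
    if data_type == "genome":
        # Assumption: a complete genome with no hit is treated as a true absence.
        return Call.ABSENT
    # A miss in a transcriptome may reflect low coverage or unexpressed genes.
    return Call.UNDETERMINED


print(call_presence(False, "transcriptome").value)
```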

Our comparative analyses identified the orthologs of all the core components mediating the ABA response (ABFs, AFPs, group A PP2C, PYRs, and SnRK2s) and two ABA transporters (ABCG25 and AITs) in the land plant genomes (Fig. 2). Algal genomes encode most of the ABA response and transport components, including ABFs, group A PP2C, SnRK2s, PDRs, ABCG25, and AITs. Because PDRs can be identified in the genome of K. flaccidum, their absence in other charophytes and M. polymorpha is possibly due to the short nature of the assembled transcriptome sequences. The orthologs of AFPs and the ABA receptor PYR proteins were identified only in the land plant genomes, suggesting that both proteins first occurred in land plants. Interestingly, the predicted structures of the P. patens and M. polymorpha ABA receptor orthologs share significant similarity with the experimental structure of the Arabidopsis protein (TM scores = 0.307 and 0.333, respectively; Supplemental Fig. S1). The binding sites of the land plant ABA receptor orthologs are well conserved (Supplemental Fig. S2). These results suggest that the ABA signaling machinery originated at least in the last common ancestor of land plants (Fig. 3). Consistent with our hypothesis, the group A PP2C-mediated ABA response mechanism appears to function in M. polymorpha (Tougane et al., 2010).
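For readers unfamiliar with the TM score used here and in later sections, the following sketch shows how the standard score is computed for one fixed superposition of two structures. It is not the tool used in this study, which is not named in this section. The function names are hypothetical, the input is assumed to be a list of distances (in angstroms) between aligned residue pairs, and the lower bound of 0.5 on the distance scale is an assumption borrowed from common implementations. The full TM score is the maximum of this quantity over all superpositions, which the sketch does not search for.

```python
def d0_scale(target_length: int) -> float:
    """Length-dependent distance scale: 1.24 * (L - 15)^(1/3) - 1.8."""
    if target_length <= 15:
        # Assumption: clamp to 0.5 for very short chains, where the formula breaks down.
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length: int) -> float:
    """TM score for one superposition, normalized by the target protein length."""
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    d0 = d0_scale(target_length)
    # Each aligned pair contributes between 0 and 1; unaligned residues contribute 0.
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


# Perfectly superposed residues covering half of a 100-residue target.
print(tm_score([0.0] * 50, 100))
```

Because the sum is divided by the full target length, a truncated transcript that aligns over only part of the reference protein is penalized even when the aligned region fits well, which is worth keeping in mind when comparing scores from transcriptome-derived models.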

The distribution of plant hormone signaling components in plants. The rings and solid circles indicate the presence of homologs and orthologs, respectively. Superscript 1 indicates that a significant hit can be gained by similarity searches using the P. patens proteins as BLAST queries. Protein names in blue and purple indicate those mediating plant hormone signal transduction and transport, respectively. Columns in green and light blue indicate genome sequences and transcriptome data used for similarity searches, respectively.

The origin of nine major plant hormone signaling pathways in plants. The green bars indicate the emergence of all the components of each specific hormone signaling machinery. Branches in blue and purple indicate algae and land plants, respectively. The phylogenetic relationship among the plant species used in this study is based on Qiu et al. (2006), Finet et al. (2010), and Bowman (2013).

A previous analysis of the moss (P. patens) genome suggests that the AUX signaling machinery originated before the divergence of moss and vascular plants (Rensing et al., 2008). Further analyses of the charophyte EST library identified several, but not all, components (including ARFs, AUX/IAAs, and PINs) of the AUX signaling and transport machinery, leading to the hypothesis that the canonical AUX signaling was, at least in part, already present in the charophyte genomes (Lau et al., 2009; De Smet et al., 2011; Viaene et al., 2013). Our comparative genomic analyses show that the genomes of K. flaccidum and land plants encode the orthologs of all the core components (except ABP1 in M. polymorpha and ARFs in K. flaccidum) mediating the AUX response and transport (Fig. 2). The ABP1 protein ortholog was not found in the transcriptome assembly of M. polymorpha, but it can be found in several charophytes (N. mirabilis, P. margaritaceum, and K. flaccidum). This distribution pattern might be due to either the specific loss of ABP1 in the lineage of M. polymorpha or, more likely, the low coverage or short nature of M. polymorpha transcriptome data. As for the ARF protein, the ARF ortholog was not identified in the K. flaccidum genome but was found in the transcriptome of N. mirabilis (Fig. 2; Supplemental Fig. S1). Interestingly, the AUX/IAA ortholog of K. flaccidum contained all the typical conserved domains (I, II, III, and IV) of AUX/IAA proteins identified in previous studies (Supplemental Fig. S2); typical domains I, II, III, and IV act as transcriptional repressor, degradation motif, dimerization domain, and dimerization domain, respectively (Paponov et al., 2009). The AUX/IAA ortholog of N. mirabilis clusters together with land plant AUX/IAA proteins, although it lacks domains I and II, which might be due to the short nature of the assembled transcriptome sequences. Nevertheless, all the core AUX signaling components seem to be already present in charophytes.
Moreover, the main AUX biosynthesis pathway (TAA/YUC) already existed in the genome of K. flaccidum (Wang et al., 2014). The AUX hormone has been found to be present in algal lineages and to have effects on algal development (Tarakhovskaya et al., 2007; Lau et al., 2009). These lines of evidence suggest that AUX signaling likely originated in charophytes and thus accounts for the mechanism of the AUX responses in algal species (Fig. 3).

Our comparative genomic analysis shows that the orthologs of eight of the 10 BR signal transduction components (i.e. BAK1, BIN2, BRI1, BSU1, BZRs, CDG1, GRFs, and PP2A proteins) are present in the algal genomes (Fig. 2), indicating that these proteins arose before the emergence of land plants. The orthologs of BKI1 are found only in the genomes of angiosperms (Arabidopsis and rice). Furthermore, we identified a BKI1 protein ortholog in the genome of Amborella trichopoda, which is the sister species to all other extant angiosperms (Soltis et al., 2008; Finet et al., 2010; Amborella Genome Project, 2013). However, no BKI1 homolog is found in the available EST data of Pinaceae species, although the absence of homologs in the Pinaceae EST data does not necessarily indicate the absence of homologs in the Pinaceae genomes. Nevertheless, the BKI1 protein seems to have originated de novo no later than the last common ancestor of extant angiosperms but after the divergence of lycophytes from land plants (likely after the split between gymnosperms and angiosperms). The orthologs of BR receptors, BRI1 and BAK1, were identified in the genomes of K. flaccidum and land plants. The predicted structures of BRI1 orthologs of K. flaccidum and land plants share significant similarity with the Arabidopsis protein structure (TM score = 0.194). The BR-binding site is highly conserved among the land plant BRI1 orthologs (Supplemental Fig. S2). These results indicate that the canonical BR signaling machinery arose before the emergence of angiosperms and possibly after the split between gymnosperms and angiosperms (Fig. 3).

The orthologs of all the core CK signaling components were identified in the genomes of land plants and the charophyte K. flaccidum. The complete set of core CK signaling and transport components could not be identified in the transcriptomes of the other four charophytes (Fig. 2). Again, this might be due to the low coverage of transcriptome data and/or the short nature of the assembled transcriptome sequences.

A previous study suggests that the CHASE domain of AHKs appeared only after plants conquered the land (Pils and Heyl, 2009). Further analyses of the S. pratensis EST data provide hints at the origin of CK receptors in charophytes (Gruhn et al., 2014). Our analysis here shows that the AHK orthologs are present in the charophytes N. mirabilis and K. flaccidum (Fig. 2), although the binding sites seem not to be well conserved among AHK orthologs (they are not well conserved among Arabidopsis AHKs; Supplemental Fig. S2). The orthologs of the CHASE domain-containing AHKs were not identified in other noncharophyte algal species (Supplemental Fig. S1). Therefore, the CHASE domain seems to have been acquired by the progenitor of AHKs in charophytes. To elucidate the origin of the CHASE domain in plants, we reconstructed a phylogenetic tree of all currently available CHASE domain sequences. Our phylogenetic tree shows that the CHASE domain sequences of charophyte and land plant AHKs cluster together (Shimodaira-Hasegawa test [SH-like] value = 0.808) and nest within the bacterial diversity (Fig. 4). The phylogenetic analysis and the absence of the CHASE domain in red algal and other green algal genomes suggest that plant species acquired the CHASE domain not through the ancient primary endosymbiosis that gave birth to plastids (Reyes-Prieto et al., 2007) but through a relatively recent horizontal gene transfer event from bacteria to charophyte species. An alternative hypothesis explaining the observed phylogenetic pattern would involve numerous independent losses of the CHASE domain in red algae, green algae, and animals, which is highly unlikely. Taken together, our results suggest that CK signaling originated in charophyte species (Fig. 3).

The phylogenetic relationship of the CHASE domains of prokaryotic and eukaryotic origin. The phylogeny is an approximate maximum-likelihood (ML) tree reconstructed using FastTree 2 based on the CHASE domain sequences of both prokaryotes and eukaryotes. SH-like values are shown near selected nodes.

The orthologs of the core ETH signaling components were identified within the land plant genomes (except CTR1, EBFs, and ETP1/2; Fig. 2). As for EBFs, their orthologs were found in the genomes of several charophyte species and all the land plants except M. polymorpha. The EIN2 protein is present in the genomes of land plants and several charophytes. The C terminus of EIN2 contains a putative nuclear localization signal (NLS) for nuclear translocation (Wen et al., 2012). However, the alignment of the C terminus of EIN2 shows that the NLS exists only in land plants (except M. polymorpha; Supplemental Fig. S2). The absence of EBFs and the NLS of EIN2 in M. polymorpha might be readily explained by the low coverage/short nature of the transcriptome data. In the case of ETP1/2, a conserved domain analysis shows that the Arabidopsis and rice ETP1/2-like proteins have two domains, F-box and FBA_1 (Supplemental Table S2; Supplemental Fig. S1). The FBA_1 domain interacts directly with EIN2 and is important for the ETP1/2 proteins to recognize their substrates (Qiao et al., 2009). The ETP1/2 homologs in the S. moellendorffii and P. patens genomes do not have the FBA_1 domain. Moreover, we also found that the homologs of ETP1/2 in A. trichopoda (the most basal angiosperm lineage) do not contain the FBA_1 domain. These results suggest that the FBA_1 domain was gained by the ETP1/2 progenitor shortly after the emergence of angiosperms but before the evolutionary split of monocots and eudicots. Therefore, the origin of canonical ETH signaling shortly postdates the emergence of angiosperms but predates the divergence of monocots and eudicots (Fig. 3).

It appears that the receiver domain of ETH receptors is well conserved among algal species and land plants (Supplemental Fig. S2). The ETH-binding activity of ETH receptor-like proteins has been detected in all land plants, including M. polymorpha and P. patens (Wang et al., 2006). The ETH-mediated submergence responses also have been found in mosses (Yasumura et al., 2012). These facts indicate that there might be an alternative ETH response pathway that does not depend on EIN2-ETP1/2 in moss or even some charophytes.

Higher plants were thought to have acquired the ETH receptors through endosymbiosis, because similar proteins are found in the cyanobacterium from which the plant chloroplast arose (Mount and Chang, 2002; Schaller et al., 2011). However, our phylogenetic analyses of their protein domains show that the cyanobacterial proteins are not the closest relatives of ETH receptors of plants (Supplemental Fig. S3) and thus do not support the plastid origin of ETH receptors.

GA Signaling Originated after the Split of Bryophytes from Land Plants

Our comparative analysis shows that the land plant (except M. polymorpha) genomes encode the orthologs of all the core components of GA signaling. The homologs of DELLA in the M. polymorpha genome do not share the same domain architecture as the Arabidopsis proteins (Supplemental Fig. S1). But it is still unclear whether the DELLA domain is truly missing in the homologs or if the absence of the DELLA domain is due to the short nature of the assembled transcriptome sequences. Interestingly, the orthologs of DELLA and GID1 proteins of P. patens contain no protein-protein interaction regions (DELLA, LExLE, and VHYNP motifs in the DELLA protein and residues Ile-24, Phe-27, and Tyr-31 in the GID1A protein; Supplemental Fig. S2; Murase et al., 2008), which is consistent with previous experimental studies showing that GID1 and DELLA do not interact with each other in P. patens (Hirano et al., 2007; Yasumura et al., 2007; Hayashi et al., 2010). It follows that GA signaling originated after the evolutionary split of bryophytes and land plants (Yasumura et al., 2007).
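A first-pass check for the three DELLA-protein interaction motifs named above can be written as a simple pattern scan. This is a rough sketch rather than the alignment-based comparison reported in Supplemental Fig. S2: exact literal matching is a simplifying assumption (real orthologs may carry substitutions that only an alignment would reveal), the `motif_report` function is a hypothetical name, and both test sequences are invented strings, not real proteins.

```python
import re

# Motif names follow the text; "x" in LExLE is read as any single residue.
DELLA_MOTIFS = {
    "DELLA": re.compile(r"DELLA"),
    "LExLE": re.compile(r"LE.LE"),
    "VHYNP": re.compile(r"VHYNP"),
}


def motif_report(sequence: str) -> dict:
    """Report which of the three motifs occur anywhere in a protein sequence."""
    seq = sequence.upper()
    return {name: bool(pattern.search(seq)) for name, pattern in DELLA_MOTIFS.items()}


# Invented toy sequences for illustration only.
with_motifs = "MKRDHDELLAVLGYKVRSSEMAEVALKLEQLETMMVHYNPSDLS"
without_motifs = "MKRDHDAAAAVLGYKVRSSEMAEVA"

print(motif_report(with_motifs))
print(motif_report(without_motifs))
```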

It appears that the algal genomes encode the orthologs of three JA signaling components (COI1, TPL, and MYCs), while the land plant genomes encode the orthologs of all the core JA signaling components. The TM scores of the predicted structures of P. patens and M. polymorpha COI1 orthologs and the experimental structure of the Arabidopsis protein are 0.167 and 0.215, respectively. Moreover, all the predicted structures contain a loop region that forms an abrupt kink in the middle, which plays a key role in binding JA (Sheard et al., 2010). Sequence alignment shows that the binding sites are well conserved between P. patens and M. polymorpha COI1 orthologs and Arabidopsis COI1 proteins (Supplemental Fig. S2). Therefore, we hypothesize that JA signaling originated in the last common ancestor of land plants (Fig. 3).
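The inference pattern used throughout these sections, placing the origin of a pathway at the earliest lineage in which every core component is present, can be sketched as follows. This is a deliberately simplified illustration: the function name is hypothetical, the list of six JA components is an assumption based on the pathway description in the introduction, lineages are collapsed into two groups, and secondary losses and undetermined transcriptome calls are ignored.

```python
def earliest_complete_lineage(lineages, presence, required):
    """Return the first lineage, in branching order, that has every required component."""
    required = set(required)
    for lineage in lineages:
        if required <= presence.get(lineage, set()):
            return lineage
    return None


# Assumed core JA components (from the pathway description, not an official list).
ja_core = {"JAR1", "COI1", "JAZ", "NINJA", "TPL", "MYC"}

# Presence summary following the text: algae have three components, land plants all.
presence = {
    "algae": {"COI1", "TPL", "MYC"},
    "land plants": set(ja_core),
}

origin = earliest_complete_lineage(["algae", "land plants"], presence, ja_core)
print(origin)
```

The same function applied to a pathway whose components are all found in charophytes would return the charophyte lineage, which is how the AUX, CK, and SL results differ from the JA result.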

Our analysis shows that the algal genomes encode the orthologs of at least two SA signaling components (i.e. TGAs and WRKY proteins) but not the orthologs of NPR proteins. Initial conserved domain analysis shows that the BTB domain is absent in the M. polymorpha NPR-like protein. However, a close analysis of the BTB domain and M. polymorpha NPR sequences suggests that the M. polymorpha NPR protein contains a partial BTB domain (Supplemental Fig. S2), which might be due to the short nature of the assembled transcriptome sequences. Thus, the NPR protein ortholog seems to be present in the M. polymorpha genome, and the land plant genomes encode the orthologs of all the core components of SA signaling. These results suggest that SA signal transduction originated in the last common ancestor of land plants (Fig. 3).

Interestingly, our phylogenetic analyses show that both the paralogous NPR1/2 proteins and the paralogous NPR3/4 proteins were generated by a gene duplication event in the last common ancestor of the Brassicaceae (Fig. 5). The NPR1/2 progenitor diverged from the NPR3/4 progenitor after the split between lycophytes and angiosperms (Fig. 5). Of note, the monocot genomes have two copies of NPR3/4-related proteins, which were generated by a gene duplication event different from the one generating Brassicaceae NPR3/4 proteins (Fig. 5). Within the model plant Arabidopsis, the SA signaling mechanism is as follows: NPR1 is a master regulator of systemic acquired resistance, while NPR3 and NPR4 act as the SA receptors (Fu et al., 2012). When the SA concentration is low, NPR4, but not NPR3, binds SA, inhibiting the degradation of NPR1. During pathogen infection, in the infection site with the highest SA level, SA binds to NPR3, promoting NPR1 degradation to allow programmed cell death; in cells with lower SA levels, NPR1 accumulates and activates the transcription factors to turn on the defense-related genes (Ülker and Somssich, 2004; Fu et al., 2012). However, we found that the NPR copy number differs in land plant genomes. For example, there are two copies of the NPR4-related protein and two copies of the NPR1-related protein in the Thellungiella halophila genome, only one copy of the NPR3/4-related protein in the grape (Vitis vinifera) genome, one copy of the NPR1/2-related protein in the monocot genomes, and one and two copies of the NPR protein in the S. moellendorffii and P. patens genomes, respectively (Fig. 5). In particular, with only one NPR3/4-related protein in the genome of grape, the SA signaling logic characterized in Arabidopsis cannot work, given that the NPR3/4 proteins play different roles under different SA concentrations. It seems that species with different NPR copy numbers might have SA response mechanisms that differ from that of Arabidopsis.

The phylogenetic relationship and domain architecture of SA receptors. The phylogeny is an ML tree reconstructed based on homologs of SA receptors, NPRs. Bootstrap values (ML/neighbor-joining) are shown near the nodes. Domain architecture was mapped near the protein. Different domains are represented by rectangles with different colors.

Our systematic analysis shows that the genomes of green algae and land plants encode the orthologs of MAX2 (D3), D14, and SMXLs (D53); that the TCP transcription factor BRC1 protein was identified in the charophyte and land plant genomes; and that the orthologs of all core components of SL signal transduction were identified in the charophyte N. mirabilis and land plants. The predicted structures of the SL receptor orthologs of P. patens, M. polymorpha, and three charophytes (N. mirabilis, K. flaccidum, and S. pratensis) are similar to the experimental structure of the rice D14 protein (TM scores > 0.17). Moreover, the binding sites of these SL receptor orthologs are well conserved (Supplemental Fig. S2). Therefore, our results indicate that SL signaling originated in the charophyte lineages (Fig. 3). In accordance with our comparative genomic analyses of the SL signaling components, SLs were detected in charophytes and liverworts, but not in other green algae, and have important functions in the rhizoid elongation of charophytes and liverworts (Delaux et al., 2012).

DISCUSSION

Our comparative genomic and phylogenetic analyses indicate the following: (1) the AUX, CK, and SL signaling pathways originated in charophytes; (2) the ABA, JA, and SA signaling pathways evolved in the most recent common ancestor of land plants; (3) GA signaling originated after the divergence of bryophytes from land plants; (4) the canonical BR signaling originated in the last common ancestor of angiosperms, likely after the split between gymnosperms and angiosperms; and (5) the canonical ETH signaling originated shortly after the emergence of angiosperms but before the split of monocots and eudicots (Fig. 3). Although different plant hormone signaling mechanisms appear to have originated at different time points in the evolutionary history of plants, many signaling components are already encoded by the algal genomes.

AUX, JA, ETH, SL, and GA signaling pathways are all dependent on the SCF-mediated ubiquitin-proteasome system (Santner and Estelle, 2009; Jiang et al., 2013). Interestingly, AUX and JA signaling pathways share a common mechanism of hormone sensing and response. (1) Both JA and AUX receptors are F-box proteins. The COI1 and TIR1 proteins share highly similar crystal structures (TM score = 0.481). Our similarity search shows that the JA receptor (COI1) shares the highest sequence identity with the AUX receptors (TIR1/AFBs) within the Arabidopsis genome. Phylogenetic analysis of TIR1/AFBs and COI1-related proteins shows that JA and AUX receptors are the closest relatives and descend from gene duplication events deep in the last common ancestor of land plants (Fig. 6). (2) JA and AUX share a similar signal transduction logic (Santner and Estelle, 2009; Pauwels et al., 2010). Both hormones function as molecular glue between the receptors and the repressors. Both signaling pathways require the SCF-dependent degradation of the repressors (Santner and Estelle, 2009; Pauwels et al., 2010). The repressors, the AUX/IAA proteins of AUX signaling and the NINJA proteins of JA signaling, have an EAR motif that can bind the TPL and TPL-related proteins (Pauwels et al., 2010). (3) Conjugate jasmonoyl-Ile is the bioactive form of the hormone JA (Ludwig-Müller, 2011). The JAR1 protein is the only GH3 protein in Arabidopsis that can conjugate JA to Ile, while some other GH3 proteins conjugate AUX but not JA (Ludwig-Müller, 2011; Supplemental Fig. S1). The P. patens genome has two copies of the GH3 gene, PpGH3-1 and PpGH3-2, both of which can catalyze the conjugation of the hormones JA and AUX to amino acids and are thought to arise from a gene duplication of a progenitor gene synthesizing conjugates with JA and AUX (Ludwig-Müller, 2011). 
These facts and results suggest that JA and AUX signaling pathways likely originated from a single ancient hormone signaling pathway, which evolved into the AUX and JA signaling pathways when the ancient receptors (TIR1/AFBs and COI1 progenitor) duplicated and diverged to bind different hormones (Supplemental Fig. S2) and regulate different target genes. Further experiments are needed to determine whether this ancient hormone signaling pathway responded to AUX, JA, or both.

The phylogenetic relationship and domain architecture of JA and AUX receptors. The phylogeny is an ML tree reconstructed based on homologs of JA and AUX receptors. Bootstrap values (ML/neighbor-joining) are shown near the nodes. Domain architecture was mapped near the protein. Different domains are represented by rectangles with different colors. Protein structures of JA and AUX receptor orthologs are shown near the corresponding proteins.

Distinct from the JA and AUX signaling pathways, the F-box proteins SLY1/2 and D3 do not act as the receptors of the GA and SL signaling pathways and are not closely related to the AUX and JA receptors (Gagne et al., 2002). Surprisingly, the crystal structures of the GA receptor (GID1) of Arabidopsis and the SL receptor (D14) of rice are similar (TM score = 0.187), although no detectable sequence similarity was found. It is possible that there is deep homology between the GID1 and D14 proteins. Nevertheless, these two signaling pathways appear to be independently derived from the SCF-mediated ubiquitination pathway.

Both CK and ETH signaling pathways undergo multistep phosphorylation at the onset of hormone responses. All the CK and ETH receptors are His kinases and share three common domains, HATPase, HisKA, and receiver domains, resembling the components of TCS (Chang et al., 1993; Pils and Heyl, 2009; Schaller et al., 2011). Our phylogenetic analyses of these three domains reveal that CK and ETH receptors occupy different phylogenetic positions and do not form a monophyletic group (Supplemental Fig. S3). Besides the three TCS-related protein domains, CK and ETH receptors each have their own specific domain, CHASE and GAF, respectively. It follows that CK and ETH signaling pathways evolved from independent elaborations of the TCS. ETH signaling is also mediated by SCF-dependent ubiquitination (Mount and Chang, 2002). Therefore, ETH signaling seems to have originated by combining an ancient TCS with an SCF-mediated ubiquitin-proteasome pathway.

Some Signaling Components Underwent Gene Family Expansion in Vascular Plants

Some plant hormone signaling components seem to have undergone significant gene family expansion in vascular plants. The following are some representative examples. (1) The number of PYR proteins (ABA receptors) increases in higher plants: the genomes of P. patens and S. moellendorffii encode two and five members, whereas the genomes of rice and Arabidopsis encode 10 and 14 members, respectively (Supplemental Fig. S1). (2) The ETP1/2-related proteins of the ETH signaling pathway arose shortly after the emergence of angiosperms. The rice genome encodes 11 ETP1/2-related proteins, while the Arabidopsis genome encodes 83 ETP1/2-related proteins (Supplemental Fig. S1). (3) The AUX/IAAs of AUX signaling arose in charophytes. While the genomes of P. patens and S. moellendorffii encode two and four AUX/IAAs, the genomes of Arabidopsis and rice encode 25 and 27 AUX/IAAs, respectively (Supplemental Fig. S1). These plant hormone signaling components seem to have undergone rapid gene family expansion during the evolutionary history of higher plants. Although the exact advantages of these gene family expansions are unclear, it is conceivable that they contributed substantially to the evolution of features specific to higher plants.
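As a rough illustration of the scale of these expansions, the counts quoted above can be converted into fold changes between species. The following is a minimal sketch that uses only the numbers given in this paragraph; the dictionary layout and the function name fold_change are assumptions made for illustration and are not part of the original analysis.

```python
# Gene family sizes as quoted in the text (Supplemental Fig. S1).
# A species is omitted from a family when the text gives no count for it.
FAMILY_SIZES = {
    "PYR": {"P. patens": 2, "S. moellendorffii": 5, "rice": 10, "Arabidopsis": 14},
    "AUX/IAA": {"P. patens": 2, "S. moellendorffii": 4, "rice": 27, "Arabidopsis": 25},
    "ETP1/2-related": {"rice": 11, "Arabidopsis": 83},
}


def fold_change(family, baseline, target):
    """Return the target/baseline copy-number ratio, or None if a count is missing."""
    counts = FAMILY_SIZES[family]
    if baseline not in counts or target not in counts:
        return None
    return counts[target] / counts[baseline]


if __name__ == "__main__":
    for family in FAMILY_SIZES:
        ratio = fold_change(family, "P. patens", "Arabidopsis")
        label = "no P. patens count given" if ratio is None else f"{ratio:.1f}-fold"
        print(f"{family}: {label}")
```

Ratios of this kind only summarize copy number; they say nothing about when the duplications occurred or whether the extra copies are functional.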

Molecular Tinkering Might Be an Important Mechanism for Plant Hormone Evolution

Domain rearrangement has been suggested to play a role in the evolution of the AUX hormone signaling proteins (Kalluri et al., 2007; Finet et al., 2013). For the proteins mediating plant hormone signal transduction or transport, a total of 58 different protein domains were identified. Among these protein domains, 18 appear to be plant specific, while the others seem to have a more ancient origin (Supplemental Table S2). Moreover, the same domain might be recruited in different hormone signaling pathways and even in different components of the same signaling pathway. For example, the Pkinase domain is embedded within MAP3K of ETH signaling, SnRK2s of ABA signaling, and BRI1, BAK1, CDG1, and BIN2 of BR signaling. The same domain might exist in different domain architectures: BSKs of BR signaling and the CTR1 of ETH signaling both have a Pkinase_Tyr domain, but this domain is combined with the TPR_11 domain in BSKs and combined with the EDR1 domain in CTR1; both ABFs of ABA signaling and TGAs of SA signaling have a bZIP_1 domain, but TGAs have an additional domain, DOG1. These examples indicate that domain shuffling did occur frequently during the origin and evolution of plant hormone signaling. We believe that evolutionary tinkering with the preexisting protein domains might represent an important evolutionary mechanism for plant hormone signaling evolution.
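The domain reuse described above can be made explicit by indexing components by domain. The sketch below is illustrative only: it lists just the domains named in this paragraph (not complete Pfam architectures), and the data structure and the function name domains_shared_across_pathways are assumptions.

```python
from collections import defaultdict

# (pathway, component) -> domains named in the text for that component.
# These sets are deliberately incomplete: only domains mentioned above appear.
COMPONENT_DOMAINS = {
    ("ETH", "MAP3K"): {"Pkinase"},
    ("ABA", "SnRK2s"): {"Pkinase"},
    ("BR", "BRI1"): {"Pkinase"},
    ("BR", "BAK1"): {"Pkinase"},
    ("BR", "CDG1"): {"Pkinase"},
    ("BR", "BIN2"): {"Pkinase"},
    ("BR", "BSKs"): {"Pkinase_Tyr", "TPR_11"},
    ("ETH", "CTR1"): {"Pkinase_Tyr", "EDR1"},
    ("ABA", "ABFs"): {"bZIP_1"},
    ("SA", "TGAs"): {"bZIP_1", "DOG1"},
}


def domains_shared_across_pathways(component_domains):
    """Map each domain found in more than one pathway to its sorted pathway list."""
    pathways_by_domain = defaultdict(set)
    for (pathway, _component), domains in component_domains.items():
        for domain in domains:
            pathways_by_domain[domain].add(pathway)
    return {
        domain: sorted(pathways)
        for domain, pathways in pathways_by_domain.items()
        if len(pathways) > 1
    }


if __name__ == "__main__":
    for domain, pathways in sorted(domains_shared_across_pathways(COMPONENT_DOMAINS).items()):
        print(domain, "->", ", ".join(pathways))
```

Domains that appear in a single pathway in this toy table (TPR_11, EDR1, DOG1) are the partners that distinguish otherwise similar architectures.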

Implications in the Emergence of Land Plants

The emergence of land plants represents one of the most important innovations in the history of life. When transitioning from an aquatic to a terrestrial environment, plants evolved from organisms with a simple body of a few cells into ones with complex organs and tissues and encountered new environmental stresses, such as drought, UV light, and pathogens (Kenrick and Crane, 1997). Little is known about the biological mechanisms underlying these morphological changes and the adaptation of plants to land. We found that three major hormone (ABA, JA, and SA) signaling pathways might have emerged while plants were colonizing the land. All of these hormone signaling pathways regulate defense against biotic and abiotic stresses: SA modulates plant immunity to pathogens; JAs control plant defense against wounding, herbivores, and certain pathogens; and ABA plays an important role in plant responses to environmental stresses, especially drought. Therefore, it is likely that these signaling pathways emerged and evolved into their current forms as a consequence of selection pressure exerted by the biotic or abiotic stresses encountered in terrestrial environments (Hauser et al., 2011). The innovation of these plant hormone signaling pathways might provide a potential mechanism for the emergence of land plants.

Caveats

Our analyses come with several caveats. First, the presence of orthologs of hormone signaling components does not necessarily indicate that these orthologs work together as a genuine signaling pathway. For example, orthologs of all the core components of GA signaling are present in the P. patens genome, but DELLA and GID1 do not interact with each other in the presence of GA (Hirano et al., 2007; Yasumura et al., 2007; Hayashi et al., 2010). Further experiments are needed to explore the interactions of these signaling component orthologs. Second, hormone biology has been studied mainly through the genetic analysis of angiosperms, especially the model plant Arabidopsis. Hormone signaling might take a different or alternative form in plants other than Arabidopsis (such as the SA and ETH signaling pathways discussed above). More work is needed to characterize plant hormone signaling pathways in lower plants and/or nonmodel plants, which will provide important insights into their biology, origin, and evolution. Third, we used a combination of similarity search, phylogenetic reconstruction, and conserved domain analysis to identify orthologs (discussed above). Only homologs with the same domain architecture as the Arabidopsis proteins are considered orthologs. This approach is a conservative one, since some domains might not play an important role in hormone signaling. However, these domains are likely to be functionally important; otherwise, they would not be conserved and identified as domains. Last, we used the transcriptome data of several charophytes and the liverwort M. polymorpha in this study. As discussed above, because gene expression is usually tissue specific, developmental stage specific, and/or stress induced, transcriptome data sometimes cannot cover all the genetic information of the genomes.
Because the coverage of transcriptome data is sometimes low and the assembled sequences are relatively short, we cannot conclusively infer the presence/absence of the orthologs of hormone signaling components in these species. Further sequencing of the genomes of additional charophytes and M. polymorpha could help us understand the origin and evolution of plant hormone signaling pathways.

Sequence Similarity Search

To identify the homologs of the core proteins implicated in the transport and signal transduction of AUX, ABA, BR, CK, ETH, GA, JA, SA, and SL hormones, we performed similarity searches using the BLAST (BLASTP and TBLASTN) algorithms with the Arabidopsis proteins as queries and an E-value threshold of 10^-5. To identify distant homologs, we performed further rounds of BLAST (BLASTP and TBLASTN) searches with the homologs identified in P. patens.
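The E-value cutoff described above can be applied to BLAST results in a few lines. The following is a minimal sketch, not the authors' pipeline: it assumes the hits were saved in the default 12-column BLAST tabular format (-outfmt 6, where the E-value is the 11th column), it assumes the threshold is inclusive, and the function name filter_hits and the example identifiers are hypothetical.

```python
E_VALUE_THRESHOLD = 1e-5  # threshold stated in the text


def filter_hits(tabular_lines, max_evalue=E_VALUE_THRESHOLD):
    """Return (query, subject, evalue) for tabular BLAST hits at or below max_evalue.

    Assumes the default outfmt 6 column order:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue <= max_evalue:  # inclusive cutoff is an assumption
            hits.append((query, subject, evalue))
    return hits


if __name__ == "__main__":
    # Made-up example rows; the identifiers are placeholders, not real accessions.
    example = [
        "query1\tsubjectA\t52.1\t480\t210\t6\t5\t470\t12\t485\t2e-30\t310",
        "query1\tsubjectB\t28.4\t95\t60\t3\t40\t130\t22\t115\t0.003\t35.2",
    ]
    for query, subject, evalue in filter_hits(example):
        print(query, subject, evalue)
```

For the further search rounds mentioned above, the same filter would apply unchanged, with the P. patens homologs used as queries in place of the Arabidopsis proteins.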

Phylogenetic Analysis

Protein domain sequence alignments were retrieved from the Pfam database (Finn et al., 2014). Because most of the protein domain alignments are extremely large, we inferred a phylogenetic relationship for each protein domain using FastTree 2 with the CAT model (Price et al., 2010). The phylogenetic support for each split was assessed using SH-like values (Price et al., 2010). To explore the detailed evolutionary history of each hormone signaling component, homologous sequences were retrieved from Phytozome version 9.1 (http://phytozome.jgi.doe.gov/pz/portal.html). The translated protein sequences were aligned using MUSCLE (Edgar, 2004) and then manually edited. Neighbor-joining phylogenetic trees were reconstructed using MEGA5 (Tamura et al., 2011), and the phylogenetic support for each split was evaluated with 1,000 bootstrap replicates. We also used MEGA5 (Tamura et al., 2011) to reconstruct ML phylogenetic trees, for which the phylogenetic support for each split was evaluated with 100 bootstrap replicates.
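For readers unfamiliar with bootstrap support, the resampling step behind those replicate counts can be sketched as follows. This is a generic illustration of the nonparametric bootstrap (alignment columns drawn with replacement), not a description of how MEGA5 implements it; the function names and the toy alignment are assumptions, and the tree-building step that would follow each replicate is omitted.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudo-alignment built by sampling columns with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][i] for i in columns) for name in names}


def bootstrap_replicates(alignment, n_replicates, seed=0):
    """Return a list of n_replicates resampled alignments (seeded for repeatability)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


if __name__ == "__main__":
    # Toy aligned sequences; names and residues are invented for illustration.
    toy = {"seqA": "MKV-LT", "seqB": "MRVALT", "seqC": "MKIALS"}
    replicates = bootstrap_replicates(toy, n_replicates=100)
    print(len(replicates), "replicates; first:", replicates[0])
```

A tree would then be inferred from each replicate, and the support for a split reported as the fraction of replicate trees that contain it.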

Ortholog Identification

We used phylogenetic analysis to identify potential orthologs. The conserved domain architecture of each potential ortholog was determined using Pfam (Finn et al., 2014). Only those candidates with the same conserved domain architecture as the corresponding Arabidopsis proteins were defined as orthologs.
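The domain architecture criterion can be expressed as a simple filter. The sketch below is a hedged illustration rather than the authors' code: it assumes that an architecture is the ordered list of Pfam domain names along the protein and that "same architecture" means an exact match of that list; the function names and candidate identifiers are hypothetical, and the reference architecture is only an example.

```python
def same_architecture(candidate_domains, reference_domains):
    """True if both proteins carry the same domains in the same order (assumed criterion)."""
    return list(candidate_domains) == list(reference_domains)


def call_orthologs(candidates, reference_domains):
    """Return the names of candidates whose architecture matches the reference."""
    return [
        name
        for name, domains in candidates.items()
        if same_architecture(domains, reference_domains)
    ]


if __name__ == "__main__":
    # Example reference: a protein with a bZIP_1 domain followed by a DOG1 domain.
    reference = ["bZIP_1", "DOG1"]
    # Hypothetical candidates taken from a phylogenetic clade of interest.
    candidates = {
        "candidate_1": ["bZIP_1", "DOG1"],  # kept: identical architecture
        "candidate_2": ["bZIP_1"],          # dropped: missing DOG1
        "candidate_3": ["DOG1", "bZIP_1"],  # dropped: different order
    }
    print(call_orthologs(candidates, reference))
```

An exact-match rule is conservative: a candidate that lacks one domain is excluded even if the missing domain is not essential for signaling.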

Footnotes

C.W. and G.-Z.H. performed most of the experiments; Y.L. provided technical assistance; G.-Z.H. designed the experiments and analyzed the data; G.-Z.H. and S.-S.L. conceived the project; G.-Z.H. wrote the article with input from C.W.; S.-S.L. supervised the project and complemented the writing.

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Guan-Zhu Han (guanzhu@email.arizona.edu).

1 This work was supported by the National Key Technologies R&D Program of China (grant no. 2011BAD35B03) and the Priority Academic Program Development of Jiangsu Higher Education Institutions.