Bottom Line:
Centromeric DNA sequences alone are neither necessary nor sufficient for centromere specification. Since the N terminus is subject to diversifying selection but the DNA binding domains do not appear to be, rapidly evolving centromere sequences are unlikely to be the primary driver of CenH3 sequence diversification. At present, the functional significance of the diversity generated by both conventional protein evolution in the N terminal domain and alternative splicing remains unexplained.

Background: Centromeric DNA sequences alone are neither necessary nor sufficient for centromere specification. The centromere-specific histone, CenH3, evolves rapidly in many species, perhaps as a coevolutionary response to rapidly evolving centromeric DNA. To gain insight into CenH3 evolution, we characterized patterns of nucleotide and protein diversity among diploids and allopolyploids within three diverse angiosperm genera, Brassica, Oryza, and Gossypium (cotton), with a focus on evidence for diversifying selection in the various domains of the CenH3 gene. In addition, we compared expression profiles and alternative splicing patterns for CenH3 in representatives of each genus.

Conclusions: Since the N terminus is subject to diversifying selection but the DNA binding domains do not appear to be, rapidly evolving centromere sequences are unlikely to be the primary driver of CenH3 sequence diversification. At present, the functional significance of the diversity generated by both conventional protein evolution in the N terminal domain and alternative splicing remains unexplained.

Figure 4: Homoeolog expression of CenH3 in Gossypium species. AD1-AD5 denote G. hirsutum, G. barbadense, G. tomentosum, G. mustelinum, and G. darwinii, respectively. The L and B suffixes denote leaf and bud tissue, respectively. A2 vs D5 is a comparison of the total level of expression in the model diploid progenitor species, G. arboreum (A) and G. raimondii (D). A2xD5F1 is an F1 hybrid between G. arboreum and G. raimondii. 2_A2D1 is a colchicine-doubled F1 hybrid between G. arboreum (A) and G. thurberi (D). AD1_maxxa and AD1_yuc are domesticated and wild accessions of G. hirsutum, respectively. Only partial data were generated for AD1B and AD4B. Error bars represent standard deviations. A single asterisk marks samples that were statistically significant at P < 0.05, and two asterisks mark samples significant at P < 0.01.

Mentions:
The three methods used to analyze expression yielded differing degrees of homoeolog bias (where AT and DT denote the two homoeologs): the bias was moderate in the RNA-seq data and more extreme with the other methods (Figure 4). AT homoeolog expression was favored in every species, tissue, and test (Figure 4). With RNA-seq data we compared the total expression levels of the model progenitor diploids (A2 vs D5), a synthetic polyploid (2(A2D1)), and wild and domesticated accessions of AD1 (yucatanense and Maxxa, respectively), none of which showed a significant difference in expression. The only sample with a significant expression bias was the F1(A2D5), biased at 87.5% (P ≤ 0.05). Homoeolog expression did not differ significantly (t test) between leaf and bud tissue, except that AT homoeolog expression was significantly higher in leaves for G. barbadense (P = 1.1024 × 10^-10).
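As an illustration of what a homoeolog bias percentage means, the following minimal Python sketch computes the share of homoeolog-assigned reads that come from the AT copy and tests it against an equal-expression expectation. It is not the authors' pipeline: the exact binomial test is swapped in here only because it is easy to show, and the function names and read counts are hypothetical.

```python
from math import exp, lgamma, log


def homoeolog_bias(at_reads: int, dt_reads: int) -> float:
    """Fraction of homoeolog-assigned reads that map to the AT copy."""
    total = at_reads + dt_reads
    if total == 0:
        raise ValueError("no homoeolog-specific reads")
    return at_reads / total


def binomial_two_sided_p(at_reads: int, dt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for the AT read share against p_null.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one. p_null must lie strictly between 0 and 1.
    """
    n = at_reads + dt_reads

    def pmf(k: int) -> float:
        # Work in log space so large read counts do not overflow.
        log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return exp(log_choose + k * log(p_null) + (n - k) * log(1 - p_null))

    observed = pmf(at_reads)
    tail = sum(p for p in (pmf(k) for k in range(n + 1)) if p <= observed * (1 + 1e-9))
    return min(1.0, tail)


# Hypothetical counts chosen only to give an 87.5% AT share.
at_count, dt_count = 35, 5
print(f"AT share: {homoeolog_bias(at_count, dt_count):.1%}")
print(f"two-sided p: {binomial_two_sided_p(at_count, dt_count):.3g}")
```

A share of 0.5 corresponds to equal expression of the two homoeologs, so values above 0.5 indicate AT bias. Replicate-level comparisons such as leaf versus bud would instead call for a test across samples, as with the t test mentioned above.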
