Abstract

Background

Divergence in gene structure following gene duplication is not well understood. Gene
duplication can occur via whole-genome duplication (WGD) or single-gene duplication,
including tandem, proximal, and transposed duplications. Different modes of gene duplication
may be associated with different types, levels, and patterns of structural divergence.

Results

In Arabidopsis thaliana, we quantify structural divergence between duplicated genes by differences
in coding-region lengths and average exon lengths, and by the number of insertions/deletions
(indels) and the maximum indel length in their protein sequence alignment. Among recent
duplicates of different modes, transposed duplicates diverge most dramatically in
gene structure. In transposed duplications, parental loci tend to have longer coding regions
and exons, and smaller numbers of indels and maximum indel lengths than transposed
loci, reflecting biased structural changes in transposed duplications. Structural
divergence increases with evolutionary time for WGDs, but not transposed duplications,
possibly because of biased gene losses following transposed duplications. Structural
divergence has heterogeneous relationships with nucleotide substitution rates, but
is consistently positively correlated with gene expression divergence. The NBS-LRR
gene family shows higher-than-average levels of structural divergence.
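
The four structural measures named above can be illustrated with a minimal sketch. It is not the pipeline used in this study. It assumes that each gene is given as a list of coding-exon lengths in base pairs, that the two protein sequences have already been aligned with "-" as the gap character, that an indel is a maximal run of gaps in either sequence, and that length differences are taken as absolute values. The function names `indel_stats` and `structural_divergence` are hypothetical.

```python
import re
from statistics import mean


def indel_stats(aligned_a, aligned_b, gap="-"):
    """Return (number of indels, longest indel) for a pairwise alignment.

    Assumption: an indel is a maximal run of gap characters in either
    aligned sequence; lengths are counted in alignment columns (residues).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pattern = re.compile(re.escape(gap) + "+")
    runs = []
    for seq in (aligned_a, aligned_b):
        runs.extend(len(m.group()) for m in pattern.finditer(seq))
    return len(runs), max(runs, default=0)


def structural_divergence(exons_a, exons_b, aligned_a, aligned_b):
    """Compute four structural divergence measures for one duplicate pair.

    exons_a, exons_b: coding-exon lengths (bp) of each gene copy.
    aligned_a, aligned_b: gapped protein sequences from a pairwise alignment.
    """
    n_indels, max_indel = indel_stats(aligned_a, aligned_b)
    return {
        # Coding-region length is taken as the sum of coding-exon lengths.
        "cds_length_diff": abs(sum(exons_a) - sum(exons_b)),
        "avg_exon_length_diff": abs(mean(exons_a) - mean(exons_b)),
        "num_indels": n_indels,
        "max_indel_length": max_indel,
    }


# Toy input, invented for illustration only (not real Arabidopsis data).
result = structural_divergence(
    exons_a=[300, 150, 150],
    exons_b=[300, 240],
    aligned_a="MKT--LVAAG",
    aligned_b="MKTPQLV-AG",
)
print(result)
```

Counting gap runs rather than individual gap columns treats one multi-residue insertion or deletion as a single event, which is why the maximum indel length is reported separately from the number of indels.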

Conclusions

Our study suggests that structural divergence between duplicated genes is greatly
affected by the mechanisms of gene duplication and may not be proportional to evolutionary
time, and that certain gene families are under selection for rapid evolution of gene
structure.