Abstract

Background

Sesame, Sesamum indicum L., is considered the queen of oilseeds for its high oil content and quality, and
is grown widely in tropical and subtropical areas as an important source of oil and
protein. However, the molecular biology of sesame is largely unexplored.

Results

Here, we report a high-quality genome sequence of sesame assembled de novo with a contig N50 of 52.2 kb and a scaffold N50 of 2.1 Mb, containing an estimated
27,148 genes. The results reveal a novel, independent whole genome duplication event and the
absence of the Toll/interleukin-1 receptor domain in resistance genes. Candidate genes
and oil biosynthetic pathways contributing to high oil content were discovered by
comparative genomic and transcriptomic analyses. These revealed the expansion of type
1 lipid transfer genes by tandem duplication, the contraction of lipid degradation
genes, and the differential expression of essential genes in the triacylglycerol biosynthesis
pathway, particularly in the early stage of seed development. Resequencing data from
29 sesame accessions from 12 countries suggested that the high genetic diversity of
lipid-related genes might be associated with the wide variation in oil content. Additionally,
the results shed light on the pivotal stage of seed development, oil accumulation
and potential key genes for sesamin production, an important pharmacological constituent
of sesame.

Conclusions

Because sesame is an important species from the order Lamiales and a high-oil crop, its genome
will facilitate future research on the evolution of eudicots, as well as the study
of lipid biosynthesis and potential genetic improvement of sesame.