Abstract

Background

While many genome sequences are complete, transcriptomes are less well characterized.
We used both genome-scale tiling arrays and massively parallel sequencing to map the
Caenorhabditis elegans transcriptome across development. We then used this framework to identify transcriptome
changes in animals lacking the nonsense-mediated decay (NMD) pathway.

Results

We find that while the majority of detectable transcripts map to known gene structures,
>5% of transcribed regions fall outside current gene annotations. We show that >40%
of these are novel exons. Using both technologies to assess isoform complexity, we
estimate that >17% of genes change isoform usage across development. Next, we examined how
the transcriptome is perturbed in animals lacking NMD. NMD prevents expression of
truncated proteins by degrading transcripts containing premature termination codons.
We find that approximately 20% of genes produce transcripts that appear to be NMD
targets. While most of these arise from splicing errors, NMD targets are enriched
for transcripts containing open reading frames upstream of the predicted translational
start (uORFs). We identify a relationship between the Kozak consensus surrounding
the true start codon and the degree to which uORF-containing transcripts are targeted
by NMD, and we speculate that translational efficiency may be coupled to transcript turnover
via the NMD pathway for some transcripts.

Conclusions

We generated a high-resolution transcriptome map for C. elegans and used it to identify endogenous targets of NMD. We find that these transcripts
arise principally through splicing errors, strengthening the prevailing view that
splicing and NMD are highly interlinked processes.