Bottom Line:
A total of 160 pipelines (5 types of generator × 4 normalizations × 8 difference measures) are compared. The best analysis pipelines are selected based on linearity of the differential expression estimation and the area under the ROC curve. They point out the exons with differential expression or internal splicing, even when the read counts do not show this.

Abstract:
The informational content of RNA sequencing is currently far from being completely explored. Most analyses focus on processing tables of counts or on isoform deconvolution via exon junctions. This article presents a comparison of several techniques that can be used to estimate differential expression of exons or small genomic regions of expression, based on the shapes of their coverage functions. The problem is defined as finding the differentially expressed exons between two samples, using local expression profile normalization and statistical measures to spot the differences between two profile shapes. Initial experiments were done using synthetic data, and real data modified with synthetically created differential patterns. Then, 160 pipelines (5 types of generator × 4 normalizations × 8 difference measures) are compared. As a result, the best analysis pipelines are selected based on linearity of the differential expression estimation and the area under the ROC curve. These platform-independent techniques have been implemented in the Bioconductor package rnaSeqMap. They point out the exons with differential expression or internal splicing, even when the read counts do not show this. The areas of application include significant difference searches, splicing identification algorithms and finding suitable regions for qPCR primers.
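The idea of comparing coverage shapes rather than read counts can be illustrated with a minimal Python sketch. It is not the rnaSeqMap API and does not reproduce any of the eight difference measures from the article: the unit-area normalization, the Kolmogorov-Smirnov-like distance, the function names and the coverage vectors are all assumptions chosen for illustration. The last line only checks the pipeline count stated in the abstract.

```python
from itertools import accumulate, product


def normalize_to_unit_area(coverage):
    """Scale a per-base coverage vector so that it sums to 1 (assumed normalization)."""
    total = sum(coverage)
    if total == 0:
        return [0.0] * len(coverage)
    return [c / total for c in coverage]


def ks_like_distance(profile_a, profile_b):
    """Largest gap between the cumulative normalized profiles (illustrative measure)."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same region length")
    cum_a = accumulate(normalize_to_unit_area(profile_a))
    cum_b = accumulate(normalize_to_unit_area(profile_b))
    return max(abs(a - b) for a, b in zip(cum_a, cum_b))


# Hypothetical per-base coverage of one exon in two samples.
sample_1 = [10, 10, 10, 10, 10, 10]  # flat profile
sample_2 = [20, 20, 20, 0, 0, 0]     # same total, only the first half covered

count_diff = sum(sample_1) - sum(sample_2)         # totals are identical
shape_diff = ks_like_distance(sample_1, sample_2)  # shapes clearly differ

# 5 generators x 4 normalizations x 8 difference measures, as in the abstract.
n_pipelines = len(list(product(range(5), range(4), range(8))))
```

In this toy case the total coverage is the same in both samples, so a count-based comparison sees no change, while the shape-based distance is large. This is the kind of situation the abstract describes as differential expression or internal splicing that read counts may not show.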

gkr1249-F7: Heatmap for all the pipelines run on 3000 real exons. Rows represent pipelines, columns represent exonic regions, and the color depicts the log2 of the difference between two real samples given by the specific pipeline.

Mentions:
The results of testing all the pipelines on two samples of real data are presented as a heatmap (Figure 7). Although the pipelines produce results over different ranges on the log scale, they tend to agree on most of the regions. For about 10% of the regions (right side of the heatmap), the pipelines give highly spurious numeric results. In particular, MHD1 and MHD2 tend to give results contrary to the other measures for this fraction of genomic regions.
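One simple way to look for regions where pipelines contradict each other is to compare the signs of their log2 differences per region. The sketch below is only an assumption about how such a check could be done, not the procedure used in the article. The matrix values, the 0.25 threshold and the function names are hypothetical.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def disagreement_fraction(column):
    """Fraction of pipelines whose sign opposes the majority sign in one region."""
    signs = [sign(v) for v in column]
    majority = 1 if signs.count(1) >= signs.count(-1) else -1
    return sum(1 for s in signs if s == -majority) / len(signs)


# Hypothetical log2 differences: rows are pipelines, columns are exonic regions.
log2_diff = [
    [1.2, -0.8, 0.5],
    [0.9, -1.1, 0.7],
    [1.5, -0.6, -2.0],
]

# zip(*matrix) iterates over columns, i.e. over regions.
per_region = [disagreement_fraction(col) for col in zip(*log2_diff)]

# Flag regions where more than a quarter of the pipelines oppose the majority.
flagged = [i for i, frac in enumerate(per_region) if frac > 0.25]
```

Here only the third region is flagged, because one of the three pipelines reports a change in the opposite direction to the other two.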
