Conference Papers
Hernandez-Lopez, Ana A.; Alberti, C.; Mattavelli, M.
Toward a Dynamic Threshold for Quality-Score Distortion in Reference-Based Alignment
Quality scores, an intrinsically high-entropy form of metadata, are largely responsible for the substantial size of sequence data files. Yet there is no consensus on a viable reduction of the resolution of the quality score scale, arguably because of collateral side effects. In this paper we leverage the penalty functions of the HISAT2 aligner to rebin the quality score scale in such a way as to avoid any impact on sequence alignment, identifying a distortion threshold in the process. We tested our findings on whole-genome sequence and RNA sequence data, and contrasted the results with three methods for lossy distortion of the quality scores.
Quality scores;
Reference-based alignment;
Quality score distortion;
HISAT2;
Lossy compression;
2019