Title: Alignment Metric Accuracy

Abstract: We propose a metric for the space of multiple sequence alignments that can be
used to compare two alignments to each other. In the case where one of the
alignments is a reference alignment, the resulting accuracy measure improves
upon previous approaches, and provides a balanced assessment of the fidelity of
both matches and gaps. Furthermore, in the case where a reference alignment is
not available, we provide empirical evidence that the distance from an
alignment produced by one program to predicted alignments from other programs
can be used as a control for multiple alignment experiments. In particular, we
show that low-accuracy alignments can be effectively identified and discarded.
We also show that in the case of pairwise sequence alignment, it is possible to
find an alignment that maximizes the expected value of our accuracy measure.
Unlike previous approaches based on expected-accuracy alignment, which tend to
maximize sensitivity at the expense of specificity, our method is able to
identify unalignable sequence, thereby increasing overall accuracy. In
addition, the algorithm allows for control of the sensitivity/specificity
tradeoff via the adjustment of a single parameter. These results are confirmed
with simulation studies that show that unalignable regions can be distinguished
from homologous, conserved sequences. Finally, we propose an extension of the
pairwise alignment method to multiple alignment. Our method, which we call
AMAP, outperforms existing protein sequence multiple alignment programs on
benchmark datasets. A webserver and software downloads are available at
this http URL .
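
The abstract does not give the formula for the metric, so the following is only a minimal sketch of one plausible reading for the pairwise case: for every residue in either sequence, record what it is aligned to (a residue index in the other sequence, or a gap), and report the fraction of residues on which the predicted and reference alignments agree. Because gapped residues count as much as matched ones, matches and gaps are assessed together. The function names `partner_map` and `alignment_accuracy`, the `-` gap character, and the normalization by total sequence length are assumptions made for illustration, not details taken from the paper or from the AMAP software.

```python
def partner_map(row_a, row_b, gap="-"):
    """Map each residue to its aligned partner in the other sequence.

    row_a and row_b are the two gapped rows of one pairwise alignment.
    Returns two lists, one per sequence; entry k is the index of the
    residue aligned to residue k, or None if residue k faces a gap.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    partners_a, partners_b = [], []
    i = j = 0  # residue counters for sequence A and sequence B
    for ca, cb in zip(row_a, row_b):
        a_res, b_res = ca != gap, cb != gap
        if a_res and b_res:
            partners_a.append(j)
            partners_b.append(i)
        elif a_res:
            partners_a.append(None)
        elif b_res:
            partners_b.append(None)
        i += a_res
        j += b_res
    return partners_a, partners_b


def alignment_accuracy(pred, ref, gap="-"):
    """Fraction of residues whose partner (or gap) is the same in both alignments.

    pred and ref are (row_a, row_b) tuples over the same two sequences.
    This is an illustrative definition, assumed rather than quoted.
    """
    pred_a, pred_b = partner_map(*pred, gap=gap)
    ref_a, ref_b = partner_map(*ref, gap=gap)
    if len(pred_a) != len(ref_a) or len(pred_b) != len(ref_b):
        raise ValueError("alignments must cover the same sequences")
    total = len(ref_a) + len(ref_b)
    if total == 0:
        return 1.0
    agree = sum(p == r for p, r in zip(pred_a + pred_b, ref_a + ref_b))
    return agree / total


if __name__ == "__main__":
    reference = ("AC-GT", "ACTGT")
    predicted = ("ACG-T", "ACTGT")
    print(alignment_accuracy(predicted, reference))
```

Under this definition, one minus the returned value behaves as a distance between two alignments of the same sequences: it is zero when they are identical and symmetric in its arguments, which is the property needed when comparing predicted alignments from different programs without a reference.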