Coverage Is Not Strongly Correlated With Test Suite Effectiveness

"In general this paper provides valuable empirical evidence where it was missing." --Reviewer 1

"Overall, I found this paper to be enjoyable to read. It is well written, and the research/results are well justified." --Reviewer 2

"If you manage to address [the reviewers'] points, your paper will become classic reading in the field of test coverage." --PB summary

Abstract

The coverage of a test suite is often used as a proxy for
its ability to detect faults. However, previous studies
that investigated the correlation between code coverage
and test suite effectiveness have failed to reach a
consensus about the nature and strength of the
relationship between these test suite characteristics.
Moreover, many of the studies were done with small or
synthetic programs, making it unclear whether their
results generalize to larger programs, and some of the
studies did not account for the confounding influence of
test suite size.

We have extended these studies by evaluating the
relationship between test suite size, coverage, and
effectiveness for large Java programs. Our study is the
largest to date in the literature: we generated 31,000
test suites for five systems consisting of up to 724,000
lines of source code. We then measured the statement
coverage, decision coverage, and modified condition
coverage of these suites and used mutation testing to
evaluate their fault detection effectiveness.
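
As an illustration of how mutation testing can yield an effectiveness number, the sketch below computes a mutation score: the fraction of generated mutants that at least one test in the suite detects. This is one common formulation and not necessarily the exact measurement used in the study. The names `suite_kills`, `mutation_score`, and `kills_by_test` are hypothetical and chosen only for this example.

```python
def suite_kills(suite, kills_by_test):
    """Return the set of mutant ids killed by any test in the suite.

    kills_by_test is an assumed mapping: test name -> set of mutant ids
    that the test detects when run on its own.
    """
    killed = set()
    for test in suite:
        killed |= kills_by_test.get(test, set())
    return killed


def mutation_score(killed, total_mutants):
    """Fraction of all generated mutants that the suite kills."""
    if total_mutants <= 0:
        raise ValueError("total_mutants must be positive")
    return len(killed) / total_mutants


# Hypothetical data: two tests, four mutants in total.
example_kills = {"t1": {1, 2}, "t2": {2, 3}}
example_killed = suite_kills(["t1", "t2"], example_kills)
example_score = mutation_score(example_killed, 4)
```

A mutant killed by several tests counts once, which is why the kills are combined as a set union before dividing.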

We found that there is a low to moderate correlation
between coverage and effectiveness when the number of
tests in the suite is controlled for. In addition, we
found that stronger forms of coverage do not provide
greater insight into the effectiveness of the suite. Our
results suggest that coverage, while useful for
identifying under-tested parts of a program, should not be
used as a quality target because it is not a good
indicator of test suite effectiveness.
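
To make "controlled for" concrete, one simple approach is to compare only suites that contain the same number of tests, so that suite size cannot drive both coverage and effectiveness. The sketch below groups suites by size and computes a correlation coefficient within each group. It assumes Pearson's r and a hypothetical input of `(size, coverage, effectiveness)` tuples; the abstract does not state which statistic the study used, so treat this as an illustration rather than the authors' procedure.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Returns None when either sequence has zero variance, because the
    coefficient is undefined in that case.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def correlation_by_size(suites):
    """Map each suite size to the coverage/effectiveness correlation
    computed only over suites of that size.

    suites is an iterable of (size, coverage, effectiveness) tuples.
    Sizes with fewer than two suites, or with an undefined correlation,
    are omitted from the result.
    """
    groups = defaultdict(list)
    for size, coverage, effectiveness in suites:
        groups[size].append((coverage, effectiveness))

    result = {}
    for size, pairs in sorted(groups.items()):
        if len(pairs) < 2:
            continue
        coverages, scores = zip(*pairs)
        r = pearson(coverages, scores)
        if r is not None:
            result[size] = r
    return result
```

Pooling suites of all sizes into a single correlation would mix two effects: larger suites tend to cover more code and also tend to kill more mutants. Computing the coefficient per size isolates the part of the relationship that is due to coverage itself.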

Supplementary Material

BibTeX

@inproceedings{IH14,
author={Inozemtseva, Laura and Holmes, Reid},
title={Coverage is Not Strongly Correlated with Test Suite Effectiveness},
booktitle={Proceedings of the International Conference on Software Engineering},
year={2014},
}