Diagnosing New Faults Using Mutants and Prior Faults

Literature indicates that 20% of a program’s code is responsible for 80% of the faults, and 50-90% of the field failures are rediscoveries of previous faults. Despite this, identification of faulty code can consume 30-40% time of error correction. Previous fault-discovery techniques focusing on field failures either require many pass-fail traces, discover only crashing failures, or identify faulty “files” (which are of large granularity) as origin of the source code. In our earlier work (the F007 approach), we identify faulty “functions” (which are of small granularity) in a field trace by using earlier resolved traces of the same release, which limits it to the known faulty functions. This paper overcomes this limitation by proposing a new “strategy” to identify new and old faulty functions using F007. This strategy uses failed traces of mutants (artificial faults) and failed traces of prior releases to identify faulty functions in the traces of succeeding release. Our results on two UNIX utilities (i.e., Flex and Gzip) show that faulty functions in the traces of the majority (60-85%) of failures of a new software release can be identified by reviewing only 20% of the code. If compared against prior techniques then this is a notable improvement in terms of contextual knowledge required and accuracy in the discovery of finer-grain fault origin.