ESEM | October 9, 2008
On Establishing a Benchmark for Evaluating Static Analysis Prioritization and Classification Techniques
Sarah Heckman and Laurie Williams
Department of Computer Science
North Carolina State University

Research Objective
Problem
– Several false positive mitigation models have been proposed.
– Difficult to compare and evaluate different models.
Research Objective: to propose the FAULTBENCH benchmark to the software anomaly detection community for comparison and evaluation of false positive mitigation techniques.

FAULTBENCH Process
1. For each subject program
   1. Run static analysis on clean version of subject.
   2. Record original state of alert set.
   3. Prioritize or classify alerts with FP mitigation technique.
2. Inspect each alert, starting at top of prioritized list or by randomly selecting an alert predicted as actionable.
   1. If oracle says actionable, fix with specified code change.
   2. If oracle says unactionable, suppress alert.
3. After each inspection, record alert set state and rerun static analysis tool.
4. Evaluate results via evaluation metrics.
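The process above can be sketched as a small inspection loop. This is only an illustrative sketch under stated assumptions: the `Alert` fields, `run_benchmark`, and `cumulative_actionable` are hypothetical names, the oracle is reduced to a boolean per alert, rerunning the static analysis tool is simulated by dropping the inspected alert, and the metric shown is a placeholder rather than one of the benchmark's own evaluation metrics. Only the prioritization path is shown, not random selection from alerts classified as actionable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    alert_id: str      # hypothetical identifier
    tool_rank: int     # position in the tool's original output
    actionable: bool   # oracle verdict, assumed to be known in advance


def run_benchmark(alerts, prioritize):
    """Inspect alerts in prioritized order, logging the alert set after each step."""
    remaining = prioritize(list(alerts))
    history = [[a.alert_id for a in remaining]]  # original state of the alert set
    actions = []
    while remaining:
        alert = remaining.pop(0)  # top of the prioritized list
        actions.append((alert.alert_id, "fix" if alert.actionable else "suppress"))
        # A real run would rerun the static analysis tool here. This sketch
        # assumes a fix or suppression removes exactly the inspected alert.
        history.append([a.alert_id for a in remaining])
    return actions, history


def cumulative_actionable(actions):
    """Placeholder metric: actionable alerts found after each inspection."""
    found, curve = 0, []
    for _, action in actions:
        found += action == "fix"
        curve.append(found)
    return curve


def tool_order(alerts):
    """Baseline technique: keep the order in which the tool reported alerts."""
    return sorted(alerts, key=lambda a: a.tool_rank)


alerts = [Alert("A1", 1, False), Alert("A2", 2, True), Alert("A3", 3, True)]
actions, history = run_benchmark(alerts, tool_order)
curve = cumulative_actionable(actions)
```

Swapping `tool_order` for another ranking function is the point of the exercise: every technique runs against the same alerts and the same oracle, so the resulting curves are directly comparable.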

Case Study Process
1. Open subject program in Eclipse
   1. Run FindBugs on clean version of subject.
   2. Record original state of alert set.
   3. Prioritize alerts with a version of AWARE-APM.
2. Inspect each alert, starting at top of prioritized list.
   1. If oracle says actionable, fix with specified code change.
   2. If oracle says unactionable, suppress alert.
3. After each inspection, record alert set state. FindBugs should run automatically.
4. Evaluate results via evaluation metrics.
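Because the alert set is re-analyzed after every inspection, a prioritization technique can use the outcome of each fix or suppression to reorder the alerts that remain. The slide does not describe the internals of AWARE-APM, so the following is not that model. It is a toy stand-in, with a hypothetical `adaptive_inspection` function and made-up alert types, that shows only where feedback-driven re-ranking fits into the loop.

```python
from collections import defaultdict


def adaptive_inspection(alerts, oracle):
    """Inspect alerts one at a time, re-ranking the rest after each verdict.

    alerts: list of (alert_id, alert_type) pairs in the tool's order.
    oracle: dict mapping alert_id to True (actionable) or False (unactionable).
    """
    # Assumed scoring rule: +1 to an alert type per fix, -1 per suppression.
    score = defaultdict(float)
    remaining = list(alerts)
    order = []
    while remaining:
        # max() returns the first of several tied items, so ties keep tool order.
        best = max(remaining, key=lambda a: score[a[1]])
        remaining.remove(best)
        order.append(best[0])
        score[best[1]] += 1.0 if oracle[best[0]] else -1.0
    return order


# Hypothetical alert types and oracle verdicts, for illustration only.
alerts = [("A1", "NULL"), ("A2", "STYLE"), ("A3", "NULL"), ("A4", "STYLE")]
oracle = {"A1": True, "A2": False, "A3": True, "A4": False}
order = adaptive_inspection(alerts, oracle)
```

In this toy run, fixing `A1` raises the score of its type, so `A3` is inspected before `A2` even though the tool reported `A2` first.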

FAULTBENCH Limitations
– Alert oracles were chosen from third-party inspection of the source code, not by the developers.
– Generation of the optimal ordering is biased toward the tool's ordering of alerts.
– Subjects are written in Java, so results may not generalize to FP mitigation techniques for other languages.

Future Work
– Collaborate with other researchers to evolve FAULTBENCH.
– Use FAULTBENCH to compare FP mitigation techniques from the literature.