Predict Defects with Data Mining and Machine Learning

Quality assurance professionals have an arsenal of tried-and-true techniques for assessing and improving quality. Many of these revolve around the concept of risk. When quality professionals focus on risk, they generally focus on areas where defects would be the most damaging, rather than areas in which defects are most likely to be found. In recent years, the maturation of big data mining and predictive analysis tools has made it practical to predict where defects in an application are likely to reside. Stephen Frein describes his recent experiments with data mining and machine learning tools that can predict where defects are likely to appear. Learn how word clouds can point out the user stories most likely to harbor defects. Explore ways to identify and characterize your most defect-prone configuration items. Learn how modern analysis tools can reveal statistical patterns that are beyond the reach of human intuition and insight, and how these patterns can alert us to where defects may appear.

Stephen Frein is a director of quality assurance at Comcast, where his team is responsible for the quality of various high-profile web properties, including Comcast.com. As an adjunct professor at Drexel University, Stephen delivers soporific lectures on database development and IT management at both the undergraduate and graduate levels. For fifteen years he has been leading development and testing teams—occasionally well—mostly by dint of accidents he cannot reliably replicate.