Bug Prediction with Neural Nets using Regression- and Classification-based Approaches

Abstract

Bugs are often hard to identify, and developers spend a large amount of time locating and fixing them. Bug prediction strives to identify code defects using machine learning and statistical analysis, thereby decreasing the time spent on bug localization. With bug prediction, awareness of bugs can be increased and software quality can be improved in significantly less time. Most modern bug prediction tools use machine learning models, and recent advances in machine learning have produced new models that further improve the possibilities and performance of bug prediction. In our studies, we test the performance of “Doc2Vec”, a current model used to vectorize plain text, on source code to perform classification. Instead of relying on code metrics, we analyze and vectorize plain-text source code and try to identify bugs based on similarity to learned paragraph vectors. Testing two different implementations of the Doc2Vec model, we find that no usable results can be achieved by using plain-text classification models for bug prediction. Even after abstracting the code and tuning the parameters of our model, all experiments deliver a constant 50% accuracy, so none of the models learns anything. The experiments clearly show that code should not be treated as plain text and that the input should instead contain more code-specific information, such as metrics about the code.

Our second set of experiments consists of a 3-layer feed-forward neural network that performs classification- and regression-based approaches on code metrics, using datasets that contain a discrete number of bugs as the response variable. Many earlier experiments have successfully used code metrics to perform classification. In our studies, we compare the performance of a standard regression model and a standard multi-class classification model to the models “classification by regression” (CbR) and “regression by classification” (RbC). In the RbC model, we use the output of classification to predict an accurate number of bugs and then calculate the root-mean-square error (RMSE). In the CbR model, we use the output of regression to perform binary classification and calculate the area under the receiver operating characteristic curve (ROC AUC) to compare the results. In our experiments, we find that a neural network delivers better results when using the CbR model on an estimated defect count than when using standard multi-class classification. Our results also suggest that the RMSE can be decreased significantly by using the RbC model instead of standard regression.
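
The two evaluation paths for CbR and RbC can be illustrated with a minimal sketch. It is not the implementation used in the experiments: the data values are hypothetical, the helper names `rmse` and `roc_auc` are our own, and we assume that a predicted class index is read directly as a bug count (RbC) and that the regression output is used as a ranking score for the binary label "has at least one bug" (CbR).

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted bug counts."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def roc_auc(labels, scores):
    """ROC AUC computed as the fraction of (buggy, clean) pairs in which
    the buggy item receives the higher score; ties count as 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical example data (not taken from the study).
true_bugs = [0, 0, 1, 3, 0, 2]                       # discrete response variable
regression_output = [0.2, 0.9, 0.7, 2.6, 0.1, 1.8]   # estimated defect counts
class_output = [0, 1, 1, 3, 0, 1]                    # predicted class = bug count (assumption)

# CbR: treat the regression output as a score for the binary label "buggy".
is_buggy = [1 if b > 0 else 0 for b in true_bugs]
cbr_auc = roc_auc(is_buggy, regression_output)

# RbC: treat the predicted class as a bug count and measure the RMSE.
rbc_rmse = rmse(true_bugs, class_output)

print(f"CbR ROC AUC: {cbr_auc:.3f}")
print(f"RbC RMSE:    {rbc_rmse:.3f}")
```

Under these assumptions, both models are scored on the same data with the metric normally reserved for the other task, which is what allows CbR to be compared with standard classification and RbC with standard regression.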