A systematic and comprehensive investigation of methods to build and evaluate fault prediction models

Published: 01 January 2010

Abstract

This paper describes a study, performed in an industrial setting, that builds predictive models to identify the parts of a Java system with a high fault probability. The system under consideration is constantly evolving, as several releases a year are shipped to customers. Developers usually have limited resources for testing and would like to devote extra resources to the fault-prone parts of the system. The main research focus of this paper is to systematically assess three aspects of how to build and evaluate fault-proneness models in the context of this large Java legacy-system development project: (1) compare a range of data mining and machine learning techniques for building fault-proneness models, (2) assess the impact of using different metric sets, such as source code structural measures and change/fault history (process measures), and (3) compare several alternative ways of assessing model performance, in terms of (i) confusion-matrix criteria such as accuracy and precision/recall, (ii) ranking ability, using the area under the receiver operating characteristic curve (ROC area), and (iii) our proposed cost-effectiveness measure (CE). The results of the study indicate that the choice of fault-proneness modeling technique has limited impact on the resulting classification accuracy or cost-effectiveness. There are, however, large differences between the individual metric sets in terms of cost-effectiveness, and although the process measures are among the most expensive to collect, including them as candidate measures significantly improves the prediction models compared with models that include only structural measures and/or their deltas between releases, both in terms of ROC area and in terms of CE. Further, we observe that which model is considered best depends strongly on the criteria used to evaluate and compare the models. The regular confusion-matrix criteria, although popular, are not clearly related to the problem at hand, namely the cost-effectiveness of using fault-proneness prediction models to focus verification efforts and deliver software with fewer faults at lower cost.
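To make the three families of evaluation criteria concrete, the following is a minimal, self-contained Python sketch that computes each of them for a toy set of classes. The confusion-matrix metrics and the rank-based ROC area follow their standard definitions; the cost_effectiveness function is only an illustrative surrogate, assuming CE can be read as the area between the cumulative percentage-of-faults versus percentage-of-LOC curve (inspecting classes in decreasing order of predicted fault probability) and the random-selection diagonal. The paper's exact CE definition may differ, and all function names here are hypothetical.

    def confusion_matrix_metrics(y_true, y_pred):
        """Accuracy, precision and recall from binary labels (1 = faulty)."""
        tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
        tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
        fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
        fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
        accuracy = (tp + tn) / len(y_true)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        return accuracy, precision, recall

    def roc_area(y_true, scores):
        """Rank-based ROC area: the probability that a randomly chosen
        faulty class gets a higher predicted probability than a randomly
        chosen fault-free class (ties count as half)."""
        pos = [s for t, s in zip(y_true, scores) if t == 1]
        neg = [s for t, s in zip(y_true, scores) if t == 0]
        wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
                   for p in pos for n in neg)
        return wins / (len(pos) * len(neg))

    def cost_effectiveness(loc, faults, scores):
        """Illustrative CE surrogate (assumption, not the paper's exact
        formula): trapezoidal area between the cumulative fault-percentage
        curve and the random baseline when classes are inspected in
        decreasing order of predicted fault probability. Positive values
        mean the ranking beats random selection."""
        order = sorted(range(len(scores)), key=lambda i: -scores[i])
        total_loc, total_faults = sum(loc), sum(faults)
        cum_loc = cum_faults = prev_x = prev_y = area = 0.0
        for i in order:
            cum_loc += loc[i]
            cum_faults += faults[i]
            x, y = cum_loc / total_loc, cum_faults / total_faults
            area += (x - prev_x) * (y + prev_y) / 2.0  # trapezoid rule
            prev_x, prev_y = x, y
        return area - 0.5  # 0.5 is the area under the random baseline

    # Toy example: four classes with sizes, fault counts and model scores.
    y_true = [1, 0, 1, 0]
    scores = [0.9, 0.2, 0.6, 0.4]
    y_pred = [int(s >= 0.5) for s in scores]
    print(confusion_matrix_metrics(y_true, y_pred))  # (1.0, 1.0, 1.0)
    print(roc_area(y_true, scores))                  # 1.0
    print(cost_effectiveness([100, 300, 200, 400], [3, 0, 2, 0], scores))  # ~0.39

Note how the criteria can diverge: on the toy data a 0.5 cutoff classifies every class correctly, yet only the CE value (about 0.39 here) reflects how much inspection effort the ranking saves over random selection, which is the point the abstract makes about confusion-matrix criteria being poorly aligned with cost-effectiveness.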



Publisher

Elsevier Science Inc., United States


Author Tags

1. Cost-effectiveness
2. Fault prediction models
3. Verification
