
Method-level Bug Prediction: Problems and Promises

Published: 18 April 2024

Abstract

Fixing software bugs can be colossally expensive, especially when they are discovered in the later phases of the software development life cycle. As such, bug prediction has been a classic problem for the research community. As of this writing, a Google Scholar search for the phrase “bug prediction” returns ∼113,000 hits. Despite this staggering effort by the research community, bug prediction research is criticized for not being decisively adopted in practice. A significant problem of the existing research is the granularity level (i.e., class/file level) at which bug prediction is historically studied. Practitioners find it difficult and time-consuming to locate bugs at the class/file level granularity. Consequently, method-level bug prediction has become popular in the past decade. We ask, are these method-level bug prediction models ready for industry use? Unfortunately, the answer is no. The reported high accuracies of these models dwindle significantly if we evaluate them in realistic, time-sensitive contexts. This may seem hopeless at first, but, encouragingly, we show that future method-level bug prediction can be improved significantly. In general, we show how to reliably evaluate future method-level bug prediction models and how to improve them by focusing on four improvement avenues: building noise-free bug data, addressing concept drift, selecting similar training projects, and developing a mixture of models. Our findings are based on three publicly available method-level bug datasets and a newly built bug dataset of 774,051 Java methods originating from 49 open-source software projects.
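A key methodological point in the abstract is that method-level bug prediction models look far stronger under conventional random cross-validation than under realistic, time-sensitive evaluation, where a model trained on past method versions must predict bugs in future ones. The sketch below is not the authors' code; it uses a synthetic dataset with illustrative metric names (loc, cyclomatic, churn, observed) and a generic random forest purely to contrast the two evaluation setups.

```python
# Hypothetical sketch (not the paper's implementation): contrast random
# k-fold cross-validation with a time-sensitive train/test split for
# method-level bug prediction. All column names and data are illustrative.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-in for a method-level dataset: a few code metrics,
# a timestamp for when the method version was observed, and a bug label.
methods = pd.DataFrame({
    "loc": rng.integers(3, 200, n),
    "cyclomatic": rng.integers(1, 25, n),
    "churn": rng.integers(0, 50, n),
    "observed": pd.to_datetime("2015-01-01")
                + pd.to_timedelta(rng.integers(0, 6 * 365, n), unit="D"),
})
methods["is_buggy"] = (rng.random(n) < 0.1 + 0.002 * methods["cyclomatic"]).astype(int)

features = ["loc", "cyclomatic", "churn"]
model = RandomForestClassifier(n_estimators=100, random_state=0)

# 1) Conventional 10-fold cross-validation: folds mix past and future
#    methods, so future information leaks into training.
cv_auc = cross_val_score(model, methods[features], methods["is_buggy"],
                         cv=10, scoring="roc_auc").mean()

# 2) Time-sensitive evaluation: train only on methods observed before a
#    cutoff date and test on methods observed after it.
cutoff = pd.Timestamp("2019-01-01")
train = methods[methods["observed"] < cutoff]
test = methods[methods["observed"] >= cutoff]
model.fit(train[features], train["is_buggy"])
time_auc = roc_auc_score(test["is_buggy"],
                         model.predict_proba(test[features])[:, 1])

print(f"Random 10-fold AUC:      {cv_auc:.3f}")
print(f"Time-sensitive test AUC: {time_auc:.3f}")
```

On real bug data, the paper reports that scores obtained in the second, time-sensitive setting are substantially lower than those obtained with random cross-validation, which motivates its four improvement avenues.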


Cited By

  • (2024) Versioned Analysis of Software Quality Indicators and Self-admitted Technical Debt in Ethereum Smart Contracts with Ethstractor. 2024 IEEE International Conference on Blockchain (Blockchain), 512–519. DOI: 10.1109/Blockchain62396.2024.00075. Online publication date: 19-Aug-2024.
  • (2024) The Effectiveness of Hidden Dependence Metrics in Bug Prediction. IEEE Access, 12, 77214–77225. DOI: 10.1109/ACCESS.2024.3406929. Online publication date: 2024.
  • (2024) An empirical study on bug severity estimation using source code metrics and static analysis. Journal of Systems and Software, 112179. DOI: 10.1016/j.jss.2024.112179. Online publication date: Aug-2024.

Information

Published In

ACM Transactions on Software Engineering and Methodology, Volume 33, Issue 4
May 2024
940 pages
EISSN: 1557-7392
DOI: 10.1145/3613665
Editor: Mauro Pezzè

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 April 2024
Online AM: 13 January 2024
Accepted: 19 December 2023
Revised: 25 September 2023
Received: 18 November 2022
Published in TOSEM Volume 33, Issue 4


Author Tags

  1. Method-level bug prediction
  2. code metrics
  3. maintenance
  4. McCabe
  5. code complexity

Qualifiers

  • Research-article

Funding Sources

  • NSERC Alliance
  • Alberta Innovates CASBE Program
  • Eyes High Postdoctoral Match-Funding Program


Article Metrics

  • Downloads (last 12 months): 434
  • Downloads (last 6 weeks): 42
Reflects downloads up to 06 Oct 2024

