
Method-level Bug Prediction: Problems and Promises

Published: 18 April 2024

Abstract

Fixing software bugs can be colossally expensive, especially when they are discovered in the later phases of the software development life cycle. As such, bug prediction has been a classic problem for the research community. As of this writing, a Google Scholar search for the phrase “bug prediction” returns ∼113,000 hits. Despite this staggering effort by the research community, bug prediction research is criticized for not being decisively adopted in practice. A significant problem of the existing research is the granularity level (i.e., class/file level) at which bug prediction is historically studied. Practitioners find it difficult and time-consuming to locate bugs at the class/file level granularity. Consequently, method-level bug prediction has become popular in the past decade. We ask, are these method-level bug prediction models ready for industry use? Unfortunately, the answer is no. The reported high accuracies of these models dwindle significantly if we evaluate them in realistic, time-sensitive contexts. This may seem hopeless at first, but, encouragingly, we show that future method-level bug prediction can be improved significantly. In general, we show how to reliably evaluate future method-level bug prediction models and how to improve them by focusing on four improvement avenues: building noise-free bug data, addressing concept drift, selecting similar training projects, and developing a mixture of models. Our findings are based on three publicly available method-level bug datasets and a newly built bug dataset of 774,051 Java methods originating from 49 open-source software projects.
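A key methodological point in the abstract is that method-level bug prediction models look far stronger under conventional random cross-validation than under realistic, time-sensitive evaluation, where a model trained on past method versions must predict bugs in future ones. The sketch below is not the authors' code; it uses a synthetic dataset with illustrative metric names (loc, cyclomatic, churn, observed) and a generic random forest purely to contrast the two evaluation setups.

```python
# Hypothetical sketch (not the paper's implementation): contrast random
# k-fold cross-validation with a time-sensitive train/test split for
# method-level bug prediction. All column names and data are illustrative.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-in for a method-level dataset: a few code metrics,
# a timestamp for when the method version was observed, and a bug label.
methods = pd.DataFrame({
    "loc": rng.integers(3, 200, n),
    "cyclomatic": rng.integers(1, 25, n),
    "churn": rng.integers(0, 50, n),
    "observed": pd.to_datetime("2015-01-01")
                + pd.to_timedelta(rng.integers(0, 6 * 365, n), unit="D"),
})
methods["is_buggy"] = (rng.random(n) < 0.1 + 0.002 * methods["cyclomatic"]).astype(int)

features = ["loc", "cyclomatic", "churn"]
model = RandomForestClassifier(n_estimators=100, random_state=0)

# 1) Conventional 10-fold cross-validation: folds mix past and future
#    methods, so future information leaks into training.
cv_auc = cross_val_score(model, methods[features], methods["is_buggy"],
                         cv=10, scoring="roc_auc").mean()

# 2) Time-sensitive evaluation: train only on methods observed before a
#    cutoff date and test on methods observed after it.
cutoff = pd.Timestamp("2019-01-01")
train = methods[methods["observed"] < cutoff]
test = methods[methods["observed"] >= cutoff]
model.fit(train[features], train["is_buggy"])
time_auc = roc_auc_score(test["is_buggy"],
                         model.predict_proba(test[features])[:, 1])

print(f"Random 10-fold AUC:      {cv_auc:.3f}")
print(f"Time-sensitive test AUC: {time_auc:.3f}")
```

On real bug data, the paper reports that scores obtained in the second, time-sensitive setting are substantially lower than those obtained with random cross-validation, which motivates its four improvement avenues.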


Cited By

  • (2024) Versioned Analysis of Software Quality Indicators and Self-admitted Technical Debt in Ethereum Smart Contracts with Ethstractor. 2024 IEEE International Conference on Blockchain (Blockchain), 512–519. DOI: 10.1109/Blockchain62396.2024.00075. Online publication date: 19-Aug-2024.
  • (2024) The Effectiveness of Hidden Dependence Metrics in Bug Prediction. IEEE Access, 12, 77214–77225. DOI: 10.1109/ACCESS.2024.3406929. Online publication date: 2024.
  • (2024) An empirical study on bug severity estimation using source code metrics and static analysis. Journal of Systems and Software, 112179. DOI: 10.1016/j.jss.2024.112179. Online publication date: Aug-2024.

Information

Published In

ACM Transactions on Software Engineering and Methodology, Volume 33, Issue 4
May 2024
940 pages
EISSN: 1557-7392
DOI: 10.1145/3613665
Editor: Mauro Pezzè

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 April 2024
Online AM: 13 January 2024
Accepted: 19 December 2023
Revised: 25 September 2023
Received: 18 November 2022
Published in TOSEM Volume 33, Issue 4


Author Tags

  1. Method-level bug prediction
  2. code metrics
  3. maintenance
  4. McCabe
  5. code complexity

Qualifiers

  • Research-article

Funding Sources

  • NSERC Alliance
  • Alberta Innovates CASBE Program
  • Eyes High Postdoctoral Match-Funding Program


Article Metrics

  • Downloads (last 12 months): 434
  • Downloads (last 6 weeks): 42
Reflects downloads up to 06 Oct 2024

