research-article

Cost-effective build outcome prediction using cascaded classifiers

Authors: Ansong Ni, Ming Li

MSR '17: Proceedings of the 14th International Conference on Mining Software Repositories

Pages 455 - 458

https://doi.org/10.1109/MSR.2017.26

Published: 20 May 2017

Abstract

Software developers use continuous integration to find defects early and reduce risk. However, the process can be resource- and time-consuming, which lowers development efficiency. In this work, we adopt cascaded classifiers to predict build outcomes and study which kinds of attributes are potentially useful for this prediction. We emphasize the "failed" instances, which incur more cost. Our experiments show that our approach outperforms other commonly used classifiers: it saves 51.7% of the waiting time and server workload while identifying 85.2% of the defective builds.
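The abstract does not describe the cascade's stages or features, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a rejection-style cascade in which each stage may clear a build as "passed" and only builds flagged by every stage are predicted "failed" and actually run. The build attributes (`files_changed`, `previous_build_failed`), the thresholds, and the toy data are all hypothetical. The sketch also shows how the two figures reported above could be computed: the share of builds skipped and the share of failed builds identified.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# Hypothetical build record: (files_changed, previous_build_failed).
# These attributes are illustrative assumptions, not taken from the paper.
Build = Tuple[int, bool]


@dataclass
class Stage:
    """One cascade stage; may_fail returns True if the build still looks risky."""
    name: str
    may_fail: Callable[[Build], bool]


def cascade_predict(stages: Sequence[Stage], build: Build) -> str:
    """Predict 'passed' as soon as any stage clears the build, else 'failed'."""
    for stage in stages:
        if not stage.may_fail(build):
            return "passed"
    return "failed"


def evaluate(stages: Sequence[Stage], builds: Sequence[Build],
             outcomes: Sequence[str]) -> Tuple[float, float]:
    """Return (fraction of builds skipped, fraction of failed builds caught)."""
    preds = [cascade_predict(stages, b) for b in builds]
    skipped = sum(p == "passed" for p in preds)
    failed_total = sum(o == "failed" for o in outcomes)
    caught = sum(p == "failed" and o == "failed" for p, o in zip(preds, outcomes))
    recall = caught / failed_total if failed_total else 0.0
    return skipped / len(builds), recall


# Two hand-written stages with assumed thresholds; a real system would train them.
STAGES = [
    Stage("cheap filter", lambda b: b[1] or b[0] >= 2),
    Stage("stricter filter", lambda b: b[1] or b[0] >= 10),
]

# Toy data, invented for illustration only.
BUILDS = [(1, False), (2, False), (40, False), (3, True),
          (25, True), (1, False), (12, False), (1, False)]
OUTCOMES = ["passed", "passed", "failed", "failed",
            "failed", "passed", "passed", "failed"]

saved, recall = evaluate(STAGES, BUILDS, OUTCOMES)
print(f"builds skipped: {saved:.1%}, failed builds caught: {recall:.1%}")
```

In this arrangement, skipping a build predicted "passed" is what saves waiting time and server load, while a missed "failed" build is the costly error, which is why the stages are biased toward keeping risky builds in the cascade.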

Cost-effective build outcome prediction using cascaded classifiers
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning

Published In

MSR '17: Proceedings of the 14th International Conference on Mining Software Repositories

May 2017

567 pages

ISBN: 9781538615447

General Chair: Jesus M. Gonzalez-Barahona (Universidad Rey Juan Carlos)

Program Chairs: Abram Hindle (University of Alberta), Lin Tan (University of Waterloo)

Publisher

IEEE Press

Qualifiers

Research-article

Conference

ICSE '17

Sponsor:

SIGSOFT
IEEE-CS
SADIO

ICSE '17: 39th International Conference on Software Engineering

May 20 - 28, 2017

Buenos Aires, Argentina

Bibliometrics

Total Citations: 16
Total Downloads: 142
Downloads (Last 12 months): 3
Downloads (Last 6 weeks): 0

Reflects downloads up to 01 Nov 2024

Cited By

Hong Y, Tantithamthavorn C, Pasuksmit J, Thongtanunam P, Friedman A, Zhao X, Krasikov A, d'Amorim M (2024). Practitioners’ Challenges and Perceptions of CI Build Failure Predictions at Atlassian. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, pp. 370-381. Online publication date: 10-Jul-2024.
https://dl.acm.org/doi/10.1145/3663529.3663856

Wang G, Sun Z, Chen Y, Zhao Y, Liang Q, Hao D, Christakis M, Pradel M (2024). Commit Artifact Preserving Build Prediction. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 1236-1248. Online publication date: 11-Sep-2024.
https://dl.acm.org/doi/10.1145/3650212.3680356

Kola-Olawuyi A, Weeraddana N, Nagappan M, Spinellis D, Constantinou E, Bacchelli A (2024). The Impact of Code Ownership of DevOps Artefacts on the Outcome of DevOps CI Builds. Proceedings of the 21st International Conference on Mining Software Repositories, pp. 543-555. Online publication date: 15-Apr-2024.
https://dl.acm.org/doi/10.1145/3643991.3644924

Sun G, Habchi S, McIntosh S (2024). RavenBuild: Context, Relevance, and Dependency Aware Build Outcome Prediction. Proceedings of the ACM on Software Engineering 1:FSE, pp. 996-1018. Online publication date: 12-Jul-2024.
https://dl.acm.org/doi/10.1145/3643771

Fallahzadeh E, Bavand A, Rigby P, Chandra S, Blincoe K, Tonella P (2023). Accelerating Continuous Integration with Parallel Batch Testing. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 55-67. Online publication date: 30-Nov-2023.
https://dl.acm.org/doi/10.1145/3611643.3616255

Jin X, Servant F (2023). HybridCISave: A Combined Build and Test Selection Approach in Continuous Integration. ACM Transactions on Software Engineering and Methodology 32:4, pp. 1-39. Online publication date: 26-May-2023.
https://dl.acm.org/doi/10.1145/3576038

Liu B, Zhang H, Ma W, Li G, Li S, Shen H (2023). The Why, When, What, and How About Predictive Continuous Integration: A Simulation-Based Investigation. IEEE Transactions on Software Engineering 49:12, pp. 5223-5249. Online publication date: 1-Dec-2023.
https://dl.acm.org/doi/10.1109/TSE.2023.3330510

Zhang C, Chen B, Hu J, Peng X, Zhao W (2022). BuildSonic: Detecting and Repairing Performance-Related Configuration Smells for Continuous Integration Builds. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, pp. 1-13. Online publication date: 10-Oct-2022.
https://dl.acm.org/doi/10.1145/3551349.3556923

Jin X, Spinellis D, Gousios G, Chechik M, Di Penta M (2021). Reducing cost in continuous integration with a collection of build selection approaches. Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 1650-1654. Online publication date: 20-Aug-2021.
https://dl.acm.org/doi/10.1145/3468264.3473103

Liu B, Zhang H, Yang L, Dong L, Shen H, Song K, Li J, Jaccheri L, Dingsøyr T, Chitchyan R (2020). An Experimental Evaluation of Imbalanced Learning and Time-Series Validation in the Context of CI/CD Prediction. Proceedings of the 24th International Conference on Evaluation and Assessment in Software Engineering, pp. 21-30. Online publication date: 15-Apr-2020.
https://dl.acm.org/doi/10.1145/3383219.3383222
