DOI: 10.1109/ICSE-SEIP.2019.00019

Assessing transition-based test selection algorithms at Google

Published: 27 May 2019

Abstract

Continuous integration traditionally relies on testing every code commit with all impacted tests. This practice requires considerable computational resources which, at Google's scale, result in delayed test results and high operational costs. To address this issue and provide fast feedback, test selection and prioritization methods aim to execute, as early as possible, the tests that are most likely to reveal a change in test results. In this paper we present a simulation framework that supports the study and evaluation of such techniques on real data. We propose an evaluation method for test selection algorithms and detail several practical requirements that are often ignored by related work, such as the detection of transitions, the collection and analysis of data, and the handling of flaky tests. Based on this framework, we design an experiment evaluating five candidate regression test selection algorithms, based on simple heuristics and inspired by previous research; the evaluation technique itself is applicable to any number of algorithms in future experiments. Our results show that algorithms based on recent (transition) execution history do not perform as well as expected given previously reported results, and that the test selection problem remains largely open. We find that the best performing algorithms are based on the number of times a test has been triggered and on the number of distinct authors committing code that triggers a particular test. More research is needed to close the gap between current approaches and the optimal solution.
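The paper itself does not include source code, but the heuristics the abstract describes are straightforward to sketch. The Python snippet below is a minimal, hypothetical illustration (the `Execution` record, all names, and the fixed `flaky_threshold` are our own assumptions, not the paper's API): it counts pass/fail transitions per test over an execution history, drops tests whose outcome flips implausibly often as flaky, and ranks the rest by trigger count and distinct-author count, the two signals the study found to perform best.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Execution:
    """One historical test execution (hypothetical record layout)."""
    test: str      # test target name
    author: str    # author of the commit that triggered this execution
    passed: bool   # outcome of this execution

def transition_counts(history):
    """Count pass<->fail transitions per test, in execution order."""
    last_outcome = {}
    transitions = defaultdict(int)
    for ex in history:
        prev = last_outcome.get(ex.test)
        if prev is not None and prev != ex.passed:
            transitions[ex.test] += 1
        last_outcome[ex.test] = ex.passed
    return transitions

def rank_tests(history, flaky_threshold=10):
    """Rank tests for selection, most promising first.

    Tests whose outcome flips implausibly often are treated as flaky
    and excluded; the remainder are ordered by trigger count and by
    the number of distinct authors whose commits triggered them.
    """
    transitions = transition_counts(history)
    triggers = defaultdict(int)
    authors = defaultdict(set)
    for ex in history:
        triggers[ex.test] += 1
        authors[ex.test].add(ex.author)
    candidates = [t for t in triggers if transitions[t] < flaky_threshold]
    return sorted(candidates,
                  key=lambda t: (triggers[t], len(authors[t])),
                  reverse=True)

if __name__ == "__main__":
    history = [
        Execution("//base:io_test", "alice", True),
        Execution("//base:io_test", "bob", False),   # one transition
        Execution("//net:rpc_test", "alice", True),
        Execution("//net:rpc_test", "carol", True),
        Execution("//net:rpc_test", "dan", True),
    ]
    print(rank_tests(history))  # ['//net:rpc_test', '//base:io_test']
```

The constant flaky-test cutoff is a crude stand-in: the paper treats flaky-test handling as a practical requirement in its own right, not as a fixed threshold.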




Published In

ICSE-SEIP '19: Proceedings of the 41st International Conference on Software Engineering: Software Engineering in Practice
May 2019
339 pages

Publisher

IEEE Press


Author Tags

  1. continuous integration
  2. regression testing

Qualifiers

  • Research-article

Conference

ICSE '19
Cited By

  • (2024) "The Importance of Accounting for Execution Failures when Predicting Test Flakiness." Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 1979-1989. DOI: 10.1145/3691620.3695261. Online publication date: 27-Oct-2024.
  • (2023) "State of Practical Applicability of Regression Testing Research: A Live Systematic Literature Review." ACM Computing Surveys, 55(13s), pp. 1-36. DOI: 10.1145/3579851. Online publication date: 13-Jul-2023.
  • (2022) "Challenges in regression test selection for end-to-end testing of microservice-based software systems." Proceedings of the 3rd ACM/IEEE International Conference on Automation of Software Test, pp. 1-5. DOI: 10.1145/3524481.3527217. Online publication date: 17-May-2022.
  • (2022) "FlakiMe." Proceedings of the 44th International Conference on Software Engineering, pp. 982-994. DOI: 10.1145/3510003.3510194. Online publication date: 21-May-2022.
  • (2021) "A Survey of Flaky Tests." ACM Transactions on Software Engineering and Methodology, 31(1), pp. 1-74. DOI: 10.1145/3476105. Online publication date: 26-Oct-2021.
  • (2021) "Data-driven test selection at scale." Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 1225-1235. DOI: 10.1145/3468264.3473916. Online publication date: 20-Aug-2021.
  • (2021) "Continuous test suite failure prediction." Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 553-565. DOI: 10.1145/3460319.3464840. Online publication date: 11-Jul-2021.
  • (2021) "Empirically evaluating readily available information for regression test optimization in continuous integration." Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 491-504. DOI: 10.1145/3460319.3464834. Online publication date: 11-Jul-2021.
  • (2021) "MuDelta." Proceedings of the 43rd International Conference on Software Engineering, pp. 897-909. DOI: 10.1109/ICSE43902.2021.00086. Online publication date: 22-May-2021.
  • (2021) "What helped, and what did not?" Proceedings of the 43rd International Conference on Software Engineering, pp. 213-225. DOI: 10.1109/ICSE43902.2021.00031. Online publication date: 22-May-2021.
