DOI: 10.1109/ICSE-SEIP.2019.00019

Assessing transition-based test selection algorithms at Google

Published: 27 May 2019

Abstract

Continuous integration traditionally relies on testing every code commit with all impacted tests. This practice requires considerable computational resources which, at Google's scale, result in delayed test results and high operational costs. To address this issue and provide fast feedback, test selection and prioritization methods aim to execute, as early as possible, the tests that are most likely to reveal a change in test results. In this paper we present a simulation framework that supports the study and evaluation of such techniques on real data. We propose an evaluation method for test selection algorithms and detail several practical requirements that are often ignored by related work, such as the detection of transitions, the collection and analysis of data, and the handling of flaky tests. Based on this framework, we design an experiment evaluating five candidate regression test selection algorithms, based on simple heuristics and inspired by previous research; the evaluation technique itself is applicable to any number of algorithms in future experiments. Our results show that algorithms based on recent (transition) execution history do not perform as well as expected given previously reported results, and that the test selection problem remains largely open. We find that the best performing algorithms are based on the number of times a test has been triggered and on the number of distinct authors committing code that triggers a particular test. More research is needed to close the gap between current approaches and the optimal solution.
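The paper itself does not include source code, but the heuristics the abstract describes are straightforward to sketch. The Python snippet below is a minimal, hypothetical illustration (the `Execution` record, all names, and the fixed `flaky_threshold` are our own assumptions, not the paper's API): it counts pass/fail transitions per test over an execution history, drops tests whose outcome flips implausibly often as flaky, and ranks the rest by trigger count and distinct-author count, the two signals the study found to perform best.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Execution:
    """One historical test execution (hypothetical record layout)."""
    test: str      # test target name
    author: str    # author of the commit that triggered this execution
    passed: bool   # outcome of this execution

def transition_counts(history):
    """Count pass<->fail transitions per test, in execution order."""
    last_outcome = {}
    transitions = defaultdict(int)
    for ex in history:
        prev = last_outcome.get(ex.test)
        if prev is not None and prev != ex.passed:
            transitions[ex.test] += 1
        last_outcome[ex.test] = ex.passed
    return transitions

def rank_tests(history, flaky_threshold=10):
    """Rank tests for selection, most promising first.

    Tests whose outcome flips implausibly often are treated as flaky
    and excluded; the remainder are ordered by trigger count and by
    the number of distinct authors whose commits triggered them.
    """
    transitions = transition_counts(history)
    triggers = defaultdict(int)
    authors = defaultdict(set)
    for ex in history:
        triggers[ex.test] += 1
        authors[ex.test].add(ex.author)
    candidates = [t for t in triggers if transitions[t] < flaky_threshold]
    return sorted(candidates,
                  key=lambda t: (triggers[t], len(authors[t])),
                  reverse=True)

if __name__ == "__main__":
    history = [
        Execution("//base:io_test", "alice", True),
        Execution("//base:io_test", "bob", False),   # one transition
        Execution("//net:rpc_test", "alice", True),
        Execution("//net:rpc_test", "carol", True),
        Execution("//net:rpc_test", "dan", True),
    ]
    print(rank_tests(history))  # ['//net:rpc_test', '//base:io_test']
```

The constant flaky-test cutoff is a crude stand-in: the paper treats flaky-test handling as a practical requirement in its own right, not as a fixed threshold.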




Published In

ICSE-SEIP '19: Proceedings of the 41st International Conference on Software Engineering: Software Engineering in Practice
May 2019
339 pages

Publisher

IEEE Press


Author Tags

  1. continuous integration
  2. regression testing

Qualifiers

  • Research-article

Conference

ICSE '19
Cited By

  • (2024) "The Importance of Accounting for Execution Failures when Predicting Test Flakiness." Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 1979-1989. DOI: 10.1145/3691620.3695261. Online publication date: 27-Oct-2024.
  • (2023) "State of Practical Applicability of Regression Testing Research: A Live Systematic Literature Review." ACM Computing Surveys, 55(13s), pp. 1-36. DOI: 10.1145/3579851. Online publication date: 13-Jul-2023.
  • (2022) "Challenges in regression test selection for end-to-end testing of microservice-based software systems." Proceedings of the 3rd ACM/IEEE International Conference on Automation of Software Test, pp. 1-5. DOI: 10.1145/3524481.3527217. Online publication date: 17-May-2022.
  • (2022) "FlakiMe." Proceedings of the 44th International Conference on Software Engineering, pp. 982-994. DOI: 10.1145/3510003.3510194. Online publication date: 21-May-2022.
  • (2021) "A Survey of Flaky Tests." ACM Transactions on Software Engineering and Methodology, 31(1), pp. 1-74. DOI: 10.1145/3476105. Online publication date: 26-Oct-2021.
  • (2021) "Data-driven test selection at scale." Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 1225-1235. DOI: 10.1145/3468264.3473916. Online publication date: 20-Aug-2021.
  • (2021) "Continuous test suite failure prediction." Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 553-565. DOI: 10.1145/3460319.3464840. Online publication date: 11-Jul-2021.
  • (2021) "Empirically evaluating readily available information for regression test optimization in continuous integration." Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 491-504. DOI: 10.1145/3460319.3464834. Online publication date: 11-Jul-2021.
  • (2021) "MuDelta." Proceedings of the 43rd International Conference on Software Engineering, pp. 897-909. DOI: 10.1109/ICSE43902.2021.00086. Online publication date: 22-May-2021.
  • (2021) "What helped, and what did not?" Proceedings of the 43rd International Conference on Software Engineering, pp. 213-225. DOI: 10.1109/ICSE43902.2021.00031. Online publication date: 22-May-2021.
