
Revisiting Test Impact Analysis in Continuous Testing From the Perspective of Code Dependencies

Published: 01 June 2022

Abstract

In continuous testing, developers execute automated test cases once or even several times per day to ensure the quality of the integrated code. Although continuous testing helps ensure the quality of the code and reduces maintenance effort, it also significantly increases test execution overhead. In this paper, we empirically evaluate the effectiveness of test impact analysis from the perspective of code dependencies in the continuous testing setting. We first applied test impact analysis to one year of software development history in 11 large-scale open-source systems. We found that even though the number of changed files in daily commits is small (the per-system median ranges from 3 to 28 files), around 50 percent or more of the test cases are still impacted and need to be executed. Motivated by this finding, we further studied the code dependencies between source code files and test cases, and among test cases. We found that 1) test cases often focus on testing the integrated behaviour of the systems, and 15 percent of the test cases have dependencies with more than 20 source code files; 2) 18 percent of the test cases have dependencies with other test cases, and test case inheritance is the most common cause of test case dependencies; and 3) our manual study uncovered four dependency-related test smells, which we document. Our study is a first step towards understanding the effectiveness of test impact analysis in the continuous testing setting and provides insights for improving test design and execution.
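To make the selection step concrete, below is a minimal sketch of dependency-based test impact analysis, assuming that the dependencies between test classes and source files have already been extracted (for example statically, with a parser such as JavaParser). The class, file, and method names are hypothetical; this illustrates the general technique, not the authors' implementation.

    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;
    import java.util.stream.Collectors;

    // Illustrative dependency-based test impact analysis (hypothetical names):
    // a test class is selected for execution when any file it depends on
    // appears in the commit's change set.
    public class TestImpactAnalysisSketch {

        // Maps each test class to the source files it depends on.
        private final Map<String, Set<String>> testDependencies = new HashMap<>();

        public void addDependency(String testClass, String sourceFile) {
            testDependencies.computeIfAbsent(testClass, k -> new HashSet<>()).add(sourceFile);
        }

        // Returns the test classes impacted by a set of changed files.
        public Set<String> impactedTests(Set<String> changedFiles) {
            return testDependencies.entrySet().stream()
                    .filter(e -> e.getValue().stream().anyMatch(changedFiles::contains))
                    .map(Map.Entry::getKey)
                    .collect(Collectors.toSet());
        }

        public static void main(String[] args) {
            TestImpactAnalysisSketch tia = new TestImpactAnalysisSketch();

            // Dependency edges as they might be extracted by static analysis of
            // imports and references in the test code.
            tia.addDependency("OrderServiceTest", "OrderService.java");
            tia.addDependency("OrderServiceTest", "Order.java");
            tia.addDependency("PaymentIntegrationTest", "PaymentGateway.java");
            tia.addDependency("PaymentIntegrationTest", "Order.java");

            // A commit that changes a single, widely used source file still
            // impacts both test classes.
            Set<String> changedFiles = Set.of("Order.java");
            System.out.println(tia.impactedTests(changedFiles));
            // Prints both test classes (order may vary).
        }
    }

The sketch also shows why small commits can still select many tests: a single widely depended-on source file (here, the hypothetical Order.java) pulls in every test class whose dependency set contains it.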

Cited By

  • How disabled tests manifest in test maintainability challenges? In Proc. 29th ACM Joint Meeting Eur. Softw. Eng. Conf. Symp. Foundations Softw. Eng., 2021, pp. 1045–1055. DOI: 10.1145/3468264.3468609. Online publication date: 20 Aug. 2021.


Published In

IEEE Transactions on Software Engineering, Volume 48, Issue 6
June 2022
355 pages

Publisher

IEEE Press

Qualifiers

  • Research-article
