DOI: 10.1145/3611643.3613089
research-article
Open access

Keeping Mutation Test Suites Consistent and Relevant with Long-Standing Mutants

Published: 30 November 2023

Abstract

Mutation testing has been demonstrated to be one of the most powerful fault-revealing tools in the tester's tool kit. Much previous work implicitly assumed it to be sufficient to re-compute mutant suites per release. Sadly, this makes mutation results inconsistent; mutant scores from each release cannot be directly compared, making it harder to measure test improvement. Furthermore, regular code change means that a mutant suite's relevance will naturally degrade over time. We measure this degradation in relevance for 143,500 mutants in 4 non-trivial systems, finding that 52% degrade, on average. We introduce a mutant brittleness measure and use it to audit software systems and their mutation suites. We also demonstrate how consistent-by-construction long-standing mutant suites can be identified with a 10x improvement in mutant relevance over an arbitrary test suite. Our results indicate that the research community should avoid the re-computation of mutant suites and focus, instead, on long-standing mutants, thereby improving the consistency and relevance of mutation testing.
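
The consistency problem described in the abstract can be illustrated with a small sketch. It is not the paper's method or its brittleness measure: the mutant IDs, the kill sets, and the simplification that a "long-standing" mutant is one present in both releases are all assumptions made for illustration. The sketch only shows why scores computed over re-computed mutant suites are not directly comparable, while scores over a fixed, shared set of mutants are.

```python
def mutation_score(killed, mutants):
    """Fraction of the given mutants that the test suite kills."""
    mutants = set(mutants)
    if not mutants:
        raise ValueError("mutant set is empty")
    return len(mutants & set(killed)) / len(mutants)


# Hypothetical mutant suites, re-computed independently for each release.
release_1_mutants = {"m1", "m2", "m3", "m4"}
release_2_mutants = {"m3", "m4", "m5", "m6", "m7", "m8"}

# Hypothetical sets of mutants killed by each release's test suite.
killed_1 = {"m1", "m2", "m3"}
killed_2 = {"m3", "m4", "m5"}

# Assumption: treat mutants that exist in both releases as "long-standing".
long_standing = release_1_mutants & release_2_mutants

# Scores over re-computed suites: the denominators differ per release.
recomputed_scores = (
    mutation_score(killed_1, release_1_mutants),
    mutation_score(killed_2, release_2_mutants),
)

# Scores over the shared mutants: same denominator, so they are comparable.
consistent_scores = (
    mutation_score(killed_1, long_standing),
    mutation_score(killed_2, long_standing),
)

print("re-computed suites:", recomputed_scores)
print("long-standing mutants:", consistent_scores)
```

In this toy data the re-computed score falls between releases even though, on the mutants common to both, the tests kill more in the second release than in the first. The apparent regression comes from the change in the mutant set rather than from the tests.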


Published In

ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2023
2215 pages
ISBN:9798400703270
DOI:10.1145/3611643
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Continuous Integration
  2. Evolving Systems
  3. Mutation Testing
  4. Software Testing
  5. Test Adequacy

Qualifiers

  • Research-article

Funding Sources

  • Luxembourg National Research Funds (FNR)

Conference

ESEC/FSE '23
Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%
