
Using code reviews to automatically configure static analysis tools

Published: 01 January 2022

Abstract

Developers often use Static Code Analysis Tools (SCATs) to automatically detect different kinds of quality flaws in their source code. Since many warnings raised by SCATs may be irrelevant for a given project or organization, information from the project's development history can be leveraged to automatically configure which warnings a SCAT should raise and which it should not. In this paper, we propose an automated approach (Auto-SCAT) that leverages (statement-level) code review comments to recommend which SCAT warnings, or warning categories, should be enabled. To this aim, we trace code review comments onto SCAT warnings by leveraging their descriptions and messages, as well as review comments made in other projects. We apply Auto-SCAT to study how CheckStyle, a well-known SCAT, can be configured in the context of six Java open source projects, all using Gerrit for handling code reviews. Our results show that Auto-SCAT classifies code review comments into CheckStyle checks with a precision of 61% and a recall of 52%. When also considering code review comments not related to CheckStyle warnings, Auto-SCAT achieves a precision and a recall of ≈75%. Furthermore, Auto-SCAT can configure CheckStyle with a precision of 72.7% at the check level and 96.3% at the category level. Finally, our findings highlight that Auto-SCAT outperforms state-of-the-art baselines based on default CheckStyle configurations or on the history of previously-removed warnings.
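At its core, the tracing step described in the abstract is a text-similarity problem: a review comment is compared against the textual descriptions of the available CheckStyle checks. The Python sketch below illustrates one plausible way such a matching could work, using TF-IDF vectors and cosine similarity. It is only a minimal sketch under stated assumptions, not the paper's actual pipeline; the check descriptions and the sample review comment are illustrative placeholders.

# Minimal sketch (not the authors' implementation): match a review comment
# to the most textually similar CheckStyle check via TF-IDF + cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative (abridged) CheckStyle check descriptions; real descriptions
# would be taken from the CheckStyle documentation.
checks = {
    "MagicNumber": "checks that there are no magic numbers, i.e. numeric "
                   "literals that are not defined as constants",
    "LineLength": "checks for long lines against the maximum allowed line length",
    "MissingJavadocMethod": "checks for missing Javadoc comments on methods",
}

# Hypothetical statement-level review comment.
review_comment = "please extract this literal into a named constant instead of hard-coding 42"

# Build a single TF-IDF space over the check descriptions plus the comment.
vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(list(checks.values()) + [review_comment])

comment_vec = matrix[len(checks)]   # last row: the review comment
check_vecs = matrix[:len(checks)]   # preceding rows: the check descriptions

# Rank checks by cosine similarity with the comment.
scores = cosine_similarity(comment_vec, check_vecs).flatten()
best_check, best_score = max(zip(checks, scores), key=lambda pair: pair[1])
print(f"Most similar check: {best_check} (cosine similarity {best_score:.2f})")

In this toy example the comment would be mapped to the MagicNumber check; a real configuration step would aggregate such matches over many review comments before deciding which checks (or check categories) to enable.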



Published In

Empirical Software Engineering, Volume 27, Issue 1 (January 2022), 985 pages

Publisher

Kluwer Academic Publishers, United States

Publication History

Published: 01 January 2022
Accepted: 28 October 2021

Author Tags

1. Static analysis tools
2. Code reviews
3. Automated tool configuration

Qualifiers

• Research-article

Cited By

• (2023) EvaCRC: Evaluating Code Review Comments. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp 275-287. https://doi.org/10.1145/3611643.3616245 (online: 30-Nov-2023)
• (2023) ROME: Testing Image Captioning Systems via Recursive Object Melting. Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, pp 766-778. https://doi.org/10.1145/3597926.3598094 (online: 12-Jul-2023)
• (2023) ViolationTracker: Building Precise Histories for Static Analysis Violations. Proceedings of the 45th International Conference on Software Engineering, pp 2022-2034. https://doi.org/10.1109/ICSE48619.2023.00171 (online: 14-May-2023)
• (2022) Continuous Integration and Delivery Practices for Cyber-Physical Systems: An Interview-Based Study. ACM Transactions on Software Engineering and Methodology 32(3):1-44. https://doi.org/10.1145/3571854 (online: 19-Nov-2022)
• (2022) Woodpecker. Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings, pp 334-336. https://doi.org/10.1145/3510454.3522681 (online: 21-May-2022)
