DOI: 10.1145/3510457.3513039
research-article

How does code reviewing feedback evolve?: a longitudinal study at Dell EMC

Published: 17 October 2022

Abstract

Code review is an integral part of modern software development, where fellow developers critique the content, premise, and structure of code changes. Organizations like Dell EMC have made considerable investments in code review, yet tracking the characteristics of the feedback that code reviews provide (a primary product of the code reviewing process) remains difficult. To understand community and personal feedback trends, we perform a longitudinal study of 39,249 reviews that contain 248,695 review comments from a proprietary project developed by Dell EMC. To investigate generalizability, we replicate our study on the OpenStack Nova project. Through an analysis guided by topic models, we observe that more context-specific, technical feedback is introduced as the studied projects and communities age and as the reviewers within those communities accrue experience. This suggests that communities are reaping a larger return on investment in code review as they grow accustomed to the practice and as reviewers hone their skills. The code review trends uncovered by our models present opportunities for enterprises to monitor reviewing tendencies and improve knowledge transfer and reviewer skills.
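The topic-model-guided analysis described above can be approximated in a few lines. The sketch below is not the authors' pipeline; it is a minimal illustration, assuming scikit-learn's LDA implementation, a tiny synthetic corpus of review comments, and an arbitrary two-topic model, of how per-comment topic mixtures can be averaged by year to surface feedback trends.

```python
# Minimal sketch (illustrative, not the paper's method): fit LDA to review
# comments and compare average topic prevalence across years.
from collections import defaultdict

from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy review comments tagged with an assumed year of posting.
comments = [
    (2015, "please fix the style and rename this variable"),
    (2015, "typo in the comment, style nit"),
    (2016, "this lock ordering can deadlock under load"),
    (2016, "the cache invalidation here races with the writer thread"),
    (2017, "deadlock risk: acquire the mutex before the cache lock"),
    (2017, "style looks fine, but the retry logic can livelock"),
]

# Bag-of-words document-term matrix over the comment text.
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(text for _, text in comments)

# Two topics chosen arbitrarily for the toy corpus; each row of
# doc_topics is a per-comment topic mixture that sums to ~1.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(dtm)

# Average topic prevalence per year: a simple longitudinal trend signal.
by_year = defaultdict(list)
for (year, _), mix in zip(comments, doc_topics):
    by_year[year].append(mix)

for year in sorted(by_year):
    mixes = by_year[year]
    avg = [sum(m[k] for m in mixes) / len(mixes) for k in range(2)]
    print(year, [round(a, 2) for a in avg])
```

A real study of this kind would operate on hundreds of thousands of comments, tune the topic count, and apply trend tests to the resulting per-period topic shares; the sketch only shows the shape of the computation.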


Cited By

  • (2023) "On potential improvements in the analysis of the evolution of themes in code review comments." In 2023 49th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 340-347. DOI: 10.1109/SEAA60479.2023.00059. Online publication date: 6-Sep-2023.


Published In

ICSE-SEIP '22: Proceedings of the 44th International Conference on Software Engineering: Software Engineering in Practice
May 2022
371 pages
ISBN:9781450392266
DOI:10.1145/3510457

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States





Conference

ICSE '22
