DOI: 10.1145/3510457.3513039
research-article

How does code reviewing feedback evolve?: a longitudinal study at Dell EMC

Published: 17 October 2022

Abstract

Code review is an integral part of modern software development, where fellow developers critique the content, premise, and structure of code changes. Organizations like Dell EMC have made considerable investments in code review, yet tracking the characteristics of the feedback that code reviews provide (a primary product of the code reviewing process) remains difficult. To understand community and personal feedback trends, we perform a longitudinal study of 39,249 reviews that contain 248,695 review comments from a proprietary project developed by Dell EMC. To investigate generalizability, we replicate our study on the OpenStack Nova project. Through an analysis guided by topic models, we observe that more context-specific, technical feedback is introduced as the studied projects and communities age and as the reviewers within those communities accrue experience. This suggests that communities are reaping a larger return on investment in code review as they grow accustomed to the practice and as reviewers hone their skills. The code review trends uncovered by our models present opportunities for enterprises to monitor reviewing tendencies and improve knowledge transfer and reviewer skills.
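The topic-model-guided analysis described above can be approximated in a few lines. The sketch below is not the authors' pipeline; it is a minimal illustration, assuming scikit-learn's LDA implementation, a tiny synthetic corpus of review comments, and an arbitrary two-topic model, of how per-comment topic mixtures can be averaged by year to surface feedback trends.

```python
# Minimal sketch (illustrative, not the paper's method): fit LDA to review
# comments and compare average topic prevalence across years.
from collections import defaultdict

from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy review comments tagged with an assumed year of posting.
comments = [
    (2015, "please fix the style and rename this variable"),
    (2015, "typo in the comment, style nit"),
    (2016, "this lock ordering can deadlock under load"),
    (2016, "the cache invalidation here races with the writer thread"),
    (2017, "deadlock risk: acquire the mutex before the cache lock"),
    (2017, "style looks fine, but the retry logic can livelock"),
]

# Bag-of-words document-term matrix over the comment text.
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(text for _, text in comments)

# Two topics chosen arbitrarily for the toy corpus; each row of
# doc_topics is a per-comment topic mixture that sums to ~1.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(dtm)

# Average topic prevalence per year: a simple longitudinal trend signal.
by_year = defaultdict(list)
for (year, _), mix in zip(comments, doc_topics):
    by_year[year].append(mix)

for year in sorted(by_year):
    mixes = by_year[year]
    avg = [sum(m[k] for m in mixes) / len(mixes) for k in range(2)]
    print(year, [round(a, 2) for a in avg])
```

A real study of this kind would operate on hundreds of thousands of comments, tune the topic count, and apply trend tests to the resulting per-period topic shares; the sketch only shows the shape of the computation.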


Cited By

  • (2023) "On potential improvements in the analysis of the evolution of themes in code review comments." In 2023 49th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 340-347. DOI: 10.1109/SEAA60479.2023.00059. Online publication date: 6-Sep-2023.


Published In

ICSE-SEIP '22: Proceedings of the 44th International Conference on Software Engineering: Software Engineering in Practice
May 2022
371 pages
ISBN:9781450392266
DOI:10.1145/3510457

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States





Conference

ICSE '22
