
Predicting pull request completion time: a case study on large scale cloud services

Published: 12 August 2019

Abstract

Effort estimation models have long been studied in software engineering research. They help organizations and individuals plan and track the progress of software projects and individual tasks, and thereby plan delivery milestones better. There is a large body of work on effort estimation at the project level, but little at the level of an individual check-in (pull request). In this paper we present a methodology that produces effort estimates for individual developer check-ins, which are displayed to developers to help them track their work items. The cloud development infrastructure pervasive in companies has enabled us to deploy our pull request lifetime prediction system to several thousand developers across multiple software families. We observe from our deployment that the system conservatively saves 44.61% of developer time by accelerating pull requests to completion.
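The abstract does not describe the prediction model itself. As a hedged illustration only, the sketch below shows what a minimal pull-request completion-time estimator could look like: it buckets historical pull requests by change size and predicts the median completion time of the matching bucket. All function names, bucket thresholds, and data are invented for this sketch and are not the authors' actual system.

```python
from statistics import median

# Illustrative sketch (not the paper's model): estimate a pull request's
# completion time from historical PRs, bucketed by lines changed.

def size_bucket(lines_changed: int) -> str:
    # Bucket thresholds are arbitrary assumptions for this example.
    if lines_changed <= 50:
        return "small"
    if lines_changed <= 500:
        return "medium"
    return "large"

def train(history):
    """history: list of (lines_changed, hours_to_complete) tuples."""
    buckets = {}
    for lines, hours in history:
        buckets.setdefault(size_bucket(lines), []).append(hours)
    # One median completion time per size bucket.
    return {bucket: median(hours) for bucket, hours in buckets.items()}

def predict(model, lines_changed: int) -> float:
    bucket = size_bucket(lines_changed)
    # Fall back to the overall median if the bucket was never seen.
    return model.get(bucket, median(model.values()))

# Synthetic history: (lines changed, hours to completion).
history = [(10, 4), (30, 6), (200, 24), (400, 30), (900, 72)]
model = train(history)
print(predict(model, 25))   # small bucket -> 5.0
print(predict(model, 300))  # medium bucket -> 27.0
```

A production system like the one deployed in the paper would use many more signals (reviewer load, file ownership, time of day, etc.) and a learned model rather than bucket medians; this sketch only shows the shape of the problem.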




Published In

ESEC/FSE 2019: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
August 2019, 1264 pages
ISBN: 9781450355728
DOI: 10.1145/3338906

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Case Studies
    2. Effort Estimation
    3. Empirical Studies
    4. Prediction
    5. Software Metrics

    Qualifiers

    • Research-article

    Conference

    ESEC/FSE '19

    Acceptance Rates

    Overall Acceptance Rate 112 of 543 submissions, 21%


Cited By

    • (2024) Comparative Study of Reinforcement Learning in GitHub Pull Request Outcome Predictions. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), pages 489–500. DOI: 10.1109/SANER60148.2024.00057
    • (2024) Software development metrics: to VR or not to VR. Empirical Software Engineering, 29(2). DOI: 10.1007/s10664-023-10435-3
    • (2023) Dynamic Prediction of Delays in Software Projects using Delay Patterns and Bayesian Modeling. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 1012–1023. DOI: 10.1145/3611643.3616328
    • (2023) Nudge: Accelerating Overdue Pull Requests toward Completion. ACM Transactions on Software Engineering and Methodology, 32(2):1–30. DOI: 10.1145/3544791
    • (2023) Understanding the NPM Dependencies Ecosystem of a Project Using Virtual Reality. 2023 IEEE Working Conference on Software Visualization (VISSOFT), pages 84–94. DOI: 10.1109/VISSOFT60811.2023.00019
    • (2023) To Follow or Not to Follow: Understanding Issue/Pull-Request Templates on GitHub. IEEE Transactions on Software Engineering, 49(4):2530–2544. DOI: 10.1109/TSE.2022.3224053
    • (2023) Pull Request Decisions Explained: An Empirical Overview. IEEE Transactions on Software Engineering, 49(2):849–871. DOI: 10.1109/TSE.2022.3165056
    • (2023) Evaluating Learning-to-Rank Models for Prioritizing Code Review Requests using Process Simulation. 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), pages 461–472. DOI: 10.1109/SANER56733.2023.00050
    • (2023) Bug characterization in machine learning-based systems. Empirical Software Engineering, 29(1). DOI: 10.1007/s10664-023-10400-0
    • (2023) More than React: Investigating the Role of Emoji Reaction in GitHub Pull Requests. Empirical Software Engineering, 28(5). DOI: 10.1007/s10664-023-10336-5
