Research Article · Open Access
DOI: 10.1145/3520312.3534864

Productivity assessment of neural code completion

Published: 13 June 2022
Abstract

Neural code synthesis has reached a point where snippet generation is accurate enough to be considered for integration into human software development workflows. Commercial products aim to increase programmers’ productivity, yet they cannot measure it directly. In this case study, we asked users of GitHub Copilot about its impact on their productivity and sought a reflection of their perception in directly measurable user data. We find that the rate at which shown suggestions are accepted, rather than more specific metrics regarding the persistence of completions in the code over time, drives developers’ perception of productivity.
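
The distinction the abstract draws between acceptance rate and persistence metrics can be made concrete with a small sketch. The telemetry schema below, including the 30-second persistence window, is a hypothetical illustration for exposition, not the paper's actual measurement pipeline:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class CompletionEvent:
    """One code-completion suggestion shown to a developer (hypothetical schema)."""
    shown_chars: int                            # length of the suggestion, in characters
    accepted: bool                              # whether the developer accepted it
    surviving_chars_30s: Optional[int] = None   # characters still in the buffer 30s after acceptance

def acceptance_rate(events: List[CompletionEvent]) -> float:
    """Fraction of shown suggestions that were accepted."""
    if not events:
        return 0.0
    return sum(e.accepted for e in events) / len(events)

def persistence_30s(events: List[CompletionEvent]) -> float:
    """Of all accepted characters, the fraction still present 30 seconds later."""
    accepted = [e for e in events if e.accepted and e.surviving_chars_30s is not None]
    shown = sum(e.shown_chars for e in accepted)
    surviving = sum(e.surviving_chars_30s for e in accepted)
    return surviving / shown if shown else 0.0

# Example: two of three shown suggestions accepted; some accepted text later edited away.
events = [
    CompletionEvent(shown_chars=80, accepted=True, surviving_chars_30s=60),
    CompletionEvent(shown_chars=40, accepted=False),
    CompletionEvent(shown_chars=50, accepted=True, surviving_chars_30s=50),
]
print(f"acceptance rate: {acceptance_rate(events):.2f}")   # 2/3 ≈ 0.67
print(f"persistence@30s: {persistence_30s(events):.2f}")   # 110/130 ≈ 0.85
```

Under this sketch, a completion that is accepted but then largely deleted raises the acceptance rate while lowering persistence, which is why the two kinds of metric can diverge in the way the study reports.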



Published In

MAPS 2022: Proceedings of the 6th ACM SIGPLAN International Symposium on Machine Programming
June 2022
79 pages
ISBN: 9781450392730
DOI: 10.1145/3520312

This work is licensed under a Creative Commons Attribution 4.0 International License.

    Publisher

Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. code completion
    2. code synthesis
    3. neural networks
    4. productivity

    Qualifiers

    • Research-article

    Conference

    MAPS '22

Article Metrics

• Downloads (last 12 months): 4,312
• Downloads (last 6 weeks): 379

Cited By
• (2024) Non-Expert Programmers in the Generative AI Future. Proceedings of the 3rd Annual Meeting of the Symposium on Human-Computer Interaction for Work, 1-19. https://doi.org/10.1145/3663384.3663393. Online publication date: 25-Jun-2024.
• (2024) Significant Productivity Gains through Programming with Large Language Models. Proceedings of the ACM on Human-Computer Interaction, 8(EICS), 1-29. https://doi.org/10.1145/3661145. Online publication date: 17-Jun-2024.
• (2024) “It would work for me too”: How Online Communities Shape Software Developers’ Trust in AI-Powered Code Generation Tools. ACM Transactions on Interactive Intelligent Systems, 14(2), 1-39. https://doi.org/10.1145/3651990. Online publication date: 9-Mar-2024.
• (2024) Using GitHub Copilot for Test Generation in Python: An Empirical Study. Proceedings of the 5th ACM/IEEE International Conference on Automation of Software Test (AST 2024), 45-55. https://doi.org/10.1145/3644032.3644443. Online publication date: 15-Apr-2024.
• (2024) Quality Assessment of ChatGPT Generated Code and their Use by Developers. Proceedings of the 21st International Conference on Mining Software Repositories, 152-156. https://doi.org/10.1145/3643991.3645071. Online publication date: 15-Apr-2024.
• (2024) Unveiling ChatGPT's Usage in Open Source Projects: A Mining-based Study. Proceedings of the 21st International Conference on Mining Software Repositories, 571-583. https://doi.org/10.1145/3643991.3644918. Online publication date: 15-Apr-2024.
• (2024) Understanding Regular Expression Denial of Service (ReDoS): Insights from LLM-Generated Regexes and Developer Forums. Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension, 190-201. https://doi.org/10.1145/3643916.3644424. Online publication date: 15-Apr-2024.
• (2024) "You're on a bicycle with a little motor": Benefits and Challenges of Using AI Code Assistants. Proceedings of the 2024 IEEE/ACM 17th International Conference on Cooperative and Human Aspects of Software Engineering, 144-152. https://doi.org/10.1145/3641822.3641882. Online publication date: 14-Apr-2024.
• (2024) How much SPACE do metrics have in GenAI assisted software development? Proceedings of the 17th Innovations in Software Engineering Conference, 1-5. https://doi.org/10.1145/3641399.3641419. Online publication date: 22-Feb-2024.
• (2024) Take It, Leave It, or Fix It: Measuring Productivity and Trust in Human-AI Collaboration. Proceedings of the 29th International Conference on Intelligent User Interfaces, 370-384. https://doi.org/10.1145/3640543.3645198. Online publication date: 18-Mar-2024.
