Research article | Open access
DOI: 10.1145/3558489.3559072

Assessing the quality of GitHub Copilot’s code generation

Published: 09 November 2022

  • Abstract

    The introduction of GitHub’s new code generation tool, GitHub Copilot, seems to be the first well-established instance of an AI pair programmer. GitHub Copilot has access to a large number of open-source projects, enabling it to draw on more extensive code in various programming languages than other code generation tools. Although the initial and informal assessments are promising, a systematic evaluation is needed to explore the limits and benefits of GitHub Copilot. The main objective of this study is to assess the quality of the code generated by GitHub Copilot. We also aim to evaluate the impact of the quality and variety of input parameters fed to GitHub Copilot. To achieve this aim, we created an experimental setup for evaluating the generated code in terms of validity, correctness, and efficiency. Our results suggest that GitHub Copilot was able to generate valid code with a 91.5% success rate. In terms of code correctness, out of 164 problems, 47 (28.7%) were generated correctly, 84 (51.2%) partially correctly, and 33 (20.1%) incorrectly. Our empirical analysis shows that GitHub Copilot is a promising tool based on the results we obtained; however, further and more comprehensive assessment is needed in the future.
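    The validity/correctness classification described in the abstract can be sketched as a minimal test harness. Everything below — the `assess_validity` and `assess_correctness` helpers, the parse-based validity criterion, and the example task — is a hypothetical illustration of the general idea, not the authors' actual experimental setup:

```python
import ast

def assess_validity(source: str) -> bool:
    """A generated solution counts as *valid* if it parses and defines
    a function (an assumed criterion, for illustration only)."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    return any(isinstance(node, ast.FunctionDef) for node in tree.body)

def assess_correctness(source: str, entry: str, cases) -> str:
    """Classify a solution as 'correct', 'partially correct', or
    'incorrect' by the fraction of test cases it passes."""
    namespace: dict = {}
    try:
        exec(source, namespace)          # run the generated code
        func = namespace[entry]
        passed = sum(1 for args, want in cases if func(*args) == want)
    except Exception:
        return "incorrect"
    if passed == len(cases):
        return "correct"
    return "partially correct" if passed > 0 else "incorrect"

# Example: a generated solution that handles only non-negative input
generated = "def absolute(x):\n    return x if x > 0 else 0\n"
cases = [((3,), 3), ((-4,), 4), ((0,), 0)]
print(assess_validity(generated))                       # → True
print(assess_correctness(generated, "absolute", cases)) # → partially correct
```

    Under this scheme, a solution that fails to parse is invalid, one that passes every test case is correct, and one that passes only some (like the example, which mishandles negative input) is partially correct — mirroring the three outcome categories reported in the abstract.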




    Published In

    PROMISE 2022: Proceedings of the 18th International Conference on Predictive Models and Data Analytics in Software Engineering
    November 2022
    101 pages
    ISBN:9781450398602
    DOI:10.1145/3558489
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. AI pair programmer
    2. GitHub Copilot
    3. code completion
    4. code generation
    5. empirical study

    Conference

    PROMISE '22

    Acceptance Rates

    Overall Acceptance Rate 98 of 213 submissions, 46%

    Article Metrics

    • Downloads (Last 12 months)3,299
    • Downloads (Last 6 weeks)257
    Reflects downloads up to 14 Aug 2024

    Cited By
    • (2024) The Current State of Generative Artificial Intelligence Tools for Accessibility in Product Development. Nafath, 9:26. https://doi.org/10.54455/MCN2605 (30-Jul-2024)
    • (2024) Framework for evaluating code generation ability of large language models. ETRI Journal, 46:1, 106–117. https://doi.org/10.4218/etrij.2023-0357 (14-Feb-2024)
    • (2024) Program Code Generation with Generative AIs. Algorithms, 17:2, 62. https://doi.org/10.3390/a17020062 (31-Jan-2024)
    • (2024) Towards AI for Software Systems. Proceedings of the 1st ACM International Conference on AI-Powered Software, 79–84. https://doi.org/10.1145/3664646.3664767 (10-Jul-2024)
    • (2024) Significant Productivity Gains through Programming with Large Language Models. Proceedings of the ACM on Human-Computer Interaction, 8:EICS, 1–29. https://doi.org/10.1145/3661145 (17-Jun-2024)
    • (2024) Do Large Language Models Pay Similar Attention Like Human Programmers When Generating Code? Proceedings of the ACM on Software Engineering, 1:FSE, 2261–2284. https://doi.org/10.1145/3660807 (12-Jul-2024)
    • (2024) State Reconciliation Defects in Infrastructure as Code. Proceedings of the ACM on Software Engineering, 1:FSE, 1865–1888. https://doi.org/10.1145/3660790 (12-Jul-2024)
    • (2024) How Do Information Technology Professionals Use Generative Artificial Intelligence? Proceedings of the 20th Brazilian Symposium on Information Systems, 1–9. https://doi.org/10.1145/3658321.3658367 (23-May-2024)
    • (2024) An Assessment of ML-based Sentiment Analysis for Intelligent Web Filtering. Proceedings of the 17th International Conference on PErvasive Technologies Related to Assistive Environments, 80–87. https://doi.org/10.1145/3652037.3652039 (26-Jun-2024)
    • (2024) Performance, Workload, Emotion, and Self-Efficacy of Novice Programmers Using AI Code Generation. Proceedings of the 2024 on Innovation and Technology in Computer Science Education V. 1, 290–296. https://doi.org/10.1145/3649217.3653615 (3-Jul-2024)
