DOI: 10.1145/3611643.3616323
Research Article

Natural Language to Code: How Far Are We?

Published: 30 November 2023
    Abstract

    A longstanding dream in software engineering research is to devise effective approaches for automating development tasks based on developers' informally specified intentions. Such intentions are generally in the form of natural language descriptions. In recent literature, a number of approaches have been proposed to automate tasks such as code search and even code generation based on natural language inputs. While these approaches vary in terms of technical design, their objective is the same: transforming a developer's intention into source code. The literature, however, lacks a comprehensive understanding of the effectiveness of existing techniques as well as their complementarity to each other. We propose to fill this gap through a large-scale empirical study in which we systematically evaluate natural language to code techniques. Specifically, we consider six state-of-the-art techniques targeting code search and four targeting code generation. Through extensive evaluations on a dataset of 22K+ natural language queries, our study reveals the following major findings: (1) code search techniques based on model pre-training are so far the most effective, while code generation techniques can also provide promising results; (2) complementarity widely exists among the existing techniques; and (3) combining the ten techniques improves performance by 35% compared with the most effective standalone technique. Finally, we propose a post-processing strategy to automatically integrate different techniques based on their generated code. Experimental results show that our devised strategy is both effective and extensible.
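
    To make finding (3) concrete, the sketch below shows one simple way outputs from several natural-language-to-code techniques could be pooled and re-ranked for a single query. It is a minimal illustration, not the post-processing strategy proposed in the paper: the names (merge_candidates, token_overlap), the consensus heuristic, and the use of token-level Jaccard similarity in place of a proper code similarity metric are all assumptions made here for illustration.

```python
# Hypothetical sketch: pool candidate snippets produced by several code search /
# code generation techniques for one query, then re-rank them by how strongly
# each candidate agrees with the others. This is NOT the paper's actual strategy;
# it only illustrates the idea of exploiting complementarity between techniques.
from typing import Callable, Dict, List


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over whitespace tokens (a crude stand-in for a code similarity metric)."""
    ta, tb = set(a.split()), set(b.split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def merge_candidates(
    query: str,
    techniques: Dict[str, Callable[[str], List[str]]],
    top_k: int = 5,
) -> List[str]:
    """Collect candidates from every technique and keep those the other candidates agree with most."""
    candidates: List[str] = []
    for run in techniques.values():
        candidates.extend(run(query))  # each technique maps a query to candidate snippets

    def consensus(snippet: str) -> float:
        # Average similarity of this snippet to every other pooled candidate.
        others = [c for c in candidates if c is not snippet]
        if not others:
            return 0.0
        return sum(token_overlap(snippet, o) for o in others) / len(others)

    return sorted(candidates, key=consensus, reverse=True)[:top_k]
```

    In this hypothetical setup, a code search technique would be plugged in as a function returning its top-ranked snippets and a code generation technique as a function returning its sampled outputs; the paper's own strategy likewise integrates techniques based on the code they return, but its concrete mechanism is described in the full text rather than here.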

    Supplementary Material

    Video (fse23main-p713-p-video.mp4)
    "Code reuse is an important part of software development. The adoption of code reuse practices is especially common among Node.js developers. The Node.js package manager, NPM, indexes over 1 Million packages and developers often seek out packages to solve programming tasks. Due to the vast number of packages, selecting the right package is difficult and time consuming. With the goal of improving productivity of developers that heavily reuse code through third-party packages, we present Node Code Query (NCQ), a Read-Eval-Print-Loop environment that allows developers to 1) search for NPM packages using natural language queries, 2) search for code snippets related to those packages, 3) automatically correct errors in these code snippets, 4) quickly setup new environments for testing those snippets, and 5) transition between search and editing modes. In two user studies with a total of 20 participants, we find that participants begin programming faster and conclude tasks faster with NCQ than with baseline approaches, and that they like, among other features, the search for code snippets and packages. Our results suggest that NCQ makes Node.js developers more efficient in reusing code. "


    Cited By

    • (2024) Fusing Code Searchers. IEEE Transactions on Software Engineering, 50(7), 1852–1866. https://doi.org/10.1109/TSE.2024.3403042. Online publication date: July 2024.


    Information & Contributors


    Published In

    ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
    November 2023
    2215 pages
    ISBN:9798400703270
    DOI:10.1145/3611643
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 30 November 2023


    Author Tags

    1. Code Generation
    2. Code Search
    3. Pre-Training Technique

    Qualifiers

    • Research-article


    Conference

    ESEC/FSE '23

    Acceptance Rates

    Overall Acceptance Rate: 112 of 543 submissions (21%)


    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 265
    • Downloads (Last 6 weeks): 16
    Reflects downloads up to 14 Aug 2024.

