Better Together? An Evaluation of AI-Supported Code Translation

JD Weisz, M Muller, SI Ross, F Martinez… - Proceedings of the 27th …, 2022 - dl.acm.org
Proceedings of the 27th International Conference on Intelligent User Interfaces, 2022
Generative machine learning models have recently been applied to source code, for use cases including translating code between programming languages, creating documentation from code, and auto-completing methods. Yet, state-of-the-art models often produce code that is erroneous or incomplete. In a controlled study with 32 software engineers, we examined whether such imperfect outputs are helpful in the context of Java-to-Python code translation. When aided by the outputs of a code translation model, participants produced code with fewer errors than when working alone. We also examined how the quality and quantity of AI translations affected the work process and quality of outcomes, and observed that providing multiple translations had a larger impact on the translation process than varying the quality of provided translations. Our results tell a complex, nuanced story about the benefits of generative code models and the challenges software engineers face when working with their outputs. Our work motivates the need for intelligent user interfaces that help software engineers effectively work with generative code models in order to understand and evaluate their outputs and achieve superior outcomes to working alone.