Using pre-trained models to boost code review automation

R Tufano, S Masiero, A Mastropaolo… - Proceedings of the 44th …, 2022 - dl.acm.org
Proceedings of the 44th International Conference on Software Engineering, 2022
Code review is a practice widely adopted in open-source and industrial projects. Given the non-negligible cost of such a process, researchers have started investigating the possibility of automating specific code review tasks. We recently proposed Deep Learning (DL) models targeting the automation of two tasks: the first model takes as input code submitted for review and implements in it changes likely to be recommended by a reviewer; the second takes as input the submitted code and a reviewer comment posted in natural language and automatically implements the change required by the reviewer. While the preliminary results we achieved are encouraging, both models were tested in rather simple code review scenarios that substantially simplify the targeted problem. This was partly due to the choices we made when designing both the technique and the experiments. In this paper, we build on top of that work by demonstrating that a pre-trained Text-To-Text Transfer Transformer (T5) model can outperform previous DL models for automating code review tasks. We also conduct our experiments on a larger, more realistic, and more challenging dataset of code review activities.
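The two tasks described in the abstract can both be framed as text-to-text problems, which is what makes a model such as T5 applicable. The following is a minimal sketch of that framing only, not the authors' actual input format: the `ReviewExample` class, the `to_text_pair` function, and the `<code>` and `<comment>` delimiters are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ReviewExample:
    """One code review instance (hypothetical structure, for illustration)."""
    submitted_code: str                      # code as submitted for review
    revised_code: str                        # code after the review
    reviewer_comment: Optional[str] = None   # natural-language comment, if any


def to_text_pair(example: ReviewExample) -> Tuple[str, str]:
    """Build a (source, target) string pair for a text-to-text model.

    The <code> and <comment> delimiters are assumptions made for this
    sketch; the abstract does not specify how inputs are encoded.
    """
    if example.reviewer_comment is None:
        # Task 1: submitted code -> revised code.
        source = f"<code>{example.submitted_code}</code>"
    else:
        # Task 2: submitted code + reviewer comment -> revised code.
        source = (
            f"<comment>{example.reviewer_comment}</comment>"
            f"<code>{example.submitted_code}</code>"
        )
    return source, example.revised_code


# Usage: the same submitted code, framed once for each task.
code_only = ReviewExample(
    submitted_code="int size = list.size();",
    revised_code="final int size = list.size();",
)
with_comment = ReviewExample(
    submitted_code="int size = list.size();",
    revised_code="final int size = list.size();",
    reviewer_comment="Please make this variable final.",
)
task1_source, task1_target = to_text_pair(code_only)
task2_source, task2_target = to_text_pair(with_comment)
```

In both cases the target is the revised code; only the source differs, depending on whether a reviewer comment is available.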