An Empirical Study on Learning Bug-Fixing Patches in the Wild via Neural Machine Translation

Published: 02 September 2019
    Abstract

    Millions of open source projects with numerous bug fixes are available in code repositories. These software development histories can be leveraged to learn how to fix common programming bugs. To explore this potential, we perform an empirical study to assess the feasibility of using Neural Machine Translation techniques to learn bug-fixing patches for real defects. First, we mine millions of bug-fixes from the change histories of projects hosted on GitHub to extract meaningful examples of such fixes. Next, we abstract the buggy and corresponding fixed code and use the resulting pairs to train an Encoder-Decoder model that translates buggy code into its fixed version. In our empirical investigation, we found that such a model is able to fix thousands of unique buggy methods in the wild. Overall, the model predicts the fixed patches written by developers in 9-50% of the cases, depending on the number of candidate patches we allow it to generate. The model can also emulate a variety of Abstract Syntax Tree operations and generates candidate patches in a split second.
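    The dependence of the success rate on the number of candidate patches can be illustrated with a small sketch. The Python snippet below is not the authors' evaluation code; it assumes a hypothetical setup in which each buggy method comes with a ranked list of candidate patches and one developer-written fix, and it counts a prediction as successful when the developer fix appears among the first k candidates. The function name `perfect_prediction_rate` and the toy data are invented for illustration.

```python
from typing import Sequence


def perfect_prediction_rate(
    candidates: Sequence[Sequence[str]],
    developer_fixes: Sequence[str],
    k: int,
) -> float:
    """Fraction of buggy methods whose developer fix is among the first k candidates.

    Assumption: candidates[i] is a ranked list of patches for buggy method i,
    and developer_fixes[i] is the fix a developer actually wrote for it.
    """
    if len(candidates) != len(developer_fixes):
        raise ValueError("each buggy method needs exactly one developer fix")
    if not developer_fixes:
        return 0.0
    hits = sum(
        1
        for ranked, fix in zip(candidates, developer_fixes)
        if fix in ranked[:k]  # exact string match against the top-k candidates
    )
    return hits / len(developer_fixes)


# Toy data (hypothetical): four buggy methods, two ranked candidates each.
candidates = [
    ["return a + b ;", "return a - b ;"],
    ["if ( x > 0 )", "if ( x >= 0 )"],
    ["i ++ ;", "i -- ;"],
    ["return null ;", "return this ;"],
]
developer_fixes = [
    "return a + b ;",
    "if ( x >= 0 )",
    "i += 2 ;",
    "return this ;",
]

# Allowing more candidates per bug can only keep or raise the rate.
rates = {k: perfect_prediction_rate(candidates, developer_fixes, k) for k in (1, 2)}
for k, rate in rates.items():
    print(f"top-{k}: {rate:.2f}")
```

    On this toy data the rate rises as k grows, which mirrors in miniature why the abstract reports a range rather than a single figure.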

      Published In

      ACM Transactions on Software Engineering and Methodology  Volume 28, Issue 4
      October 2019
      231 pages
      ISSN:1049-331X
      EISSN:1557-7392
      DOI:10.1145/3360049
      Editor: Mauro Pezzè
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 02 September 2019
      Accepted: 01 May 2019
      Revised: 01 February 2019
      Received: 01 September 2018
      Published in TOSEM Volume 28, Issue 4

      Author Tags

      1. Neural machine translation
      2. bug-fixes

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • NSF
      • SNF
      • Swiss National Science Foundation for the CCQR project

      Article Metrics

      • Downloads (last 12 months): 908
      • Downloads (last 6 weeks): 93
      Reflects downloads up to 14 Aug 2024

      Cited By

      • (2024) Exploring the Potential of Pre-Trained Language Models of Code for Automated Program Repair. Electronics 13:7 (1200). DOI: 10.3390/electronics13071200. Online publication date: 25-Mar-2024
      • (2024) Multi-mechanism neural machine translation framework for automatic program repair. Journal of Intelligent & Fuzzy Systems 46:4 (7859-7873). DOI: 10.3233/JIFS-234037. Online publication date: 18-Apr-2024
      • (2024) T-RAP: A Template-guided Retrieval-Augmented Vulnerability Patch Generation Approach. Proceedings of the 15th Asia-Pacific Symposium on Internetware (105-114). DOI: 10.1145/3671016.3672506. Online publication date: 24-Jul-2024
      • (2024) MineCPP: Mining Bug Fix Pairs and Their Structures. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering (552-556). DOI: 10.1145/3663529.3663797. Online publication date: 10-Jul-2024
      • (2024) Reality Check: Assessing GPT-4 in Fixing Real-World Software Vulnerabilities. Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering (252-261). DOI: 10.1145/3661167.3661207. Online publication date: 18-Jun-2024
      • (2024) A Deep Dive into Large Language Models for Automated Bug Localization and Repair. Proceedings of the ACM on Software Engineering 1:FSE (1471-1493). DOI: 10.1145/3660773. Online publication date: 12-Jul-2024
      • (2024) Unveiling ChatGPT's Usage in Open Source Projects: A Mining-based Study. Proceedings of the 21st International Conference on Mining Software Repositories (571-583). DOI: 10.1145/3643991.3644918. Online publication date: 15-Apr-2024
      • (2024) BugsPHP: A dataset for Automated Program Repair in PHP. Proceedings of the 21st International Conference on Mining Software Repositories (128-132). DOI: 10.1145/3643991.3644878. Online publication date: 15-Apr-2024
      • (2024) Towards Summarizing Code Snippets Using Pre-Trained Transformers. Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension (1-12). DOI: 10.1145/3643916.3644400. Online publication date: 15-Apr-2024
      • (2024) Towards AI-Assisted Synthesis of Verified Dafny Methods. Proceedings of the ACM on Software Engineering 1:FSE (812-835). DOI: 10.1145/3643763. Online publication date: 12-Jul-2024
