An Empirical Study on Learning Bug-Fixing Patches in the Wild via Neural Machine Translation

Authors:

Michele Tufano,

Gabriele Bavota,

Massimiliano Di Penta,

Denys Poshyvanyk

ACM Transactions on Software Engineering and Methodology (TOSEM), Volume 28, Issue 4

Article No.: 19, Pages 1 - 29

https://doi.org/10.1145/3340544

Published: 02 September 2019

Abstract

Millions of open source projects with numerous bug fixes are available in code repositories. This proliferation of software development histories can be leveraged to learn how to fix common programming bugs. To explore such a potential, we perform an empirical study to assess the feasibility of using Neural Machine Translation techniques for learning bug-fixing patches for real defects. First, we mine millions of bug-fixes from the change histories of projects hosted on GitHub in order to extract meaningful examples of such bug-fixes. Next, we abstract the buggy and corresponding fixed code, and use them to train an Encoder-Decoder model able to translate buggy code into its fixed version. In our empirical investigation, we found that such a model is able to fix thousands of unique buggy methods in the wild. Overall, this model is capable of predicting fixed patches generated by developers in 9–50% of the cases, depending on the number of candidate patches we allow it to generate. Also, the model is able to emulate a variety of different Abstract Syntax Tree operations and generate candidate patches in a split second.
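
The reported range depends on how many candidate patches the model may propose for each buggy method. As a rough illustration only, the sketch below shows one way such a figure could be computed: a prediction counts as a hit when the developer-written fix appears verbatim among the first k candidates. The function name `perfect_prediction_rate`, the whitespace-tokenized strings, and the toy data are assumptions made for this example and are not taken from the article.

```python
from typing import Sequence


def perfect_prediction_rate(
    candidates: Sequence[Sequence[str]],
    references: Sequence[str],
    k: int,
) -> float:
    """Fraction of buggy methods whose developer fix is among the first k candidates.

    candidates[i] is the ranked list of candidate patches for buggy method i
    (best first); references[i] is the developer-written fix for that method.
    Both are hypothetical inputs for this sketch.
    """
    if len(candidates) != len(references):
        raise ValueError("candidates and references must have the same length")
    if not references:
        return 0.0
    # Exact string match against the top-k slice of each ranked list.
    hits = sum(1 for cands, ref in zip(candidates, references) if ref in cands[:k])
    return hits / len(references)


# Toy data (invented): four buggy methods, two ranked candidates each.
candidates = [
    ["return a + b ;", "return a - b ;"],
    ["if ( x > 0 )", "if ( x >= 0 )"],
    ["i ++ ;", "i -- ;"],
    ["return null ;", "return x ;"],
]
references = ["return a + b ;", "if ( x >= 0 )", "i += 2 ;", "return x ;"]

rate_at_1 = perfect_prediction_rate(candidates, references, k=1)
rate_at_2 = perfect_prediction_rate(candidates, references, k=2)
print(rate_at_1, rate_at_2)
```

Because the top-k slice only grows with k, the rate can stay the same or rise as more candidates are allowed, which is consistent with the abstract reporting a range rather than a single number.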

Index Terms

An Empirical Study on Learning Bug-Fixing Patches in the Wild via Neural Machine Translation
1. Software and its engineering
  1. Software creation and management
    1. Software post-development issues
      1. Maintaining software

Published In

ACM Transactions on Software Engineering and Methodology Volume 28, Issue 4

October 2019

231 pages

ISSN: 1049-331X

EISSN: 1557-7392

DOI: 10.1145/3360049

Editor:
Mauro Pezzè
Università della Svizzera italiana and Università di Milano-Bicocca, Switzerland

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 September 2019

Accepted: 01 May 2019

Revised: 01 February 2019

Received: 01 September 2018

Published in TOSEM Volume 28, Issue 4

Qualifiers

Research-article
Research
Refereed

Funding Sources

NSF
SNF
Swiss National Science Foundation for the CCQR project

Bibliometrics

Total Citations: 176
Total Downloads: 3,300
Downloads (last 12 months): 908
Downloads (last 6 weeks): 93

Reflects downloads up to 14 Aug 2024
