research-article

Open access

code2vec: learning distributed representations of code

Authors:

Meital Zilberstein,

Eran Yahav

Proceedings of the ACM on Programming Languages, Volume 3, Issue POPL

Article No.: 40, Pages 1 - 29

https://doi.org/10.1145/3290353

Published: 02 January 2019

Abstract

We present a neural model for representing snippets of code as continuous distributed vectors ("code embeddings"). The main idea is to represent a code snippet as a single fixed-length code vector, which can be used to predict semantic properties of the snippet. To this end, code is first decomposed into a collection of paths in its abstract syntax tree. Then, the network learns the atomic representation of each path while simultaneously learning how to aggregate a set of them.
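
The aggregation step can be illustrated with a minimal sketch. It assumes that each path has already been mapped to a vector and that the set is combined with a softmax-weighted average (a simple attention mechanism). The abstract does not specify the aggregation mechanism, so the weighting scheme, the names `aggregate` and `attention_vector`, and the random vectors standing in for learned parameters are all illustrative assumptions.

```python
import math
import random


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(path_vectors, attention_vector):
    """Combine any number of path vectors into one fixed-length vector.

    Assumption: a path's score is its dot product with a single
    attention vector, and the result is the weighted average.
    """
    scores = [
        sum(p * a for p, a in zip(vec, attention_vector))
        for vec in path_vectors
    ]
    weights = softmax(scores)
    dim = len(attention_vector)
    code_vector = [
        sum(w * vec[i] for w, vec in zip(weights, path_vectors))
        for i in range(dim)
    ]
    return code_vector, weights


# Hypothetical stand-ins for learned parameters (random, not trained).
random.seed(0)
DIM = 4
path_vectors = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(3)]
attention_vector = [random.uniform(-1, 1) for _ in range(DIM)]

code_vector, weights = aggregate(path_vectors, attention_vector)
```

Whatever the number of paths in the snippet, `code_vector` has length `DIM`, which is what lets snippets of different sizes share a single fixed-length representation.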

We demonstrate the effectiveness of our approach by using it to predict a method's name from the vector representation of its body. We evaluate our approach by training a model on a dataset of 12M methods. We show that code vectors trained on this dataset can predict method names from files that were unobserved during training. Furthermore, we show that our model learns useful method name vectors that capture semantic similarities, combinations, and analogies.
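
To make "similarities, combinations, and analogies" concrete, the sketch below applies the usual vector-arithmetic test to a handful of method name vectors. The vectors are hand-made toy values, not outputs of the trained model, and the names in `TOY_VECTORS` and the helper `nearest` are hypothetical choices for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(query, vectors, exclude=()):
    """Return the name whose vector is most similar to the query."""
    candidates = {n: v for n, v in vectors.items() if n not in exclude}
    return max(candidates, key=lambda n: cosine(query, candidates[n]))


# Toy, hand-made vectors (an assumption; real vectors are learned).
TOY_VECTORS = {
    "get": [1.0, 0.0, 0.0],
    "set": [0.0, 1.0, 0.0],
    "getName": [1.0, 0.0, 1.0],
    "setName": [0.0, 1.0, 1.0],
    "size": [0.2, 0.1, -1.0],
}

# Analogy test: getName - get + set should land near setName.
query = [
    a - b + c
    for a, b, c in zip(
        TOY_VECTORS["getName"], TOY_VECTORS["get"], TOY_VECTORS["set"]
    )
]
result = nearest(query, TOY_VECTORS, exclude=("getName", "get", "set"))
```

With these toy values `result` is `"setName"`. Excluding the three input names from the candidates is the common convention in analogy tests, since the query otherwise tends to stay closest to one of its own inputs.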

A comparison of our approach to previous techniques over the same dataset shows an improvement of more than 75%, making it the first to successfully predict method names based on a large, cross-project corpus. Our trained model, visualizations and vector similarities are available as an interactive online demo at http://code2vec.org. The code, data and trained models are available at https://github.com/tech-srl/code2vec.

Index Terms

Computing methodologies → Machine learning → Machine learning approaches → Learning latent representations

Software and its engineering → Software notations and tools → General programming languages

Published In

Proceedings of the ACM on Programming Languages Volume 3, Issue POPL

January 2019

2275 pages

EISSN:2475-1421

DOI:10.1145/3302515

Copyright © 2019 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Bibliometrics

Total citations: 706

Total downloads: 12,095

Downloads (last 12 months): 1,760

Downloads (last 6 weeks): 162

Reflects downloads up to 04 Sep 2024
