research-article

Can large language models reason about program invariants?

AUTHORs:

Charles Sutton,

Pengcheng YinAuthors Info & Claims

ICML'23: Proceedings of the 40th International Conference on Machine Learning

Article No.: 1144, Pages 27496 - 27520

Published: 23 July 2023 Publication History

Abstract

Identifying invariants is an important program analysis task with applications towards program understanding, bug finding, vulnerability analysis, and formal verification. Existing tools for identifying program invariants rely on dynamic analysis, requiring traces collected from multiple executions in order to produce reliable invariants. We study the application of large language models to invariant prediction, finding that models trained on source code and fine-tuned for invariant generation can perform invariant prediction as static rather than dynamic analysis. Using a scratch-pad approach where invariants are predicted sequentially through a program gives the best performance, finding invariants statically of quality comparable to those obtained by a dynamic analysis tool with access to five program traces.

References

[1]

Anand, S., Burke, E. K., Chen, T. Y., Clark, J., Cohen, M. B., Grieskamp, W., Harman, M., Harrold, M. J., McMinn, P., Bertolino, A., Jenny Li, J., and Zhu, H. An orchestrated survey of methodologies for automated software test case generation. J. Syst. Softw., 86(8):1978-2001, August 2013.

Digital Library

[2]

Austin, J., Odena, A., Nye, M., Bosma, M., Michalewski, H., Dohan, D., Jiang, E., Cai, C., Terry, M., Le, Q., and Sutton, C. Program synthesis with large language models. August 2021.

[3]

Bieber, D., Sutton, C., Larochelle, H., and Tarlow, D. Learning to execute programs with instruction pointer attention graph neural networks. Advances in Neural Information Processing Systems, 33:8626-8637, 2020.

[4]

Bieber, D., Goel, R., Zheng, D., Larochelle, H., and Tarlow, D. Static prediction of runtime errors by learning to execute programs with external resource descriptions. March 2022.

[5]

Brockschmidt, M., Chen, Y., Cook, B., Kohli, P., and Tarlow, D. Learning to decipher the heap for program verification. In ICML Workshop on Constructive Machine Learning (CML), 2015.

[6]

Brockschmidt, M., Chen, Y., Kohli, P., Krishna, S., and Tarlow, D. Learning shape analysis. In Static Analysis Symposium (SAS), 2017.

[7]

Cadar, C., Dunbar, D., Engler, D. R., et al. KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs. In OSDI, volume 8, pp. 209-224, 2008.

Digital Library

[8]

Chen, M., Tworek, J., Jun, H., Yuan, Q., Ponde, H., Kaplan, J., Edwards, H., Burda, Y., Joseph, N., Brockman, G., Ray, A., Puri, R., Krueger, G., Petrov, M., Khlaaf, H., Sastry, G., Mishkin, P., Chan, B., Gray, S., Ryder, N., Pavlov, M., Power, A., Kaiser, L., Bavarian, M., Winter, C., Tillet, P., Such, F., Cummings, D., Plappert, M., Chantzis, F., Barnes, E., Herbert-Voss, A., Guss, W., Nichol, A., Babuschkin, I., Balaji, S., Jain, S., Carr, A., Leike, J., Achiam, J., Misra, V., Morikawa, E., Radford, A., Knight, M., Brundage, M., Murati, M., Mayer, K., Welinder, P., McGrew, B., Amodei, D., McCandlish, S., Sutskever, I., and Zaremba, W. Evaluating large language models trained on code. July 2021.

[9]

Chowdhery, A., Narang, S., Devlin, J., Bosma, M., Mishra, G., Roberts, A., Barham, P., Chung, H. W., Sutton, C., Gehrmann, S., et al. Palm: Scaling language modeling with pathways. arXiv preprint arXiv:2204.02311, 2022.

[10]

Claessen, K. and Hughes, J. QuickCheck: a lightweight tool for random testing of haskell programs. In Proceedings of the fifth ACM SIGPLAN international conference on Functional programming, ICFP '00, pp. 268-279, New York, NY, USA, September 2000. Association for Computing Machinery.

Digital Library

[11]

Cousot, P. and Cousot, R. Abstract interpretation. In Proceedings of the 4th ACM SIGACT-SIGPLAN symposium on Principles of programming languages - POPL '77, New York, New York, USA, 1977. ACM Press.

Digital Library

[12]

Dijkstra, E. W. Guarded commands, nondeterminacy and formal derivation of programs. Commun. ACM, 18(8): 453-457, 1975.

Digital Library

[13]

Ellis, K., Nye, M. I., Pu, Y., Sosa, F., Tenenbaum, J. B., and Solar-Lezama, A. Write, execute, assess: Program synthesis with a REPL. Advances in Neural Information Processing Systems {(NeurIPS)}, 2019.

[14]

Ernst, M. D., Perkins, J. H., Guo, P. J., McCamant, S., Pacheco, C., Tschantz, M. S., and Xiao, C. The daikon system for dynamic detection of likely invariants. Science of computer programming, 69(1-3):35-45, 2007.

[15]

Flanagan, C. and Leino, K. R. M. Houdini, an annotation assistant for esc/java. In FME 2001: Formal Methods for Increasing Software Productivity: International Symposium of Formal Methods Europe Berlin, Germany, March 12-16, 2001 Proceedings, pp. 500-517. Springer, 2001.

[16]

Galeotti, J. P., Furia, C. A., May, E., Fraser, G., and Zeller, A. Inferring loop invariants by mutation, dynamic analysis, and static checking. IEEE Transactions on Software Engineering, 41(10):1019-1037, 2015.

Digital Library

[17]

Garg, P., Löding, C., Madhusudan, P., and Neider, D. ICE: A robust framework for learning invariants. In Computer Aided Verification, Lecture notes in computer science, pp. 69-87. Springer International Publishing, Cham, 2014.

Digital Library

[18]

Garg, P., Neider, D., Madhusudan, P., and Roth, D. Learning invariants using decision trees and implication counter-examples. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '16, pp. 499-512, New York, NY, USA, January 2016. Association for Computing Machinery.

[19]

Hellendoorn, V. J., Devanbu, P. T., Polozov, O., and Marron, M. Are my invariants valid? a learning approach. March 2019.

[20]

Hoare, C. A. R. An axiomatic basis for computer programming. Commun. ACM, 12(10):576-580, October 1969.

Digital Library

[21]

Kaplan, J., McCandlish, S., Henighan, T. J., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., and Amodei, D. Scaling laws for neural language models. ArXiv, abs/2001.08361, 2020.

[22]

Li, Y., Choi, D., Chung, J., Kushman, N., Schrittwieser, J., Leblond, R., Eccles, T., Keeling, J., Gimeno, F., Dal Lago, A., Hubert, T., Choy, P., de Masson d'Autume, C., Babuschkin, I., Chen, X., Huang, P.-S., Welbl, J., Gowal, S., Cherepanov, A., Molloy, J., Mankowitz, D., Sutherland Robson, E., Kohli, P., de Freitas, N., Kavukcuoglu, K., and Vinyals, O. Competition-level code generation with alphacode. arXiv preprint arXiv:2203.07814, 2022.

[23]

Liu, L., Sheridan, D., Tuohy, W., and Vasudevan, S. A technique for test coverage closure using GoldMine. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 31(5): 790-803, May 2012.

Digital Library

[24]

Nguyen, T., Antonopoulos, T., Ruef, A., and Hicks, M. Counterexample-guided approach to finding numerical invariants. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, pp. 605-615, 2017.

Digital Library

[25]

Nye, M., Andreassen, A. J., Gur-Ari, G., Michalewski, H., Austin, J., Bieber, D., Dohan, D., Lewkowycz, A., Bosma, M., Luan, D., et al. Show your work: Scratchpads for intermediate computation with language models. arXiv preprint arXiv:2112.00114, 2021a.

[26]

Nye, M., Pu, Y., Bowers, M., Andreas, J., Tenenbaum, J. B., and Solar-Lezama, A. Representing partial programs with blended abstract semantics. In International Conference on Learning Representations (ICLR), 2021b.

[27]

Odena, A. and Sutton, C. Learning to represent programs with property signatures. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=rylHspEKPr.

[28]

Pacheco, C. and Ernst, M. D. Eclat: Automatic generation and classification of test inputs. In ECOOP 2005 - Object-Oriented Programming, pp. 504-527. Springer Berlin Heidelberg, 2005.

Digital Library

[29]

Pacheco, C., Lahiri, S. K., Ernst, M. D., and Ball, T. Feedback-directed random test generation. In ICSE 2007, Proceedings of the 29th International Conference on Software Engineering, pp. 75-84, Minneapolis, MN, USA, May 2007.

Digital Library

[30]

Padhi, S., Sharma, R., and Millstein, T. Loopinvgen: A loop invariant generator based on precondition inference. arXiv preprint arXiv:1707.02029, 2017.

[31]

Petersen, M. K. A. An evaluation of daikon: A dynamic invariant detector.

[32]

Polikarpova, N., Ciupa, I., and Meyer, B. A comparative study of programmer-written and automatically inferred contracts. In Proceedings of the eighteenth international symposium on Software testing and analysis, ISSTA '09, pp. 93-104, New York, NY, USA, July 2009. Association for Computing Machinery.

Digital Library

[33]

Ryan, G., Wong, J., Yao, J., Gu, R., and Jana, S. Cln2inv: learning loop invariants with continuous logic networks. arXiv preprint arXiv:1909.11542, 2019.

[34]

Sagdeo, P., Athavale, V., Kowshik, S., and Vasudevan, S. PRECIS: Inferring invariants using program path guided clustering. In 2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011), pp. 532-535, November 2011.

Digital Library

[35]

Sharma, R. and Aiken, A. From invariant checking to invariant inference using randomized search. Formal Methods in System Design, 48(3):235-256, 2016.

Digital Library

[36]

Sharma, R., Nori, A. V., and Aiken, A. Interpolants as classifiers. In International Conference on Computer Aided Verification, 2012.

Digital Library

[37]

Sharma, R., Gupta, S., Hariharan, B., Aiken, A., Liang, P., and Nori, A. V. A data driven approach for algebraic loop invariants. In Programming Languages and Systems, pp. 574-592. Springer Berlin Heidelberg, 2013a.

Digital Library

[38]

Sharma, R., Gupta, S., Hariharan, B., Aiken, A., Liang, P., and Nori, A. V. A data driven approach for algebraic loop invariants. In European Symposium on Programming, pp. 574-592. Springer, 2013b.

Digital Library

[39]

Sharma, R., Gupta, S., Hariharan, B., Aiken, A., and Nori, A. V. Verification as learning geometric concepts. In Static Analysis, Lecture notes in computer science, pp. 388-411. Springer Berlin Heidelberg, Berlin, Heidelberg, 2013c.

[40]

Shi, K., Dai, H., Ellis, K., and Sutton, C. CrossBeam: Learning to search in Bottom-Up program synthesis. In International Conference on Learning Representations (ICLR), 2022.

[41]

Si, X., Dai, H., Raghothaman, M., Naik, M., and Song, L. Learning loop invariants for program verification. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS'18, pp. 7762-7773, Red Hook, NY, USA, December 2018. Curran Associates Inc.

Digital Library

[42]

Si, X., Naik, A., Dai, H., Naik, M., and Song, L. Code2Inv: A deep learning framework for program verification. In Computer Aided Verification, Lecture notes in computer science, pp. 151-164. Springer International Publishing, Cham, 2020.

Digital Library

[43]

Tabachnyk, M. and Nikolov, S. Ml-enhanced code completion improves developer productivity. 2022. URL https://ai.googleblog.com/2022/07/ml-enhanced-code-completion-improves.html.

[44]

Zaremba, W. and Sutskever, I. Learning to execute. 2014.

[45]

Ziegler, A., Kalliamvakou, E., Li, X. A., Rice, A., Rifkin, D., Simister, S., Sittampalam, G., and Aftandilian, E. Productivity assessment of neural code completion. In Proceedings of the 6th ACM SIGPLAN International Symposium on Machine Programming, MAPS 2022, pp. 21-29, New York, NY, USA, 2022. Association for Computing Machinery. ISBN 9781450392730.

Digital Library

[46]

Zohar, A. and Wolf, L. Automatic program synthesis of long programs with a learned garbage collector. September 2018.

Cited By

Li HHao YZhai YQian Z(2024)Enhancing Static Analysis for Practical Bug Detection: An LLM-Integrated ApproachProceedings of the ACM on Programming Languages10.1145/36498288:OOPSLA1(474-499)Online publication date: 29-Apr-2024
https://dl.acm.org/doi/10.1145/3649828

Recommendations

Rotation and translation invariants of Gaussian-Hermite moments

Geometric moment invariants are widely used in many fields of image analysis and pattern recognition since their first introduction by Hu in 1962. A few years ago, Flusser has proved how to find the independent and complete set of geometric moment ...
Radial Zernike Moment Invariants
CIT '04: Proceedings of the The Fourth International Conference on Computer and Information Technology

Radial Zernike moment invariants are special case from the complex Zernike moment invariants. The radial and angular dependence of Zernike moments is naturally separable which makes them very suitable features for achieving totational invarinces. The ...
Generic radial orthogonal moment invariants for invariant image recognition

As the variation of parameters in Jacobi polynomial, Jacobi-Fourier moments can form various types of orthogonal moments: Legendre-Fourier moments, Orthogonal Fourier-Mellin moments, Zernike moments, pseudo-Zernike moments, and so on. In this paper, we ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings

ICML'23: Proceedings of the 40th International Conference on Machine Learning

July 2023

43479 pages

Copyright © 2023.

Publisher

JMLR.org

Publication History

Published: 23 July 2023

Qualifiers

Research-article
Research
Refereed limited

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

1
Total Citations
View Citations
0
Total Downloads

Downloads (Last 12 months)0
Downloads (Last 6 weeks)0

Reflects downloads up to 14 Aug 2024

Other Metrics

View Author Metrics

Citations

Cited By

Li HHao YZhai YQian Z(2024)Enhancing Static Analysis for Practical Bug Detection: An LLM-Integrated ApproachProceedings of the ACM on Programming Languages10.1145/36498288:OOPSLA1(474-499)Online publication date: 29-Apr-2024
https://dl.acm.org/doi/10.1145/3649828

View Options

View options

Media

Figures

Other

Tables

View Table of Contents