Evaluating large language models trained on code

…, H Edwards, Y Burda, N Joseph, G Brockman… - arXiv preprint arXiv …, 2021 - arxiv.org
… Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger,
Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke …

Few-shot semantic parsing with language models trained on code

R Shin, B Van Durme - arXiv preprint arXiv:2112.08696, 2021 - arxiv.org
Large language models can perform semantic parsing with little training data, when prompted
with in-context examples. It has been shown that this can be improved by formulating the …

BenchCLAMP: A benchmark for evaluating language models on semantic parsing

S Roy, S Thomson, T Chen, R Shin, A Pauls… - arXiv preprint arXiv …, 2022 - arxiv.org
We introduce BenchCLAMP, a Benchmark to evaluate Constrained LAnguage Model Parsing,
which produces semantic outputs based on the analysis of input text through constrained …

Automap: Towards ergonomic automated parallelism for ML models

M Schaarschmidt, D Grewe, D Vytiniotis… - arXiv preprint arXiv …, 2021 - arxiv.org
The rapid rise in demand for training large neural network architectures has brought into
focus the need for partitioning strategies, for example by using data, model, or pipeline …

CodeInsight: A Curated Dataset of Practical Coding Solutions from Stack Overflow

N Beau, B Crabbé - Findings of the Association for Computational …, 2024 - aclanthology.org
We introduce a novel dataset tailored for code generation, aimed at aiding developers in
common tasks. Our dataset provides examples that include a clarified intent, code snippets …

Solving quantitative reasoning problems with language models

…, A Andreassen, D Dohan… - Advances in …, 2022 - proceedings.neurips.cc
Language models have achieved remarkable performance on a wide range of tasks
that require natural language understanding. Nevertheless, state-of-the-art models have …

Skip-Layer Attention: Bridging Abstract and Detailed Dependencies in Transformers

Q Chen, W Wang, Q Zhang, S Zheng, S Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
The Transformer architecture has significantly advanced deep learning, particularly in natural
language processing, by effectively managing long-range dependencies. However, as the …

WizardCoder: Empowering code large language models with Evol-Instruct

…, X Geng, W Hu, C Tao, J Ma, Q Lin, D Jiang - arXiv preprint arXiv …, 2023 - arxiv.org
Code Large Language Models (Code LLMs), such as StarCoder, have demonstrated
exceptional performance in code-related tasks. However, most existing models are solely pre-…

Autoformalization for Neural Theorem Proving

Y Wu, A Jiang, W Li, MN Rabe, C Staats, M Jamnik… - aitp-conference.org
… [1] Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira
Pinto, Jared Kaplan, Harrison Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, …

Qwen2 technical report

…, J Bai, J He, J Lin, K Dang, K Lu, K Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
This report introduces the Qwen2 series, the latest addition to our large language models
and large multimodal models. We release a comprehensive suite of foundational and …