Planning with Large Language Models for Code Generation

Zhang, Shun; Chen, Zhenfang; Shen, Yikang; Ding, Mingyu; Tenenbaum, Joshua B.; Gan, Chuang

Computer Science > Machine Learning

arXiv:2303.05510 (cs)

[Submitted on 9 Mar 2023]

Title:Planning with Large Language Models for Code Generation

Authors:Shun Zhang, Zhenfang Chen, Yikang Shen, Mingyu Ding, Joshua B. Tenenbaum, Chuang Gan

View PDF

Abstract:Existing large language model-based code generation pipelines typically use beam search or sampling algorithms during the decoding process. Although the programs they generate achieve high token-matching-based scores, they often fail to compile or generate incorrect outputs. The main reason is that conventional Transformer decoding algorithms may not be the best choice for code generation. In this work, we propose a novel Transformer decoding algorithm, Planning-Guided Transformer Decoding (PG-TD), that uses a planning algorithm to do lookahead search and guide the Transformer to generate better programs. Specifically, instead of simply optimizing the likelihood of the generated sequences, the Transformer makes use of a planner to generate candidate programs and test them on public test cases. The Transformer can therefore make more informed decisions and generate tokens that will eventually lead to higher-quality programs. We also design a mechanism that shares information between the Transformer and the planner to make our algorithm computationally efficient. We empirically evaluate our framework with several large language models as backbones on public coding challenge benchmarks, showing that 1) it can generate programs that consistently achieve higher performance compared with competing baseline methods; 2) it enables controllable code generation, such as concise codes and highly-commented codes by optimizing modified objective.

Comments:	ICLR 2023. Project page:this https URL
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Programming Languages (cs.PL)
Cite as:	arXiv:2303.05510 [cs.LG]
	(or arXiv:2303.05510v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2303.05510

Submission history

From: Chuang Gan [view email]
[v1] Thu, 9 Mar 2023 18:59:47 UTC (621 KB)

Computer Science > Machine Learning

Title:Planning with Large Language Models for Code Generation

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Machine Learning

Title:Planning with Large Language Models for Code Generation

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators