Measuring The Impact Of Programming Language Distribution

Orlanski, Gabriel; Xiao, Kefan; Garcia, Xavier; Hui, Jeffrey; Howland, Joshua; Malmaud, Jonathan; Austin, Jacob; Singh, Rishabh; Catasta, Michele

arXiv:2302.01973 (cs)

[Submitted on 3 Feb 2023 (v1), last revised 24 May 2023 (this version, v3)]

Abstract: Current benchmarks for evaluating neural code models focus on only a small subset of programming languages, excluding many popular languages such as Go or Rust. To ameliorate this issue, we present the BabelCode framework for execution-based evaluation of any benchmark in any language. BabelCode enables new investigations into the qualitative performance of models' memory, runtime, and individual test case results. Additionally, we present a new code translation dataset called Translating Python Programming Puzzles (TP3), derived from the Python Programming Puzzles (Schuster et al. 2021) benchmark, that involves translating expert-level Python functions to any language. With both BabelCode and the TP3 benchmark, we investigate whether balancing the distributions of 14 languages in a training dataset improves a large language model's performance on low-resource languages. Training a model on a balanced corpus results in, on average, 12.34% higher $pass@k$ across all tasks and languages compared to the baseline. We find that this strategy achieves 66.48% better $pass@k$ on low-resource languages at the cost of only a 12.94% decrease on high-resource languages. In our three translation tasks, this strategy yields, on average, 30.77% better low-resource $pass@k$ while having 19.58% worse high-resource $pass@k$.
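The results above are reported in $pass@k$, but the abstract does not spell out how that metric is computed. As a minimal sketch, assuming the widely used unbiased estimator from Chen et al. (2021), where n samples are generated per problem and c of them pass every test case, the score for a single problem could be computed as below. The function name `pass_at_k` and its parameters are illustrative and are not taken from BabelCode.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of samples generated for the problem (assumed setup)
    c: number of those samples that pass all test cases
    k: number of attempts the metric allows
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k failing samples: every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the probability that a random size-k subset is all failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) pairs per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With 10 samples of which 3 pass, the estimate for k = 1 is 3/10. A benchmark-level score under this assumption is the mean of the per-problem estimates.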

Comments:	Accepted to ICML 2023, Code and data release: this https URL
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL); Programming Languages (cs.PL)
Cite as:	arXiv:2302.01973 [cs.LG]
	(or arXiv:2302.01973v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2302.01973

Submission history

From: Gabriel Orlanski
[v1] Fri, 3 Feb 2023 19:47:22 UTC (1,488 KB)
[v2] Wed, 15 Mar 2023 14:36:49 UTC (1,523 KB)
[v3] Wed, 24 May 2023 16:20:33 UTC (1,568 KB)