Context-Aware Parse Trees

Ye, Fangke; Zhou, Shengtian; Venkat, Anand; Marcus, Ryan; Petersen, Paul; Tithi, Jesmin Jahan; Mattson, Tim; Kraska, Tim; Dubey, Pradeep; Sarkar, Vivek; Gottschlich, Justin


arXiv:2003.11118 (cs)

[Submitted on 24 Mar 2020]



Abstract: The simplified parse tree (SPT) presented in Aroma, a state-of-the-art code recommendation system, is a tree-structured representation used to infer code semantics by capturing program structure rather than program syntax. This is a departure from the classical abstract syntax tree, which is principally driven by programming language syntax. While we believe a semantics-driven representation is desirable, the specifics of an SPT's construction can impact its performance. We analyze these nuances and present a new tree structure, heavily influenced by Aroma's SPT, called a context-aware parse tree (CAPT). CAPT enhances SPT by providing a richer level of semantic representation. Specifically, CAPT provides additional binding support for language-specific techniques for adding semantically-salient features, and language-agnostic techniques for removing syntactically-present but semantically-irrelevant features. Our research quantitatively demonstrates the value of our proposed semantically-salient features, enabling a specific CAPT configuration to be 39% more accurate than SPT across the 48,610 programs we analyzed.

Subjects: Programming Languages (cs.PL); Artificial Intelligence (cs.AI)
Cite as: arXiv:2003.11118 [cs.PL] (or arXiv:2003.11118v1 [cs.PL] for this version)
DOI: https://doi.org/10.48550/arXiv.2003.11118

Submission history

From: Justin Gottschlich
[v1] Tue, 24 Mar 2020 21:19:14 UTC (2,904 KB)
