On ML-Based Program Translation: Perils and Promises
Abstract—With the advent of new and advanced programming languages, it becomes imperative to migrate legacy software to new programming languages. Unsupervised Machine Learning-based Program Translation could play an essential role in such migration, even without a sufficiently sizeable reliable corpus of parallel source code. However, these translators are far from perfect due to their statistical nature. This work investigates unsupervised program translators and where and why they fail. With in-depth error analysis of such failures, we have identified that the cases where such translators fail follow a few particular patterns. With this insight, we develop a rule-based program mutation engine, which pre-processes the input code if the input follows specific patterns and post-processes the output if the output follows certain patterns. We show that our code processing tool, in conjunction with the program translator, can form a hybrid program translator and significantly improve the state-of-the-art. In the future, we envision an end-to-end program translation tool where programming domain knowledge can be embedded into an ML-based translation pipeline using pre- and post-processing steps.

Index Terms—Code generation, code translation, program transformation

I. Introduction

In today's software development ecosystem, Programming Languages (PL) are evolving rapidly, either as new languages or as new features of existing languages. In the past few years, many languages such as Go, Rust, Swift, TypeScript, Python 3, etc. have become popular. It is often challenging to keep pace with such evolution—developers trained in one programming language find it hard to adapt to the new paradigm [1].

There exists a large body of legacy software written in old languages like COBOL, Fortran, etc. Maintaining them is challenging, as present-day developers would need a good understanding of these outdated languages [2]–[6]. Organizations have been investing heavily to migrate their legacy code to newer programming languages. For example, in 2012, the Commonwealth Bank of Australia spent 1 billion Australian dollars over the subsequent five years to migrate its core banking platform¹. The Swedish bank Nordea also started its migration in 2020. While such migrations to newer PLs eventually save money, the investment for the migration is potentially costly because the PLs adhere to completely different programming philosophies (e.g., object-oriented vs. functional).

¹ https://www.reuters.com/article/us-usa-banks-cobol/banks-scramble-to-fix-old-systems-as-it-cowboys-ride-into-sunset-idUSKBN17C0D8

To address these issues, researchers propose automated tools to convert programs written in one high-level language (e.g., Java) to another high-level language (e.g., Python), commonly known as Transpilers or Transcompilers [7], [8]. Traditionally, transpilers are rule-based translators [9]–[11]: a program written in the source language is represented as an abstract syntax tree, which is then translated into the target language by hand-written rules, a.k.a. templates. Such manual rule-driven translation is not scalable, especially in the presence of external libraries and APIs. Furthermore, when the two language structures are very different (e.g., the functional language Haskell and the procedural, object-oriented language Java), writing conversion rules may not always be possible. Finally, programs generated using such manual rules often lack readability.

To overcome these issues, researchers proposed Machine Learning (ML)-based transpilers, where ML models translate between two high-level programming languages by learning the statistical alignments between the two languages [12]–[15]. However, getting a meaningful, aligned language corpus is challenging [16], [17]. To this end, Roziere et al. [8] proposed an unsupervised learning-based approach, TransCoder, where alignments between PLs are learned through back-translation [18]. A program in the source language is first translated to the target language using a forward-direction translator. The generated target program is then translated back to the source language using a backward-direction translator. With joint optimization, these forward-backward translator pairs learn the alignments between the source and target languages in their respective directions without requiring an explicitly aligned corpus.
Fig. 1: Motivating examples (code snippets omitted). (a) Post-Processing: the TransCoder-generated code has an extra, incorrect x%10==0 condition; post-processing removes it. (b) Pre-Processing: TransCoder cannot translate a Python array parameter correctly; when pre-processing converts the arr variable to a list, TransCoder translates correctly.
It turns out that unsupervised learning can outperform all the previous approaches. However, since the TransCoder-based model is entirely driven by the statistical properties of the languages, it cannot guarantee the syntactic or semantic accuracy of the generated code. Figure 1 shows a motivating example. While the TransCoder model almost correctly translated the input code in Figure 1a, the translated Java method contains an additional conditional clause, x % 10 == 0. A knowledgeable developer can further mutate such almost-correctly translated code to obtain greater accuracy, especially if common patterns of mistakes the model makes can be identified.

Hypothesis 1. While "unsupervised" translators are not perfect, their results can be post-processed if we know the model's common patterns of mistakes (i.e., "blind spots").

In addition, since these models are trained in an ad hoc, unsupervised way, they do not explicitly learn the syntactic and semantic alignments across language components. For instance, the while loop is semantically equivalent in Java and Python. However, for loops in these two languages are semantically different—a Java for loop often contains an update expression for the loop control variable, whereas Python's for loop has only limited capacity for this. Thus, TransCoder often fails to translate a Java for loop to a Python one.

Hypothesis 2. Once we identify the model's inabilities, we can systematically mutate the input code to bypass the common error-producing patterns.

In this pilot study, we aim to understand the common pitfalls of TransCoder and how we can mitigate them. For this purpose, we chose a large open-source unsupervised program translation model, TransCoder, released by Facebook AI [8], which is trained on roughly 2.8 million GitHub repositories and has recently gained much attention. We then performed a rigorous manual study to find common areas where TransCoder fails to translate correctly. We categorize such failures into two distinct categories: (a) semantic errors and (b) syntactic errors. With further investigation into each of these categories, we observe that translations prone to semantic errors follow specific human-observable patterns and are amenable to easy post-processing, corroborating Hypothesis 1 (see Figure 1a as an example). In contrast, when the model makes a syntactically invalid translation, we observe that the inputs follow a few specific patterns and are fixable with input program transformation through pre-processing (Hypothesis 2). Figure 1b shows an example.

ML-based code translation models come with enormous promise. However, without syntactic or semantic guidance, we cannot exploit their full potential. As a proof-of-concept, we incorporate such guidance with a rule-based transformer that can pre-process and/or post-process the source code; these transformers can be coupled with TransCoder to build a hybrid program translator, a.k.a. transpiler. Our initial prototype can improve the vanilla ML-based TransCoder by 86% for Java to Python translation and 50% for Python to Java translation. This indicates that guiding the ML model with program-property-aware techniques has significant potential in program translation.

II. Study Design

TransCoder is a state-of-the-art and popular model that accomplishes programming language translation using unsupervised learning and is evaluated on an unlabelled GeeksforGeeks dataset. It is a gigantic transformer-based model trained on a public GitHub corpus of roughly 2.8 million open-source repositories. Yet, the reported accuracy is still suboptimal. TransCoder's performance is evaluated via a metric known as computational accuracy: the ability of a translated program to produce the same output as the source code when run. The computational accuracy of TransCoder's Python to Java translation is 68.7%, and 56.1% the other way around.
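Computational accuracy is straightforward to operationalize: run the reference and the translated program on the same test inputs and compare their outputs. Below is a minimal sketch; the command lists and test inputs are placeholders, and a real harness would also handle timeouts, crashes, and output normalization.

    import subprocess

    def computationally_accurate(reference_cmd, translated_cmd, test_inputs):
        """True iff the translated program prints the same output as the
        reference on every test input (minimal sketch; no timeout handling)."""
        for stdin_data in test_inputs:
            ref = subprocess.run(reference_cmd, input=stdin_data,
                                 capture_output=True, text=True)
            trans = subprocess.run(translated_cmd, input=stdin_data,
                                   capture_output=True, text=True)
            if ref.stdout != trans.stdout:
                return False
        return True

    # Hypothetical usage: compare a Python reference with its Java translation.
    # computationally_accurate(["python3", "f_gold.py"],
    #                          ["java", "F_GOLD"], ["3 5\n", "0 0\n"])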
To understand what kinds of errors TransCoder commonly makes, we dug deeper into the TransCoder-generated translations using 100 examples. Two of the authors went through the code examples and noted their findings, which were verified by another two authors. For each case, all of the authors reached a consensus about the type of potential error. To this end, we identified some common error patterns TransCoder makes. Leveraging these findings, we propose a hybrid technique combining machine learning and traditional rule-based solutions that can give an end-to-end solution to the code translation problem.

Dataset. Facebook AI's GitHub page [19] provides extensive testing data for the TransCoder model taken from the GeeksforGeeks dataset. The testing dataset comprises around 280 files each in Python, Java, and C++. Each file has a method, f_gold(), which is to be translated, along with a main method containing test cases. We randomly sampled 50 test cases each for Java to Python and Python to Java translation analysis. For each test case, we used TransCoder to translate each file, analyzed the result of each translation, and noted which errors recurred across multiple file translations along with potential solutions.

III. Preliminary Results

TABLE I: Common Error Patterns found in TransCoder

                             Java to Python    Python to Java
                                  (J2P)             (P2J)
    1. Additional Context          18%               38%
    2. Loop Conversion             12%                0%
    3. Type Sensitivity            38%                4%
    4. Extra Constraints            0%               50%
    5. Miscellaneous Errors        14%               16%
    (Mostly) Correct               22%               18%

Based on this study, we identify four main categories of errors, plus a miscellaneous category. Table I shows the distribution. Java to Python translation (J2P) has a slightly higher rate of success than Python to Java (P2J): 22% vs. 18%. This section discusses the common error patterns and potential ways to fix them using template-based pre-processing and post-processing approaches. The percentages are calculated as the share of the 50 J2P and 50 P2J test cases, respectively, that display the mentioned errors. Figure 2 illustrates the errors and plausible solutions; the errors are described in greater detail below, with illustrative sketches of the corresponding mutations after this section.

1. Additional Context. The goal of the model is to accurately translate one method, typically called the focal method. However, the focal method is often surrounded by a main method and test cases. We call these extra surroundings 'additional context'. TransCoder tends to get confused between arguments inside and outside the method and will sometimes translate the additional context as well, resulting in incorrect or unreadable code. 9 out of 50 Java to Python (J2P) and 19 out of 50 Python to Java (P2J) examples suffer from this problem.

Fixes. Once these focal methods are translated in isolation (without the additional context), TransCoder generates the correct output. Figure 2, Row 1 shows an example. While the focal method f_gold is called and the main method is still in the context, TransCoder could not generate any meaningful translation. However, when we remove the additional context, the translation accuracy significantly improves.

In the rest of the paper, we treat TransCoder as a function translator. The translation errors discussed henceforth are mainly errors that occurred when we translated the functions in isolation using TransCoder.

2. Loop Conversion. Vanilla TransCoder performs poorly while translating complex for loops, especially for Java to Python translation. As Java for loops generally allow more functionality than Python for loops (e.g., different increments of the loop variables, more variables, more conditions), the TransCoder model has difficulty translating complex for loops from Java to Python. Complex for loops appeared in 6 out of 50 samples; none of them produced correct outputs, and 4 out of the 6 produced garbage translations.

Fixes. We hypothesize that it is beneficial to convert the for loops to while loops before passing the input to TransCoder, as the while loop is syntactically equivalent in Python and Java. Thus, as a pre-processing step, we performed a semantics-preserving transformation to convert for to while. Such pre-processing significantly improved the translation of all 6 incorrect cases. Figure 2, second row shows an example.

3. Type Sensitivity. We find that TransCoder can be sensitive to certain types. For example, 19 out of 50 J2P examples contain an array as a parameter. TransCoder fails to translate all of these cases, as shown in the third row of Figure 2. For P2J as well (see Figure 1b), when the input focal method contains two or more parameters named arr, TransCoder fails to translate them. Note that, since Python is a dynamically typed language, we have to rely on the variable names to infer their types; the corresponding ground-truth Java code confirms that the intended type is indeed an array.

Fixes. We explore a pre-processing step where, without changing the code's semantics, we use equivalent types or classes. For instance, in the above case, we change all the array parameter references in the Java code to a List of the equivalent data type, as the Python translation of a Java array and of a Java List is identical. Note that we cannot use the exact same element type when converting an array to a List; instead, we must use the wrapper class data type (int to Integer, double to Double, etc.). Such type transformation in the pre-processing helped us improve TransCoder's performance across all 19 cases.
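The Additional Context mutation simply isolates f_gold before invoking the translator. For the Python side, this is a few lines with the standard ast module; a sketch of the pre-processing idea, not the exact prototype code:

    import ast

    def extract_focal_method(python_src: str, name: str = "f_gold") -> str:
        """Return only the focal method's source, dropping the surrounding
        main method and test harness that confuse the translator."""
        for node in ast.walk(ast.parse(python_src)):
            if isinstance(node, ast.FunctionDef) and node.name == name:
                return ast.unparse(node)  # requires Python 3.9+
        raise ValueError(f"no function named {name!r}")

An analogous extraction on the Java side can be done with any off-the-shelf Java parser.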
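The Loop Conversion mutation rewrites for (init; cond; update) { body } into init; while (cond) { body; update; }. The sketch below handles only the simplest shape with a regular expression and assumes the loop body ends at the snippet's final brace (no continue, no comma-separated header parts); treat it purely as an illustration of the rewrite, not as our prototype's implementation:

    import re

    FOR_HEADER = re.compile(
        r"for\s*\(\s*(?P<init>[^;]*);\s*(?P<cond>[^;]*);\s*(?P<update>[^)]*)\)\s*\{")

    def for_to_while(java_snippet: str) -> str:
        """Semantics-preserving for->while rewrite for one simple Java loop."""
        m = FOR_HEADER.search(java_snippet)
        if m is None:
            return java_snippet
        init, cond, update = (m.group(k).strip() for k in ("init", "cond", "update"))
        body_end = java_snippet.rindex("}")
        body = java_snippet[m.end():body_end]
        return (java_snippet[:m.start()] + init + ";\nwhile (" + cond + ") {"
                + body + update + ";\n}" + java_snippet[body_end + 1:])

    print(for_to_while("for (int i = 0; i < n; i += 2) { sum += i; }"))
    # -> int i = 0;
    #    while (i < n) { sum += i; i += 2;
    #    }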
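The Type Sensitivity mutation turns primitive array parameters into List parameters of the corresponding wrapper class. A regex-based sketch over the method header follows; array reads such as arr[i] must separately become arr.get(i), and the helper and parameter names here are illustrative:

    import re

    # Java Lists cannot hold primitives, so element types map to wrappers.
    WRAPPER = {"int": "Integer", "double": "Double", "float": "Float",
               "long": "Long", "char": "Character", "boolean": "Boolean"}

    PARAM = re.compile(
        r"(?P<prim>int|double|float|long|char|boolean)\s*"
        r"(?:\[\]\s*(?P<a>\w+)|(?P<b>\w+)\s*\[\])")

    def array_params_to_list(java_header: str) -> str:
        """Rewrite 'int[] arr' or 'int arr[]' parameters as 'List<Integer> arr'."""
        def repl(m):
            name = m.group("a") or m.group("b")
            return f"List<{WRAPPER[m.group('prim')]}> {name}"
        return PARAM.sub(repl, java_header)

    print(array_params_to_list("int f_gold(int arr[], int n)"))
    # -> int f_gold(List<Integer> arr, int n)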
Fig. 2: Original TransCoder translations vs. updated translations with pre-/post-processing (code snippets omitted). Row 1: Additional Context; Row 2: Loop Conversion; Row 3: Type Sensitivity; Row 4: Extra Constraints.
4. Generating Extra Constraints. The most prominent issue for Python to Java translation is generating extraneous logical operators in if, else if, and while statements. Out of 50 examples, 25 had such issues. Although such additional logical operators are syntactically valid, they can potentially change the code's semantics. The last row of Figure 2 is an example.

Fixes. As a post-processing step, we discard all the logical constraints that do not appear in the source version. This is justified by the observation that although the model appends logical constraints, it never modifies the original conditions. A sketch of this step follows this section.

Overall Results. The performance of each mutation is measured by its rate of success. We classify "success" in two cases:

1. If a translated program does not compile, a success is when the translation of the program after applying mutations compiles.

2. If a translated program does compile, but with errors, a success is when the translation of the program after applying mutations runs more similarly to the original program. More specifically, if the translated code can be more easily interpreted to have the same functionality as the source code, we classify the mutation as a success.

To evaluate the effectiveness of each mutation, we first determined which sampled test cases each mutation was applicable to. After translating both the original source code and the mutated source code, we classified the mutation as a success or a failure for each test case. If multiple mutations were applicable to a test case, we applied all possible combinations of them. The rate of success of a specific mutation, or rule, is computed as the number of successes divided by the number of cases it was applicable to. Each of our identified mutations has a 100% success rate, though there are errors for which we have not yet discovered a viable mutation (Miscellaneous Errors in Table I).
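As an illustration of the Extra Constraints post-processing, the sketch below drops any &&-joined clause in a translated Java condition that has no textual counterpart in the Python source. It is deliberately simplified (top-level && only, whitespace-insensitive textual matching); our actual rule set may differ.

    import re

    COND = re.compile(r"(if|while)\s*\((?P<cond>.*)\)")

    def strip_extra_constraints(java_line: str, python_src: str) -> str:
        """Drop '&&'-appended clauses that never appear in the Python source.
        A simplified sketch assuming top-level '&&' only."""
        m = COND.search(java_line)
        if m is None:
            return java_line
        clauses = [c.strip(" ()") for c in m.group("cond").split("&&")]
        source_text = re.sub(r"\s+", "", python_src)
        kept = [c for c in clauses if re.sub(r"\s+", "", c) in source_text]
        kept = kept or clauses[:1]  # never empty the condition entirely
        return (java_line[:m.start("cond")]
                + " && ".join(f"({c})" for c in kept)
                + java_line[m.end("cond"):])

    python_src = "while x != 0:\n    ..."
    print(strip_extra_constraints("while((x!=0) && (x%10==0)){", python_src))
    # -> while((x!=0)){   (reproduces the fix in Figure 1a)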
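Putting the pieces together, the hybrid transpiler is just the ML translator wrapped by direction-specific mutations. A sketch composing the helpers sketched above; transcoder_translate is a hypothetical stand-in for the model, not a real API:

    def transcoder_translate(code: str, direction: str) -> str:
        """Hypothetical stand-in for the ML translator."""
        raise NotImplementedError

    def hybrid_java_to_python(java_fn: str) -> str:
        # J2P pre-processing mutations (Loop Conversion, Type Sensitivity).
        java_fn = for_to_while(java_fn)
        java_fn = array_params_to_list(java_fn)
        return transcoder_translate(java_fn, direction="java->python")

    def hybrid_python_to_java(python_fn: str) -> str:
        # P2J post-processing mutation (Extra Constraints).
        java = transcoder_translate(python_fn, direction="python->java")
        return "\n".join(strip_extra_constraints(line, python_fn)
                         for line in java.splitlines())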
IV. Related Works

Multiple previous studies have investigated the possibility of programming language translation through machine learning. However, almost all such studies rely on supervised learning [17], [20]–[24]. This approach, though accurate, is unrealistic, as it is difficult to accumulate large datasets of labeled, correctly translated programs [17].

While it is difficult to come across labeled datasets, some researchers have found it effective to train their models with a technique called back-translation [8], [16]. Being unsupervised, the capabilities of these models are not limited by the quantity of annotated parallel data, making them state-of-the-art for program translation. In this work, we case-study one such model, TransCoder [8].

Other research has delved into the application of SMT (statistical machine translation) models [12], [14], [25] to the translation of programming languages. These studies have reached conclusions similar to this project's: a majority of test cases have errors but need only small fixes to produce correct translations. These models can also be improved in a more program-analysis-oriented approach, as our techniques demonstrate as well [14].

Researchers have also proposed translation models for in-language code transformation for syntactic repair [26], [27], semantic program repair [28]–[30], refactoring [31], [32], etc. More recently, researchers have been proposing general-purpose code transformation models "pre-trained" on developer-written code transformations collected from GitHub [33], or rule-based transformations [22]. In the future, we aim to investigate both the syntactic and semantic repair models as our pre-processing and post-processing components.

V. Conclusion & Future Work

Paper Summary. In this paper, we discuss the pitfalls of unsupervised program translators and present the potential of program-property-aware rules that can guide the ML-based translation as pre-/post-processing steps. We developed a proof-of-concept in-language program transformer for pre-processing the input and post-processing the output of TransCoder. We show that a simple rule-based in-language program transformer can significantly improve program translation performance. Our preliminary results, along with detailed instructions to replicate each mutation, are publicly available at https://github.com/kzh23/Replication-Package-ICSE-NIER-2023-Unsupervised-ML. While the ML-based translator relies on statistical knowledge embedded in "big data", we propose to embed programming domain knowledge into the translation pipeline.

Future Work. This paper serves as an initial attempt toward combining ML-based program translation and program-analysis-based program mutation. We aim to build more sophisticated and automated techniques for program transformation in the future. As evidenced by our initial results, guiding ML-based tools with program-property-aware rules has immense potential in program translation. In the future, we will investigate how to smartly incorporate such guidance in ML pipelines. For instance, currently, the vanilla TransCoder can only translate methods in isolation. Such limitations will hinder the adoption of the proposed techniques in real life, where an entire project written in a legacy language needs to be translated. We will further study the applicability of the proposed technique in low-resource languages, where we will not get enough sample data for training the ML model; in such cases, the rule-based approach may need to provide more guidance.

To this end, we envision building a scalable, modular, end-to-end system combining pre-processing, translation, and post-processing steps. We also intend to investigate the usage of code editing models [22], [33] as pre-processing and program repair tools [29], [34], [35] as post-processing steps for better generalization.

Acknowledgement

This work is supported in part by NSF grants SHF-2107405, SHF-1845893, IIS-2040961, IBM, and VMWare. Any opinions, findings, conclusions, or recommendations expressed herein are those of the authors and do not necessarily reflect those of the US Government, NSF, IBM, or VMWare.

References

[1] L. A. Meyerovich and A. S. Rabkin, "Empirical analysis of programming language adoption," in Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications, 2013, pp. 1–18.
[2] R. J. Kizior, D. Carr, and P. Halpern, "Does COBOL have a future?" in Proc. Information Systems Education Conf., vol. 17, no. 126, 2000.
[3] N. Stern, COBOL for the 21st Century. John Wiley & Sons, Inc., 2007.
[4] H. M. Sneed, "Migrating from COBOL to Java," in 2010 IEEE International Conference on Software Maintenance. IEEE, 2010, pp. 1–7.
[5] J. Pu, Z. Zhang, J. Kang, Y. Xu, and H. Yang, "Using aspect orientation in understanding legacy COBOL code," in 31st Annual International Computer Software and Applications Conference (COMPSAC 2007), vol. 2. IEEE, 2007, pp. 385–390.
[6] N. Wilde, M. Buckellew, H. Page, and V. Rajlich, "A case study of feature location in unstructured legacy Fortran code," in Proceedings Fifth European Conference on Software Maintenance and Reengineering. IEEE, 2001, pp. 68–76.
[7] R. Kulkarni, A. Chavan, and A. Hardikar, "Transpiler and its advantages," International Journal of Computer Science and Information Technologies, vol. 6, no. 2, pp. 1629–1631, 2015.
[8] B. Roziere, M.-A. Lachaux, L. Chanussot, and G. Lample, "Unsupervised translation of programming languages," Advances in Neural Information Processing Systems, vol. 33, pp. 20601–20611, 2020.
[9] "Babel is a JavaScript compiler," https://babeljs.io/, accessed: 2010-10-12.
[10] K. Kimura, A. Sekiguchi, S. Choudhary, and T. Uehara, "A JavaScript transpiler for escaping from complicated usage of cloud services and APIs," in 2018 25th Asia-Pacific Software Engineering Conference (APSEC). IEEE, 2018, pp. 69–78.
[11] "2to3 — automated Python 2 to 3 code translation," https://docs.python.org/3/library/2to3.html, accessed: 2010-10-12.
[12] K. Aggarwal, M. Salameh, and A. Hindle, "Using machine translation for converting Python 2 to Python 3 code," PeerJ PrePrints, Tech. Rep., 2015.
[13] G. Lample, M. Ott, A. Conneau, L. Denoyer, and M. Ranzato, "Phrase-based & neural unsupervised machine translation," arXiv preprint arXiv:1804.07755, 2018.
[14] A. T. Nguyen, T. T. Nguyen, and T. N. Nguyen, "Lexical statistical machine translation for language migration," in Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, 2013, pp. 651–654.
[15] Y. Oda, H. Fudaba, G. Neubig, H. Hata, S. Sakti, T. Toda, and S. Nakamura, "Learning to generate pseudo-code from source code using statistical machine translation," in 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2015, pp. 574–584.
[16] W. U. Ahmad, S. Chakraborty, B. Ray, and K.-W. Chang, "Summarize and generate to back-translate: Unsupervised translation of programming languages," arXiv preprint arXiv:2205.11116, 2022.
[17] X. Chen, C. Liu, and D. Song, "Tree-to-tree neural networks for program translation," Advances in Neural Information Processing Systems, vol. 31, 2018.
[18] S. Edunov, M. Ott, M. Auli, and D. Grangier, "Understanding back-translation at scale," arXiv preprint arXiv:1808.09381, 2018.
[19] Facebookresearch, "Facebookresearch/TransCoder: Public release of the TransCoder research project, https://arxiv.org/pdf/2006.03511.pdf." [Online]. Available: https://github.com/facebookresearch/TransCoder
[20] W. Ahmad, S. Chakraborty, B. Ray, and K.-W. Chang, "Unified pre-training for program understanding and generation," in Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Online: Association for Computational Linguistics, Jun. 2021, pp. 2655–2668. [Online]. Available: https://www.aclweb.org/anthology/2021.naacl-main.211
[21] S. Lu, D. Guo, S. Ren, J. Huang, A. Svyatkovskiy, A. Blanco, C. B. Clement, D. Drain, D. Jiang, D. Tang, G. Li, L. Zhou, L. Shou, L. Zhou, M. Tufano, M. Gong, M. Zhou, N. Duan, N. Sundaresan, S. K. Deng, S. Fu, and S. Liu, "CodeXGLUE: A machine learning benchmark dataset for code understanding and generation," CoRR, vol. abs/2102.04664, 2021.
[22] S. Chakraborty, T. Ahmed, Y. Ding, P. Devanbu, and B. Ray, "NatGen: Generative pre-training by 'naturalizing' source code," in 2022 ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). ACM, 2022.
[23] Z. Feng, D. Guo, D. Tang, N. Duan, X. Feng, M. Gong, L. Shou, B. Qin, T. Liu, D. Jiang, and M. Zhou, "CodeBERT: A pre-trained model for programming and natural languages," in Findings of the Association for Computational Linguistics: EMNLP 2020. Online: Association for Computational Linguistics, Nov. 2020, pp. 1536–1547.
[24] D. Guo, S. Ren, S. Lu, Z. Feng, D. Tang, S. Liu, L. Zhou, N. Duan, A. Svyatkovskiy, S. Fu, M. Tufano, S. K. Deng, C. Clement, D. Drain, N. Sundaresan, J. Yin, D. Jiang, and M. Zhou, "GraphCodeBERT: Pre-training code representations with data flow," in International Conference on Learning Representations, 2021.
[25] S. Karaivanov, V. Raychev, and M. Vechev, "Phrase-based statistical translation of programming languages," in Proceedings of the 2014 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming & Software, 2014, pp. 173–184.
[26] T. Ahmed, N. R. Ledesma, and P. Devanbu, "SynFix: Automatically fixing syntax errors using compiler diagnostics," arXiv preprint arXiv:2104.14671, 2021.
[27] ——, "SynShine: Improved fixing of syntax errors," IEEE Transactions on Software Engineering, 2022.
[28] S. Chakraborty, Y. Ding, M. Allamanis, and B. Ray, "CODIT: Code editing with tree-based neural models," IEEE Transactions on Software Engineering, pp. 1–1, 2020.
[29] Z. Chen, S. Kommrusch, M. Tufano, L.-N. Pouchet, D. Poshyvanyk, and M. Monperrus, "SequenceR: Sequence-to-sequence learning for end-to-end program repair," IEEE Transactions on Software Engineering, vol. 47, no. 9, pp. 1943–1959, 2019. [Online]. Available: https://www.cs.wm.edu/~denys/pubs/seq2seq4repair_TSE_cameraready.pdf
[30] M. Tufano, C. Watson, G. Bavota, M. D. Penta, M. White, and D. Poshyvanyk, "An empirical study on learning bug-fixing patches in the wild via neural machine translation," ACM Transactions on Software Engineering and Methodology (TOSEM), vol. 28, no. 4, pp. 1–29, 2019.
[31] M. Aniche, E. Maziero, R. Durelli, and V. Durelli, "The effectiveness of supervised machine learning algorithms in predicting software refactoring," IEEE Transactions on Software Engineering, 2020.
[32] A. M. Sheneamer, "An automatic advisor for refactoring software clones based on machine learning," IEEE Access, vol. 8, pp. 124978–124988, 2020.
[33] J. Zhang, S. Panthaplackel, P. Nie, J. J. Li, and M. Gligoric, "CoditT5: Pretraining for source code and natural language editing," arXiv preprint arXiv:2208.05446, 2022.
[34] H. Ye, M. Martinez, and M. Monperrus, "Neural program repair with execution-based backpropagation," in Proceedings of the 44th International Conference on Software Engineering, 2022, pp. 1506–1518.
[35] M. Yasunaga and P. Liang, "Break-it-fix-it: Unsupervised learning for program repair," in International Conference on Machine Learning. PMLR, 2021, pp. 11941–11952. [Online]. Available: https://arxiv.org/pdf/2106.06600.pdf