Foundations for a Rust-Like Borrow Checker for C

Tiago Silva, University of Porto, Porto, Portugal
João Bispo, University of Porto / INESC TEC, Porto, Portugal
Tiago Carvalho, Polytechnic Institute of Porto, School of Engineering / INESC TEC, Porto, Portugal

Abstract
Memory safety issues in C are the origin of various vulnerabilities that can compromise a
program's correctness or safety from attacks. We propose a different approach to tackle memory
safety, the replication of Rust's Mid-level Intermediate Representation (MIR) Borrow Checker,
through the usage of static analysis and successive source-to-source code transformations, to be
composed upstream of the compiler, thus ensuring maximal compatibility with most build systems.
This allows us to approximate a subset of C to Rust's core concepts, applying the memory safety
guarantees of the rustc compiler to C. In this work, we present a survey of Rust's efforts towards
ensuring memory safety, and describe the theoretical basis for a C borrow checker, alongside a
proof-of-concept that was developed to demonstrate its potential. This prototype correctly
identified violations of the ownership and aliasing rules, and accurately reported each error with
a level of detail comparable to that of the rustc compiler.

CCS Concepts: • Software and its engineering → Software safety; Preprocessors; Allocation /
deallocation strategies.

Keywords: C, Rust, Source-to-Source, Memory Safety, Static analysis, Borrow checker, Lifetimes,
Ownership, Transpiler, Code transformations

ACM Reference Format:
Tiago Silva, João Bispo, and Tiago Carvalho. 2024. Foundations for a Rust-Like Borrow Checker
for C. In Proceedings of the 25th ACM SIGPLAN/SIGBED International Conference on Languages,
Compilers, and Tools for Embedded Systems (LCTES '24), June 24, 2024, Copenhagen, Denmark. ACM,
New York, NY, USA, 11 pages. https://doi.org/10.1145/3652032.3657579

This work is licensed under a Creative Commons Attribution 4.0 International License.
LCTES '24, June 24, 2024, Copenhagen, Denmark
© 2024 Copyright held by the owner/author(s).
ACM ISBN 979-8-4007-0616-5/24/06
https://doi.org/10.1145/3652032.3657579

1 Introduction
Performance and flexibility regarding the representation and manipulation of memory are some of
the most prized aspects of the C programming language. The latter, however, is a double-edged
sword, as it can be a source of various bugs [26] that result in memory safety vulnerabilities
that are almost exclusively found in C/C++ [31]. Previous research into this topic has proposed
techniques such as the application of static analysers, the insertion of runtime memory checks,
or the introduction of special annotations leading to new dialects such as Cyclone [33].
Multiple approaches have been proposed over the years to detect and correct these safety issues.
One of these solutions was the creation of a type-safe language whose compiler guaranteed memory
and concurrency safety, which ultimately became the basis for the Rust programming language [14].
Although new projects could adopt Rust to become more secure, the C programming language remains
the primary choice for safety-critical projects restricted by formally verified compilers,
programs targeting embedded systems with poor Rust compilation toolchains, or even legacy
codebases. Automatically converting C codebases to Rust remains arduous, and is the target of
various proposals, such as Ling et al. [17].
In this work, we present an approach to replicate Rust's borrow checker in C, allowing for memory
safety to be enforced at compile-time, without the need for heavy runtime overhead. Our
source-to-source approach, based on the Clava [3] compiler, resulted in a compiler-agnostic
solution, allowing it to be easily integrated into existing build systems.
This proposal laid the required groundwork for a fully-fledged C borrow checker, based on a survey
of Rust's design, functionalities and internals. We highlight the most relevant algorithms that
safeguard against dangerous pointer aliasing, and present an early proof-of-concept prototype to
demonstrate the viability of a C borrow checker.
In the background section, we present a brief overview of Rust's core principles and foundations,
from single ownership and aliasing, to a brief introduction of the responsibilities of its borrow
checker. In the following section, we catalogue the work performed on memory safety, and identify
how our work distinguishes itself from previous research. Afterwards, we present the theoretical
foundations for a C borrow checker, with a focus on automatic memory


management through the use of annotations and #pragma This can be extended to move values across functions,
directives, alongside its possible limitations. In the imple- where the ownership of a value can be transferred into the
mentation section, we describe the internals of our proof-of- function, in which it may be modified, and then possibly
concept, as well as present the results obtained through its returned back to the caller. As most large data types are
testing on various synthetic programs. Finally, we present stored in the heap, this operation is extremely efficient, as it
our conclusions and delve into the future work required to only requires the transfer of the stack-allocated data, such
both polish and expand the proposed C borrow checker. as a pointer and some bounds information.
In order to simplify the syntax and improve the semantics
2 Background of Rust, the mechanism for borrowing was introduced to
transfer temporary ownership of variables, a mechanism
2.1 Ownership and Borrowing in Rust
comparable to C++ references. Unlike traditional pointers,
Although Rust has no official pointer aliasing model, there however, references must ensure to "point to a valid value
have been proposals to formalize it by Jung [13] [12]. Rust of a particular type for the life of that reference" [14], that is,
they must always point to a valid object or function.
relies on a strict usage of pointer aliasing, in particular regarding mutable references, expressed through the concepts
of ownership and borrowing. According to the authors of 2.2 The Rust Borrow Checker
Rust [14], the three core rules of data ownership are: The borrow checker is the core system that ensures compile-
• Each value in Rust has an owner time safety guarantees in Rust, and we can consider that
• There can only be one owner at a time it has seen, up until now, three main iterations: Abstract-
• When the owner goes out of scope, its value is dropped Syntax Tree (AST) Borrow Checker, Non-Lexical Lifetimes
(NLL) [21] (later renamed to MIR Borrow Checker), and
The goal behind these rules is to ensure that data is not Polonius [30]. Although the third and newer system was
invalidated due to freeing a resource too soon, and that no announced in 2019 [22], in this work we have opted to instead
Undefined Behaviour (UB) occurs due to freeing the same focus on the previous system, the Non-Lexical Lifetimes. This
memory more than once. system has been fully stable since 2022 [23], which means that it
Rust addresses these with its ownership model, as memory has been extensively tested by the community, and is much
should be automatically returned once the variable that owns better documented.
it goes out of scope. To achieve this, Rust uses the Drop trait, The borrow checker introduces the concept of lifetimes
to indicate that a given data type has resources that must be to guarantee the single ownership property. In other words,
cleaned up upon exiting its scope. This trait is used to clear to ensure that values are not mutated or moved while they
heap memory or close network connections, among others. are borrowed elsewhere. A lifetime is created anytime a loan
Foremost, this approach automatically avoids memory leaks, occurs, and it corresponds to the (possibly non-continuous)
by ensuring that any memory or resource is always freed. span of code - or region - in which a reference may be used.
In addition, thanks to the memory only being freed after its However, this term can be ambiguous, as it can also refer to
owner goes out of scope, and thus, no longer accessible, data the lifetime of a variable - or scope - which is the span of code
will never be invalid. As for freeing the same resource twice, before that variable is freed. Nonetheless, both definitions are
Rust’s ownership model ensures that each resource has a intertwined, as the scope of a variable must always outlive
single owner, therefore making it impossible to free the same all of its references. Otherwise, references pointing to freed
resource more than once. These mechanisms allow Rust to memory could lead to use-after-free bugs.
detect violations of temporal safety at compile time. To summarize the NLL algorithm as described in its respec-
When operating with ownership, an assignment such as tive Request For Comments (RFC) [21], it can be subdivided
let foo = bar may have two meanings, be it either as into the following steps:
a copy or a move operation. The compiler uses its strong
typing system to distinguish between them. It copies basic 1. Create a region variable (containing a set of points, or
data types such as integers, or structs that implement the code statements) to represent each lifetime involved
Copy trait, that is, data that can be safely copied with no side- 2. Build constraints from liveness analysis, subtyping and
effects. Likewise, immutable references can also be safely variance rules, and reborrows
copied, as they will never be able to mutate the original data. 3. Propagate the lifetimes through an inference algorithm
These copies guarantee that the original value remains valid. 4. Detect and report any violations of the borrowing rules
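
The four NLL steps listed above can be sketched in C as a small fixed-point computation. This is only an illustrative simplification under stated assumptions, not the rustc implementation: regions are bitsets over at most 32 CFG points, and the names `PointSet`, `Cfg`, `Outlives`, `infer_regions` and `loan_conflicts` are hypothetical. An outlives constraint `(longer: shorter) @ point` forces `longer` to contain every point of `shorter` reachable from `point` without leaving `shorter`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_POINTS 32            /* assumption: one bit per CFG point */

typedef uint32_t PointSet;       /* bit p set => point p is in the region */

typedef struct {
    int num_points;
    PointSet succ[MAX_POINTS];   /* succ[p]: successors of point p */
} Cfg;

/* Outlives constraint (longer: shorter) @ point. */
typedef struct {
    int longer;
    int shorter;
    int point;
} Outlives;

static PointSet bit(int p) {
    assert(p >= 0 && p < MAX_POINTS);
    return (PointSet)1u << p;
}

/* Points of `region` reachable from `start` without leaving `region`. */
static PointSet reachable_within(const Cfg *cfg, PointSet region, int start) {
    PointSet seen = region & bit(start);
    bool changed = seen != 0;
    while (changed) {
        changed = false;
        for (int p = 0; p < cfg->num_points; p++) {
            if (seen & bit(p)) {
                PointSet next = cfg->succ[p] & region & ~seen;
                if (next != 0) {
                    seen |= next;
                    changed = true;
                }
            }
        }
    }
    return seen;
}

/* Step 3: grow regions until every outlives constraint is satisfied. */
static void infer_regions(const Cfg *cfg, PointSet *regions,
                          const Outlives *cs, int num_cs) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 0; i < num_cs; i++) {
            PointSet need =
                reachable_within(cfg, regions[cs[i].shorter], cs[i].point);
            if (need & ~regions[cs[i].longer]) {
                regions[cs[i].longer] |= need;
                changed = true;
            }
        }
    }
}

/* Step 4 (simplified): a loan is violated if the borrowed variable is
 * written or moved at any point covered by the loan's region. */
static bool loan_conflicts(PointSet loan_region, PointSet writes) {
    return (loan_region & writes) != 0;
}

/* Straight-line CFG 0 -> 1 -> 2 -> 3 -> 4. Region 0 is the loan created by
 * a borrow at point 1; region 1 is the reference's liveness (points 2, 3). */
static PointSet example_loan_region(void) {
    Cfg cfg = { .num_points = 5 };
    for (int p = 0; p + 1 < cfg.num_points; p++) {
        cfg.succ[p] = bit(p + 1);
    }
    PointSet regions[2] = { 0, bit(2) | bit(3) };                   /* steps 1-2 */
    Outlives cs[] = { { .longer = 0, .shorter = 1, .point = 2 } };
    infer_regions(&cfg, regions, cs, 1);                            /* step 3 */
    return regions[0];
}
```

In this example the loan's region grows to cover points 2 and 3, so a write to the borrowed variable at point 3 would be reported, while a write at point 4 (after the last use of the reference) would be accepted.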
On the other hand, a move operation takes the ownership This system was only possible thanks to the prior in-
from one variable, and assigns it to the other. After this troduction of the Mid-level Intermediate Representation in
change of ownership, the original variable is no longer valid, and its value becomes inaccessible. This ensures that one
and only one owner of the data exists at any given time.
2015 [19], a new IR that rests between Rust's AST and the generated LLVM IR. In short, it represents each function
individually as its own Control-Flow Graph (CFG). This


granted two advantages to the NLL proposal, first to only sup- as checked regions, the compiler also imposes stronger restric-
port a much more restricted and normalized instruction set tions on the usage of unchecked points that could corrupt
than surface-level Rust, and second, to better synergize with the checked ones through aliasing.
the nature of the algorithms required for liveness analysis,
constraint propagation, and more robust drops. In regards to 3.2 Compiler Implementations
the latter, a separate algorithm named drop elaboration [15]
Instead of exploring the avenues of new languages or lan-
is responsible for the introduction and management of any
guage extensions, safe compiler implementations focus on
calls required to free allocated resources.
methods to improve the spatial security of existing C code.
Many works in this category change the representation of
3 Related Work pointers, effectively creating fat pointers, or "marking" re-
gions of allocated memory.
The research addressing the memory safety of C is very ex- SoftBound [25] consists of compile time transformations
tensive, with papers from the 1980s still being quite relevant for enforcing spatial safety in C, inspired by a previous
today. In Checked C [6], these techniques are categorized hardware-assisted approach, HardBound [5]. SoftBound in-
in four groups: "C(-like) dialects, compiler implementations,
troduces metadata to track the base and bounds of pointers,
static analysis, and security mitigations".
without requiring any changes to C source code, but achieves
this strictly at a software level, and performs the metadata
3.1 Dialects and Extensions manipulation whenever storing or loading pointer values.
Furthermore, a formal proof showed that SoftBound is able
One of the solutions to the lack of memory safety in C is to use a different language, or to introduce changes to the
to ensure spatial safety for any program, even in the pres-
ence of arbitrary casts. These guarantees come at an average
language itself. DeLine et al. introduce Vault [4], a program-
cost of 67% runtime overhead, but SoftBound offers a store-
ming language with statement and expression syntax based
only checking mode that fully passed their test suite while
on C, with an additional system that allows for types to be
diminishing the overhead to 22%.
accompanied by a type guard, which consists of zero or more
Write Integrity Testing (WIT) [1] is a technique to provide
atomic predicates, each of which is a simple check for the
protection against attacks attempting to exploit memory
existence of any given key in an abstracted global state that
errors. It uses a points-to analysis at compile time to compute
Vault keeps track of. These ideas are not dissimilar to Rust’s
the CFG and the set of objects that can be written by each
ownership model, both of which are performed exclusively
instruction in the program. WIT then generates code to
at compile time.
prevent instructions from modifying any object not present
Cyclone [33] is an extension of C that uses the same preprocessor, and largely follows most of C's lexical conventions
in the set of writable objects, computed through a static
analysis, and also to ensure that any indirect control transfers
and grammar. The major changes include the addition of
are allowed by the CFG. These techniques could be applied
static analysis, the insertion of runtime checks for operations
to C or C++ programs without requiring any source code
whose safety cannot be determined during the compilation,
modifications, with an average runtime overhead of 7%.
and custom annotations to supply hints to the static analy-
In Purify [10], Hastings et al. propose the use of a bit
sis. An example of these annotations is the "never-NULL"
table that holds a two-bit state code for each byte in the
pointers, indicated with @ instead of ∗. In general, Cyclone
heap, stack, data, and bss. The possible states are ’unallo-
aims to make program security mandatory, instead of offer-
cated’ (unwritable and unreadable), ’allocated but uninitial-
ing safe, but optional, checks which were often ignored by
ized’ (writable but unreadable), and ’allocated and initialized’
programmers. It accounts for buffer overflows, uninitialized
(writable and readable). In other words, a byte-level tagged
pointers, and more. More relevant to this work is their region
architecture, where tags represent the memory state. The bit
analysis [9], which allows Cyclone to perform an intrapro-
table maintained by Purify results in a 25% memory overhead
cedural region analysis to accurately reject programs that
during development.
return dangling pointers, in a way that is not too dissimilar
to what is done in Rust.
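
To make Purify's two-bit state table more concrete, the following is a minimal sketch of such a table in C. It is not Purify's actual implementation: the tracked memory is a hypothetical 64-byte range addressed by offset, and the names `state_table`, `on_alloc`, `on_write` and `on_free` are illustrative. Storing two bits of state per tracked byte is what yields the 25% memory overhead (2 bits per 8 bits).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRACKED_BYTES 64   /* assumption: size of the tracked memory range */

typedef enum {
    UNALLOCATED = 0,       /* unwritable and unreadable */
    ALLOCATED_UNINIT = 1,  /* writable but unreadable */
    ALLOCATED_INIT = 2     /* writable and readable */
} ByteState;

/* Four two-bit entries per table byte; zero-initialized, so every tracked
 * byte starts as UNALLOCATED. */
static uint8_t state_table[TRACKED_BYTES / 4];

static ByteState get_state(size_t addr) {
    assert(addr < TRACKED_BYTES);
    unsigned shift = (unsigned)(addr % 4) * 2;
    return (ByteState)((state_table[addr / 4] >> shift) & 0x3u);
}

static void set_state(size_t addr, ByteState s) {
    assert(addr < TRACKED_BYTES);
    unsigned shift = (unsigned)(addr % 4) * 2;
    unsigned cleared = state_table[addr / 4] & ~(0x3u << shift);
    state_table[addr / 4] = (uint8_t)(cleared | ((unsigned)s << shift));
}

static int can_write(size_t addr) { return get_state(addr) != UNALLOCATED; }
static int can_read(size_t addr)  { return get_state(addr) == ALLOCATED_INIT; }

/* Allocation marks bytes as writable but not yet readable. */
static void on_alloc(size_t addr, size_t len) {
    for (size_t i = 0; i < len; i++) set_state(addr + i, ALLOCATED_UNINIT);
}

/* A checked write initializes the byte; returns 0 on an invalid write. */
static int on_write(size_t addr) {
    if (!can_write(addr)) return 0;
    set_state(addr, ALLOCATED_INIT);
    return 1;
}

/* Freeing makes the bytes inaccessible again. */
static void on_free(size_t addr, size_t len) {
    for (size_t i = 0; i < len; i++) set_state(addr + i, UNALLOCATED);
}
```

Reading a byte that was allocated but never written is rejected by `can_read`, which corresponds to the 'allocated but uninitialized' state described above.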
CheckedC [6] extends C with two checked pointer types, 3.3 Static Analysis
_Ptr<T> and _Array_ptr<T>, in order to protect against Static program analysis aims to avoid the run time overhead
data modification and disclosure attacks. The first indicates of the various safe C compiler implementations mentioned
that a pointer can only be dereferenced, and no arithmetic will be performed on it, while the second supports pointer
arithmetic with additional bounds declarations. These checks
above. These tools analyse source code or binaries to find vulnerabilities and bugs, such as out-of-bounds array accesses
or incorrect pointer arithmetic.
are deferred to runtime, exchanging runtime overhead for Nevertheless, such systems face a different set of chal-
coding flexibility. Within special blocks or functions marked lenges, such as scalability and false positive management. As


codebases grow in size, static analysis needs to balance per- also required, but we propose that this can be achieved ex-
formance and precision. A precise analyser may not be able clusively within the syntax already available in C.
to scan large programs [6]. In commercial environments this We focus exclusively on static analysis, so we do not de-
is especially critical, as many companies run static analysers scribe the addition of runtime safety checks to the gener-
as part of a nightly pipeline, and if the analyser is unable ated code, such as bounds or overflow checks, as they could
to produce meaningful results within that time frame, its theoretically be achieved by layering previous work in this
effectiveness is questioned. domain on top of the code output by the C borrow checker.
Coverity [2] explored the management of false positives, We have decided to target the C99 standard [11] for the
which can be more problematic than it initially appears, as output code, in order to take advantage of the restrict
people tend to disregard issues found as false positives, thus qualifier, first introduced by it, as well as to ensure a high
allowing true bugs to slip by. As such, a cycle of low trust degree of retro-compatibility with other compilers and build
starts building, causing complex bugs to be labelled as false toolchains.
positives, which feeds back into this cycle. Coverity faced
this problem first hand, through their years of experience 4.1 Normalization
making their static analyser commercially viable. In their Although operating over a source-to-source compiler already
words, "when forced to choose between more bugs or fewer false positives, we typically choose the latter". Their aim is
to always maintain a false positive rate below 20% in their
provides an intermediate representation to work with, fundamentally, reasoning with the entirety of C syntax is still
required. As such, we normalize the code into a simpler form,
stable checkers. in order to simplify the analysis, through a series of transfor-
An approach that attempts to emulate Rust will fit into this mations applied to the code before the main analysis begins.
category, as it will perform static analysis over the source Matos [18] has concluded that several normalization steps
code, however it addresses these two problems. On one hand, can be safely performed without affecting the final perfor-
by having well-defined rules regarding the lifetimes of func- mance of the compiled code, removing the need to reverse
tion parameters and return values, the analysis can be exclu- any of the transformations. We aim to normalize the code
sively done at an intra-procedural level, making it scalable. into a form similar to that of Rust’s MIR, which includes the
On the other hand, a Rust-like approach is much more re- following transformations:
strictive regarding the kind of code it accepts, even code that
otherwise could be considered safe, which in practice means • Break up variable declarations, such that each state-
trading false positives with false negatives. ment only declares a single variable
• Remove variable shadowing (i.e. redeclaring a variable
with the same name in an inner scope)
3.4 Security Mitigation • Remove operations with side effects such as ++ and --,
Security mitigations are exclusively runtime mechanisms and replace them with their equivalent += 1 and -= 1
able to either detect whenever memory is corrupted, or pre- operations
vent attackers from manipulating systems after such cor- • Replace assignments such as += and -= by their explicit
ruptions. Some of these techniques are already prevalent in form, such as a = a + b
widely adopted compilers such as gcc or Clang. • Ensure each rvalue of every assignment is either:
On the one hand, mechanisms such as address-space lay- – A variable or immediate
out randomization, control-flow integrity, and data execution – At most one unary operator applied in succession to
prevention, focus on the detection and prevention of arbi- a variable or immediate
trary code execution and control-flow modification. On the – A single binary operator between two variables or
other hand, stack protection mechanisms, like stack canaries immediates
or shadow stacks, focus on the protection of data and return – A single inter-procedural call, whose arguments are
addresses on the stack. either variables or immediates
• Normalize x->y arrow operator into parenthesis and
field access operator (*x).y
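
As an illustration of the normalization steps listed in Section 4.1, the sketch below shows a small function before and after normalization. It is one possible reading of the rules, not output produced by the prototype: the `Item` type, the function names and the temporaries `t0` to `t4` are hypothetical, and field accesses are conservatively loaded into temporaries before being used as operands.

```c
#include <assert.h>

typedef struct {
    int value;
    int count;
} Item;

/* Before normalization: multi-variable declaration, ++, +=, the arrow
 * operator, and a nested expression on the right-hand side. */
static int bump_before(Item *it, int scale) {
    int a = 1, b = 2;
    it->count++;
    a += it->value * scale + b;
    return a;
}

/* After normalization: one declaration per statement, explicit
 * assignments, (*x).y instead of x->y, and at most one operator per
 * right-hand side. */
static int bump_after(Item *it, int scale) {
    int a = 1;
    int b = 2;
    int t0 = (*it).count;
    int t1 = t0 + 1;
    (*it).count = t1;
    int t2 = (*it).value;
    int t3 = t2 * scale;
    int t4 = t3 + b;
    a = a + t4;
    return a;
}
```

Both versions compute the same result and leave `count` in the same state; the normalized form simply exposes each read and write as its own statement, which is the shape the later analysis operates on.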
4 Foundations for a C Borrow Checker
We propose a system to replicate in C most of the safety 4.2 Memory Management
guarantees provided by Rust’s borrow checker. This can be One of the key aspects of a proper borrow checker is en-
achieved through a source-to-source approach, based on suring that all resources are appropriately freed. In Rust,
static analysis of annotated source code, and the generation the Drop trait automatically guarantees that any allocated
of a new version of the source code with the changes neces- resources are freed when a variable goes out of scope or is
sary to uphold the desired safety guarantees. Consequently, reassigned. We took inspiration from the drop elaboration
a method to replicate the ownership semantics of Rust is process of Rust, and applied a version of it to C, effectively


simulating the RAII (Resource Acquisition Is Initialization) The first major hurdle of attempting to transition such a
pattern, popular in languages such as C++. With this in mind, concept into C is the lack of generics. In order to circumvent
Rust’s drop systems are built on top of a set of assumptions this issue, we propose the creation of a wrapper struct
that we must also safeguard in C, if we are to implement a for each type we wish to box, and further utilize macros to
similar solution, some of which are, in no particular order: automatically generate the necessary wrapper code for each
type. In Rust, a Box always has the Drop trait associated with
1. Structs cannot be partially initialized it, so the language automatically guarantees the safe cleanup
2. Structs that implement Drop cannot have data moved of that allocated memory. For the C borrow checker, we will
out of their fields need to generate and introduce the proper calls to free.
3. References in fields must always point to valid memory Furthermore, Rust disallows any direct calls to the allo-
at the moment of the drop, except for fields marked cator outside of unsafe code, and instead encourages the
with [may_dangle] use of the provided smart pointers, such as Box, Rc and Arc,
4. Null references are not allowed or the Vec data structure. While we can provide generators
for specific instances of the smart pointers, Vec is very dif-
Any violations of the first two points can be detected by a
ferent, due to requiring its allocated memory to be resized
pass iterating over every assignment. However, as restricting
dynamically.
C to force the complete initialization of long structs would
be undesired, we can provide builder functions or define 4.4 Replicating Rust Syntax through Annotations
syntactic sugar for our borrow checker. Although neither
As C does not have concepts such as ownership, borrowing,
language has the concept of "objects", Rust has also opted
or lifetimes, we need to introduce new syntax in order to pro-
for a similar solution, for example, with the Default trait or
vide the information required for our analysis. Conversely,
through the ability to initialize uninitialized fields from an-
we do not wish to create a dialect of C, as it would go against
other struct. In spite of C not defining any syntax for such
our goal of retro-compatibility with existing compilers, and
operations, we can take advantage of the level of access that
of preserving idiomatic C code. As such, we make extensive
source-to-source provides to introduce #pragma directives
use of annotations through #pragma directives to provide
that will initialize any missing fields to the desired values.
additional information to the compiler, a common practice
Let us assume that a struct A contains a pointer to an-
in C compilers.
other struct B. The third assumption ensures that any at-
Firstly, to replicate the ownership and borrowing syntax of
tempt to free the resources allocated by B is always safe, as if
Rust, we leveraged the built-in type system and the restrict
the inner pointer was pointing into invalid memory, it could
qualifier. In short, it indicates that any data directly or in-
result in reading into an invalid address. Arguably, the most
directly accessible through a given pointer p can only be
difficult assumption to maintain for our C borrow checker
accessed through p, or a pointer derived from it. Paraphras-
would be the complete removal of null references, due to
ing from the C99 standard [11], let 𝑃 be a restrict-qualified
the predominant usage of NULL in C, especially throughout
pointer to a type 𝑇 , and 𝐵 denote the block it was declared
its standard library, and merits future consideration. Our
on. Particularly important is the based on definition, which
current plans to address this issue lie in TypeScript-like type
states that a pointer expression 𝐸 is based on 𝑃 if (at some
annotations outside of the C type system (e.g., Object |
sequence point in the execution of 𝐵 prior to the evaluation
undefined ), in order to distinguish between a maybe-null
of 𝐸) modifying 𝑃 to point to a copy of the array object into
"raw" pointer, and a safe memory reference that is ensured
which it formerly pointed would change the value of 𝐸.
by the borrow checker. This would also serve as a proxy to
Whilst this keyword may be ignored by some compiler
Rust’s boundary between "safe" and "unsafe" code.
implementations, it can be leveraged to generate more op-
timized assembly. In most C code, it is rare to find uses of
4.3 Heap Memory restrict, mainly due to how predominantly loose the use
Within safe Rust, most interactions with heap memory are of pointer aliasing is. In turn, this drastically limits its appli-
performed through abstractions like Box<> or other structs cability, as if pointer aliasing does occur, it constitutes UB.
that use boxes internally, which ensures that the allocated Even so, it fits perfectly within the scope of this work, as we
memory is always freed at the end of a code block. As a can ensure that, by definition, there can only be one mutable
simplification, we may consider a Box<> as a simple wrapper borrow in-use at any given point of a program’s execution.
of a pointer to a heap-allocated value. When creating the Interestingly, in the paper regarding the Stacked Borrows
box, that memory is allocated, and when it goes out of scope aliasing model, Jung [13] confirms that "the Rust compiler
or is reassigned, the implementation of Drop ensures that its (which uses LLVM as its backend) used to emit the LLVM
memory is returned to the allocator. Moreover, the Box is an equivalent of restrict as annotations for mutable references
incredibly useful and prevalent concept thanks to its generic in function argument position". This establishes a link be-
nature, which allows it to easily encapsulate any type. tween the aliasing rules of Rust and LLVM’s noalias, which
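
Section 4.3 proposes one wrapper struct per boxed type, generated through macros, with the borrow checker inserting the calls to free. A minimal sketch of what such a generator might look like is given below. The macro `DEFINE_BOX` and the generated names `Box_int`, `Box_int_new` and `Box_int_drop` are hypothetical, the type argument must be a single identifier (a typedef name for structs), and aborting on allocation failure is a simplification.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical generator: declares Box_T, Box_T_new and Box_T_drop. */
#define DEFINE_BOX(T)                                   \
    typedef struct { T *ptr; } Box_##T;                 \
    static Box_##T Box_##T##_new(T value) {             \
        Box_##T b = { malloc(sizeof(T)) };              \
        if (b.ptr == NULL) abort();                     \
        *b.ptr = value;                                 \
        return b;                                       \
    }                                                   \
    static void Box_##T##_drop(Box_##T *b) {            \
        free(b->ptr);                                   \
        b->ptr = NULL;                                  \
    }

DEFINE_BOX(int)

/* The two drop calls stand in for the calls that the borrow checker would
 * generate when x and y go out of scope, in reverse declaration order. */
static int boxed_sum(int a, int b) {
    Box_int x = Box_int_new(a);
    Box_int y = Box_int_new(b);
    int sum = *x.ptr + *y.ptr;
    Box_int_drop(&y);
    Box_int_drop(&x);
    return sum;
}
```

In this sketch the pointer is cleared after the drop purely as a defensive measure; under the proposed rules, a dropped or moved-from box would no longer be accessible in the first place.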


in turn, according to LLVM's documentation on parameter attributes, is intentionally based on C99's
restrict qualifier for function parameters, with some differences regarding the effects of returning
a restrict-qualified pointer.
Given the link between the restrict qualifier and mutable borrows, it is only logical for us to
reflect this in our syntax, and represent every mutable borrow as a * restrict-qualified pointer.
Or at least, so we would wish, as the dangers of not respecting the restrict qualifier would lead
straight to UB. The C99 standard [11] was unclear in its specificity of which behaviours were
standardized, blurring the lines of what constitutes UB, and it would appear that this has not
changed ever since it was first introduced.

Listing 1. Simple function demonstrating the syntax for explicit lifetime annotations

    #pragma lifetime x %a
    #pragma lifetime y %b
    #pragma return_lifetime %a
    const int *bar(const int *x,
                   const int *y) {
        return x;
    }

inform the borrow checker about the behaviour of a struct.
Once again, Jung [13] also arrived at the conclusion that The traits in question are the Drop trait, for safe memory management,
the semantics of restrict are unclear, in particular outside and the Copy trait, for flexibility and performance.
of function parameters. Historically, this matter has been Lastly, annotations are also required for named lifetimes
error-prone, and led to reported bugs in LLVM [8, 27], even in cases that closely followed the formal definition. Although these bugs have since been patched as of LLVM 12, they had repercussions on the Rust ecosystem, which had to temporarily halt the use of the LLVM noalias semantics, as the bugs led to incorrect results, which were even reproducible in C code¹. Ultimately, it is nearly impossible to predict the behaviour of restrict for use in general pointers. To make the situation even more dire, there is a clear lack of tools to check for violations of the restrict properties.

¹ https://github.com/rust-lang/rust/issues/54878

To summarise, the utilization of the restrict qualifier would be ideal, as it could allow compilers to theoretically optimize programs accepted by our borrow checker, and it would also follow the guidelines of Rust, by requiring explicit declarations of variable mutability. However, due to the lack of a clear definition of its semantics, as well as the lack of tools to check for possible violations, it should be handled with extreme caution. Fortunately, as our solution is implemented at a source-to-source level, we can freely manipulate the pointer qualifiers to our advantage. As such, we suggest three avenues to explore in the future, with different degrees of conservativeness, with the definition of mutable borrows through * restrict pointers in common:
• Keep every * restrict pointer as-is, and hope that the compiler can highly optimize the code, but at the significant risk of generating UB
• Keep * restrict pointers as-is for function arguments, and remove the qualifier for all other cases, which should still lead to some optimizations, but drastically reduce the chance of UB
• Remove every * restrict qualifier, removing a useful vector for program optimization, constituting the least performant, but safest, option

In regards to the annotation of struct traits, whilst our aim is not to replicate the entirety of the trait system, at the very minimum, a representation is required for the traits that

throughout the code, mainly required by function parameters, and return values with reference types or struct declarations. An example of these annotations is presented in Listing 1. A subset of named parameter lifetimes can be automatically inferred by the process of lifetime elision [34]. In short, elision sets a different lifetime for each unspecified parameter and return value lifetime. Afterwards, in the case of only one lifetime existing in both the parameters and return value, they are merged into the same lifetime. If any lifetime in the return value remains unmatched with a lifetime from the parameters, the function definition is declared illegal, as it constitutes either a dangling pointer or a lifetime whose origin is ambiguous.

4.5 Borrow Checker Algorithm

To replicate the borrow checker itself, we largely follow the Non-lexical lifetimes algorithm as described by Matsakis [21], with few to no adaptations. As such, we directly translated the concept of lifetimes (or region variables), the rules for generating the various constraints, the inference algorithm, the in-scope loans, and the final error detection phase. The largest divergence from Rust lies in the handling of some edge cases, such as two-phase borrows [20], due to Rust’s desugaring on nested functions or "methods".

5 Implementation

We developed a small proof-of-concept prototype of a C borrow checker to demonstrate the feasibility of a subset of the techniques proposed, using the Clava [3] source-to-source compiler to perform its analysis and transformations. Clava provides a custom intermediate representation (IR) heavily inspired by Clang’s [16] AST. More specifically, it uses Clang to parse the source code, and builds its custom IR from Clang’s AST, inheriting a similar structure to Clang.

5.1 System Design

At a high level, it is divided into two steps: first normalization, and then analysis. The normalization step is responsible

Foundations for a Rust-Like Borrow Checker for C LCTES ’24, June 24, 2024, Copenhagen, Denmark

for transforming the input program into a form that is easier to analyse, and is applied globally. The analysis step is responsible for performing the actual analysis required to perform the borrow checker algorithm, but is applied at an intraprocedural level.

The C AST differs significantly from that of Rust, and we had no intention of mimicking the internal structures of the rustc compiler, as we had no guarantees they would adequately translate to C. In spite of that, various components naturally wound up resembling some of the classes and structures used by rustc. The best example of this lies in the representation of paths, as well as the connection between a path, as part of a value expression, and the corresponding type it would evaluate to.

Internally, we utilized compilation passes to perform the various analyses required for the borrow checker, allowing for an easy integration with other transformations already provided by Clava. This also aided its development, as it allows for the analysis to be performed in a step-by-step fashion, and for the results of each step to be easily inspected.

5.2 Normalization Stage

Of the various normalization options previously discussed, we have focused on the most crucial ones in order to achieve an MIR-like representation of the input program. In other words, the focus was on eliminating all nested expressions. Following this, we decompose every line of code into an LVALUE "=" RVALUE expression, not dissimilar to Rust. This allows further analysis to not have to reason about the order of operations, and instead focus on the individual operations themselves. The normalization also normalizes certain constructs, such as the ternary operator, which is transformed into an if-else statement.

Most of the statement decomposition utilized the work of Matos [18], included in the Clava source-to-source compiler [3]. These transformations introduce multiple temporary variables, which are then used to represent the intermediate values of the expression. Every temporary variable has a unique identifier, in the format of TMP_n, where n uniquely identifies the variable. This is important, as it ensures that some later value expressions will only be composed of a single memory location.

The removal of shadowing (re-declaration of a variable with the same name, in an inner scope) was also extremely helpful, since it allowed later analyses to assume that every variable is unique, and thus could be used as an identifier to easily match an expression to the type produced by it.

5.3 Analysis Stage

The analysis stage represents the bulk of the prototype, and is performed as a series of intraprocedural analyses, by iterating through every function definition, and applying a full sequence of analyses and transformations to the respective body. For each, it performs a sequence of seven steps, as presented in Figure 1. The order in which the functions are analysed is not relevant, and, in theory, the analyses could even be performed in parallel.

[Figure 1 diagram, "Analysis": #1 Construct CFG, #2 Liveness Analysis, #3 Annotate Graph, #4 Constraint Generation, #5 Inference Algorithm, #6 Error Detection, #7 Error Reporting]

Figure 1. Diagram describing the seven steps of the borrow checker’s analysis

In regards to type checking, we mostly rely on Clang to perform it for us, as it is internally used by Clava to generate its own internal intermediate representation. If it used gcc instead, for example, the definition of some extra flags would be required, such as -Wdiscard-qualifiers, in order to prevent the creation of a mutable pointer from an object marked as const.

To elaborate on each analysis step, we started by constructing a Control Flow Graph (CFG), built internally over a Cytoscape.js [7] graph, a graph theory and networking library, which comes prepackaged with various graph theory algorithms and allows for the simple manipulation of nodes, especially with the usage of the scratch pad. The scratch pad is a key-value store associated with each node, and can be used to store arbitrary information. It was extensively used, as it allows relevant information to be stored directly in the nodes themselves and consumed in later steps, without the need for recalculations. Moreover, it also allows for multiple objects to only be instantiated once, and then reused throughout the analysis, as is the case for region variables and loans. In turn, this allows for the direct comparison of objects, instead of having to compare their attributes, which allows for the usage of JavaScript’s built-in Set data structure.

For the second step, a liveness analysis is performed, configured to separate each code block into individual lines of code, resulting in a single unique string to identify each operation in the decomposed function body. For the correctness of NLL, however, drop statements must be disregarded during liveness. Fortunately, by temporarily representing drops as a #pragma directive, they are safely ignored by the liveness algorithm.

Thirdly, we annotate each node in the CFG with the data required by later analyses, such as marking "MIR actions"

LCTES ’24, June 24, 2024, Copenhagen, Denmark Tiago Silva, João Bispo, and Tiago Carvalho

(i.e. an assignment or function call) for error detection, identifying borrow expressions, or adding typing annotations. In order to avoid unnecessarily iterating over the function body, every annotation is created in a single pass, and then cached into the node itself. This step tackles some of the larger road blocks of implementing a C borrow checker, due to the language lacking much of the required syntax.

To surpass those challenges, we first simulated a pseudo-type system, built on top of the Clang AST’s types, with the addition of support for lifetimes in reference types and structs. Moreover, a second hierarchy was created to represent every possible lvalue path, that is, every possible expression that represents a valid memory address. This allows for both hierarchies to be interconnected, as when an lvalue path is evaluated, it produces a specific type, possibly with lifetimes attached. These paths are then further used to identify loans, as well as to categorize each memory access according to a matrix of read or write, and deep or shallow.

The fourth step is responsible for generating the constraints required for the NLL algorithm, as described by Matsakis [21]. Liveness constraints are directly applied to the initial set of each RegionVar, and are never instantiated. The remaining outlives constraints are stored in a list, to be utilized later. Of these, subtyping constraints require reasoning adapted to each variable type’s variance [28] [29].

The fifth step propagates these constraints across the region variables, or lifetimes, which constitutes a fixed-point iteration problem. In short, each constraint is iterated over and applied until no further changes occur. Each RegionVar is composed of its unique id, and the set of points that compose its region, expressed as strings to account for the end regions of universal lifetimes (such as named parameters or static). Each constraint defines an outlives relation between two RegionVars and the point P in which it applies. For example, to propagate a ('a : 'b) @ P constraint is to ensure that every point (or statement) reachable from point P that is contained in the set of region variable 'b must be included in the set of region variable 'a. This was implemented as a DFS visitor that added the id of any point visited into the region set of 'a.

The sixth step is responsible for the detection of errors, based on the information computed so far. The error detection algorithm requires another dataflow analysis to calculate the set of loans active in each statement, that is, the in-scope loans. To implement it, we have utilized a variation on the iterative worklist algorithm described by Muchnick [24]. In this algorithm, we consume nodes from a queue without duplicate entries (enforced through an auxiliary Set), initialized to every node in the CFG. While the queue is not empty, a node is removed from the queue, and its in-scope loans are calculated based on the out-scope loans of its predecessors. If the in-scope loans of the node change, its successors are added to the queue, and the process is repeated until the queue is empty. This algorithm minimizes the number of nodes visited, as it visits exclusively nodes whose in-scope loans may have changed. Finally, having calculated the in-scope loans, a final iteration through the CFG is performed, and any in-scope loan that violates an annotated Access is detected, according to the rules delineated in the NLL RFC [21]. If such a violation is found, an error is reported, and the analysis is terminated.

The final step reports any errors caught in the previous step in a clear human-readable format. We have opted to report the errors similarly to rustc, by printing the error message, accompanied by the relevant statements, with the error highlighted using the three-point "narrative", similar to the three-act trope in traditional story-telling:
• The point in which the loan is created, B
• The point which might have invalidated the borrow, A
• The point in which its reference was later used, U

In order to provide this "narrative", the three points must ensure that U is reachable from A, and A is reachable from B. In turn, errors can be described as a series of "acts", where first we create the borrow in B, then present the point of error A, and finally, the next use of the reference U. This was considered a significant improvement over previous iterations by the Rust development team [21], which previously only reported the points of error and creation, but not the next use of the reference. This is especially useful for use-after-free errors, by allowing the programmer to know precisely where the reference was freed, and later used.

Thanks to the structure of our annotations in the CFG, we are able to easily identify which memory access caused the error, as well as the point in which the loan was created. Finding the point in which the borrow is next used, however, is more complicated. Instead of reporting the first usage of the reference after the access, we opted to report the last usage of the borrow. To find it, the built-in methods from Cytoscape.js [7] were used to perform a simple DFS to find the last point of the CFG, reachable from the statement of the incorrect access, that is still included in the borrow’s RegionVar. In the case of the "last" usage being inside a conditional branch, while the other branch still leads to more usages, it is still a perfectly acceptable point to report, as in that case, both paths would constitute a violation of the borrow checker.

5.4 Results

From a performance standpoint, apart from the first global normalization pass, every function is processed exactly once. We opted to not use a queue to exclusively visit used functions, as it would force the existence of an entry point to the program. Conversely, processing every definition exactly once allows for the analysis of code without a main (e.g., libraries), as well as allowing the possibility of analysing an isolated selection of functions, helpful for an incremental conversion of existing code. With regards to memory, as


there is no need to preserve the CFG and its annotations between the analysis of different functions, they are deleted between each analysed function, reducing the overall memory required to run the borrow checker.

From the standpoint of system design, we have opted to create a custom simplified intermediate representation, modelled after Rust’s MIR, and to perform the analysis over it. This allowed us to leverage existing work on the Clava compiler, as well as to hide the complexity of the Join Points (which are strongly tied to Clang’s AST) behind a layer of abstraction. This was expressed as a set of annotations built on top of the CFG built for each function body.

We have tested our prototype against programs inspired by the NLL RFC [21], alongside other small synthetic programs to isolate some edge cases. Because the goal of the borrow checker is to accurately detect and report violations of the ownership and borrowing rules, we have also focused on creating variations of those examples, in which errors were purposefully introduced. One such variation is shown in Listing 2, in which two final assignments to the variable foo were added. The first one, inside the if block, is legal, as it occurs after the value borrowed in p is no longer in use. The second, however, is illegal, as it is performed while foo is still borrowed inside p.

Listing 2. Variation upon one of the main examples presented in the NLL RFC [21]

    void use(const int *a) {
    }

    int main() {
        int foo = 1;
        int bar = 2;
        const int *p;

        p = &foo;
        if (2 > 1) {
            use(p);
            foo = 4;
            // Other processing
            p = &bar;
            // More processing
        }
        foo = 8;
        use(p);
        return 0;
    }

The validation of the results produced by each test program was divided into three parts. First, a graphical representation of the CFG with the cached annotations data directly present was generated, and then compared against the expected values. This allowed for a simple confirmation of both the normalization stage and the first half of the borrow checker algorithm. Afterwards, we compared the set of constraints generated from the code, such as in Table 1. After confirming the correctness at this stage, we then manually calculated the expected values for each region variable, or in other words, lifetime, and compared them to the program output, such as in Table 2.

Table 1. Set of constraints generated from Listing 2

    Region   Points
    '1       {id_11, id_15, id_17, id_18, id_6, id_7, id_8, id_9}
    '2       {}
    '3       {}

    Outlives Constraints
    ('2 : '1) @ id_6
    ('3 : '1) @ id_15

Table 2. Final lifetimes computed from the constraint set in Table 1

    Region   Points
    '1       {id_11, id_15, id_17, id_18, id_6, id_7, id_8, id_9}
    '2       {id_11, id_17, id_18, id_6, id_7, id_8, id_9}
    '3       {id_15, id_17, id_18}

Finally, given the importance of producing good error reports in the case of failure, we created an equivalent program in Rust, and compared the errors produced by our prototype against those reported by the rustc compiler. Our errors follow the same three-point "narrative" structure utilized by Rust. We can verify that it correctly identified the assignment to foo in id_18 as the violation of its borrow in p. It also correctly identified the last usage of the borrow in id_15 as the last point in which the borrow is still in use.

Unfortunately, due to the normalization and various transformations applied, the lines indicated in the error do not accurately correspond to those of the original source code file. To accurately report the original location, altering Clava would be required. The incorrect line numbers are, however, a minor issue, as the error is still correctly reported. Furthermore, the user has access to the normalized code, in which the line numbers are correct, facilitating the process of tracing an error back to the original source code.
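The constraint propagation of the fifth analysis step can be replayed by hand on the data of Table 1. The following C sketch is not the prototype's code (which is implemented as Clava scripts); it only illustrates the technique, encoding each region as a bitmask over point ids and applying every outlives constraint with a depth-first search until no region grows. The CFG edges, the intermediate point id_13, and all type and function names (PointSet, Cfg, Outlives, listing2_regions) are assumptions introduced for illustration: the edges were chosen solely to be consistent with Tables 1 and 2, since the CFG of Listing 2 is not listed in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_POINTS 32           /* point ids 0..31 fit in one 32-bit mask */

typedef uint32_t PointSet;      /* bit i set <=> point id_i belongs to the set */

typedef struct {
    PointSet succ[MAX_POINTS];  /* succ[i]: successors of point id_i */
} Cfg;

typedef struct {
    int sup;    /* index of region 'a, which must outlive ... */
    int sub;    /* ... region 'b */
    int point;  /* point P at which ('a : 'b) applies */
} Outlives;

static PointSet bit(int p) { return (PointSet)1u << p; }

static void add_edge(Cfg *cfg, int from, int to) { cfg->succ[from] |= bit(to); }

/* Depth-first search from `start` that never leaves the set `within`. */
static PointSet reachable_within(const Cfg *cfg, int start, PointSet within) {
    assert(start >= 0 && start < MAX_POINTS);
    PointSet seen = 0;
    int stack[MAX_POINTS];      /* every point is pushed at most once */
    int top = 0;
    if (within & bit(start)) {
        seen |= bit(start);
        stack[top++] = start;
    }
    while (top > 0) {
        int p = stack[--top];
        PointSet next = cfg->succ[p] & within & ~seen;
        for (int s = 0; s < MAX_POINTS; s++) {
            if (next & bit(s)) {
                seen |= bit(s);
                stack[top++] = s;
            }
        }
    }
    return seen;
}

/* Fixed-point iteration: re-apply every constraint until no region grows. */
static void solve(const Cfg *cfg, PointSet regions[], const Outlives *cs, int n) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 0; i < n; i++) {
            PointSet grown = regions[cs[i].sup]
                           | reachable_within(cfg, cs[i].point, regions[cs[i].sub]);
            if (grown != regions[cs[i].sup]) {
                regions[cs[i].sup] = grown;
                changed = true;
            }
        }
    }
}

/* Solves the constraints of Table 1; regions[1..3] hold '1, '2 and '3. */
void listing2_regions(PointSet regions[4]) {
    Cfg cfg = {{0}};
    /* ASSUMED edges (not given in the paper); id_13 is a hypothetical point
       outside '1 that separates id_9 from id_15. */
    add_edge(&cfg, 6, 7);   add_edge(&cfg, 7, 8);   add_edge(&cfg, 7, 11);
    add_edge(&cfg, 8, 9);   add_edge(&cfg, 9, 13);  add_edge(&cfg, 13, 15);
    add_edge(&cfg, 15, 17); add_edge(&cfg, 11, 17); add_edge(&cfg, 17, 18);

    /* Initial sets from Table 1: only '1 is non-empty. */
    regions[0] = 0;
    regions[1] = bit(6) | bit(7) | bit(8) | bit(9)
               | bit(11) | bit(15) | bit(17) | bit(18);
    regions[2] = 0;
    regions[3] = 0;

    /* ('2 : '1) @ id_6 and ('3 : '1) @ id_15 */
    const Outlives cs[] = { {2, 1, 6}, {3, 1, 15} };
    solve(&cfg, regions, cs, 2);
}
```

Under these assumed edges, the fixed point leaves '1 unchanged and yields '2 = {id_6, id_7, id_8, id_9, id_11, id_17, id_18} and '3 = {id_15, id_17, id_18}, which matches Table 2. The search from id_6 stops at the hypothetical id_13 because that point is not part of '1, which is why id_15 never enters '2.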


Figure 2. Error produced by the rustc compiler for a program equivalent to Listing 2, via https://play.rust-lang.org

Figure 3. Error produced by the prototype for Listing 2

6 Conclusions

Every year over the last decade, CVE vulnerabilities associated with memory safety have been a significant portion of newly discovered problems with C and C++ [32], as developers inadvertently insert memory corruption bugs. This is a clear indication that the current approaches to memory safety are not sufficient, and that new approaches are required. Simultaneously, we have seen the rise of languages promoting memory safety without compromising their runtime performance as their flagship, namely Rust.

In this work, we have identified another avenue to improve the memory safety of C: the application of the ownership and borrowing model that forms the core of Rust. To that effect, we identified the key algorithms behind the borrow checker, and proposed methods to retrofit them into C through source-to-source compilation. This resulted in a system capable of enforcing comparable ownership and aliasing rules in C source code, whilst ensuring no compiler lock-in and simplifying any verification of a program’s correctness. The key feature of this system is automatic memory management, especially for struct types, achieved through the use of the RAII paradigm, the application of Drop Elaboration to ensure every resource is freed once-and-only-once, as well as the checks of a custom borrow checker that ensures no concurrent mutable memory operations occur, or that a dangling pointer is never dereferenced.

We have implemented a prototype for a subset of the analysis and transformations required for this system, which serves as a proof of concept of the core ideas presented in this work. Its main goal is to replicate the Non-Lexical Lifetimes-based borrow checker, an intraprocedural analysis that enforces the ownership and aliasing rules, with an emphasis on the calculation of lifetimes (or region variables). The prototype is implemented as a set of scripts for the Clava source-to-source compiler, meeting the requirements of a compiler-agnostic solution, while also being easily integrated into existing build environments, such as cmake.

We have evaluated the prototype by writing equivalent programs in C and Rust, and comparing its error reporting to that of the latest version of the Rust compiler. The results showed that it is able to accurately detect the same intended class of errors, and that the error messages produced were also similar in nature, albeit less refined, due to some information loss in parts of the source-to-source compiler’s normalizations. This is a clear indication that the core ideas presented in this work are sound, and that, with further investigation, they could be developed into a more viable and feature-complete tool for improving the safety of C code.

6.1 Future Work

We believe we have laid the groundwork for a more complete and thorough C borrow checker. However, in order to design a comprehensive borrow checker ready to be integrated into the compilation pipeline of a true project, various issues need to be tackled.

Firstly, this proposal still lacks a comprehensive solution for the handling of arrays, with a focus on indexing and slicing operations, as well as their initialization and respective calls to the allocator. Alongside this support, we also wish to study the addition of runtime checks that also ensure spatial safety, such as bounds checking.

The second largest challenge lies in the integration of calls to the standard library, as well as other third-party C libraries. We believe supporting the standard library could be realistically achieved by establishing a "contract" similar to the boundary between safe and unsafe Rust, as well as through wrappers suitable to analyses pertaining to ownership. Support for third-party C libraries, especially if distributed as pre-compiled binaries, could prove to be particularly problematic, since our proposal relies exclusively on an annotated version of a project’s source code. Ultimately, it would allow for the expansion of our test suite to include real-world applications and well-regarded benchmarks.

Further testing of various compilers is also required in order to evaluate the consequences (if any) of using the restrict qualifier to identify mutable borrows, mainly due to the relatively unclear nature of the C99 standard on this topic. Furthermore, compilers are free to ignore this qualifier when optimizing the generated binaries.

Finally, we also expect some improvements towards more accurate tracking of the location of an error on the original source, as well as the extension of the prototype to gradually cover more complex cases, mainly revolving around drop elaboration, structs, and incomplete types.
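To make the use of the restrict qualifier as a marker for mutable borrows more concrete, the following is a minimal C sketch of the idea. It is not taken from the prototype: the function names scale_into and scale_demo are hypothetical, and the sketch only assumes the convention mentioned in the text, where a restrict-qualified pointer stands for a mutable borrow and a pointer to const stands for a shared one.

```c
#include <assert.h>
#include <stddef.h>

/* `dst` plays the role of a mutable borrow: `restrict` promises that, during
   the call, the elements written through `dst` are not accessed through any
   other pointer. `src` plays the role of a shared, read-only borrow. */
void scale_into(int *restrict dst, const int *src, size_t n, int factor) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i] * factor;
    }
}

/* Hypothetical caller that respects the borrow rules: two distinct arrays. */
int scale_demo(void) {
    int input[3] = {1, 2, 3};
    int output[3] = {0, 0, 0};
    scale_into(output, input, 3, 2);
    /* scale_into(input, input, 3, 2) would write through a restrict pointer
       to elements that are also read through `src`: undefined behaviour in C,
       and the kind of call a borrow checker is meant to reject. */
    return output[0] + output[1] + output[2];
}
```

Whether a given compiler actually exploits the qualifier for such a function is exactly the open question raised above; the sketch only shows where the aliasing promise is stated and how a conflicting call would break it.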


References

[1] Periklis Akritidis, Cristian Cadar, Costin Raiciu, Manuel Costa, and Miguel Castro. 2008. Preventing Memory Error Exploits with WIT. In 2008 IEEE Symposium on Security and Privacy (Sp 2008). IEEE, Oakland, CA, USA, 263–277. https://doi.org/10.1109/SP.2008.30
[2] Al Bessey, Ken Block, Ben Chelf, Andy Chou, Bryan Fulton, Seth Hallem, Charles Henri-Gros, Asya Kamsky, Scott McPeak, and Dawson Engler. 2010. A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World. Commun. ACM 53, 2 (Feb. 2010), 66–75. https://doi.org/10.1145/1646353.1646374
[3] João Bispo and João M. P. Cardoso. 2020. Clava: C/C++ Source-to-Source Compilation Using LARA. SoftwareX 12 (2020), 100565. https://doi.org/10.1016/j.softx.2020.100565
[4] Robert DeLine. 2001. Enforcing High-Level Protocols in Low-Level Software. PLDI ’01: Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation (May 2001), 59–69.
[5] Joe Devietti, Colin Blundell, Milo M. K. Martin, and Steve Zdancewic. 2008. Hardbound: architectural support for spatial safety of the C programming language. In Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems (Seattle, WA, USA) (ASPLOS XIII). Association for Computing Machinery, New York, NY, USA, 103–114. https://doi.org/10.1145/1346281.1346295
[6] Archibald Samuel Elliott, Andrew Ruef, Michael Hicks, and David Tarditi. 2018. Checked C: Making C Safe by Extension. In 2018 IEEE Cybersecurity Development (SecDev). IEEE, Cambridge, MA, 53–60. https://doi.org/10.1109/SecDev.2018.00015
[7] Max Franz, Christian T. Lopes, Gerardo Huck, Yue Dong, Onur Sumer, and Gary D. Bader. 2015. Cytoscape.Js: A Graph Theory Library for Visualisation and Analysis. Bioinformatics (Oxford, England) 32, 2 (Sept. 2015), 309–311. https://doi.org/10.1093/bioinformatics/btv557 arXiv:https://academic.oup.com/bioinformatics/article-pdf/32/2/309/49016536/bioinformatics_32_2_309.pdf
[8] Dan Gohman. 2015. Incorrect Liveness in DeadStoreElimination. Technical Report 25422. LLVM bugs. https://bugs.llvm.org/show_bug.cgi?id=25422
[9] Dan Grossman, Greg Morrisett, Trevor Jim, Michael Hicks, Yanling Wang, and James Cheney. 2002. Region-Based Memory Management in Cyclone. In Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation (PLDI ’02). Association for Computing Machinery, New York, NY, USA, 282–293. https://doi.org/10.1145/512529.512563
[10] Reed Hastings and Bob Joyce. 1992. Purify: Fast Detection of Memory Leaks and Access Errors. Proceedings of the Winter 1992 USENIX Conference (1992), 125–138.
[11] ISO. 1999. ISO/IEC 9899:1999 - Programming Languages - C.
[12] Ralf Jung. 2023. From Stacks to Trees: A New Aliasing Model for Rust.
[13] Ralf Jung, Hoang-Hai Dang, Jeehoon Kang, and Derek Dreyer. 2019. Stacked Borrows: An Aliasing Model for Rust. Proc. ACM Program. Lang. 4, POPL, Article 41 (Dec. 2019). https://doi.org/10.1145/3371109
[14] Steve Klabnik and Carol Nichols. 2018. The Rust Programming Language. No Starch Press.
[15] Felix Klock II. 2014. RFC 0320: Nonzeroing-Dynamic-Drop. https://github.com/rust-lang/rfcs/blob/abc967a2c5ddd0af2d3506897be7ecfbc0e78e97/text/0320-nonzeroing-dynamic-drop.md
[16] Chris Lattner. 2011. LLVM and Clang: Advancing Compiler Technology.
[17] Michael Ling, Yijun Yu, Haitao Wu, Yuan Wang, James R. Cordy, and Ahmed E. Hassan. 2022. In Rust We Trust – A Transpiler from Unsafe C to Safer Rust. In 2022 IEEE/ACM 44th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion). IEEE, Pittsburgh, PA, USA, 354–355. https://doi.org/10.1109/ICSE-Companion55297.2022.9793767
[18] João Matos. 2022. Automatic C/C++ Source-Code Analysis and Normalization. Ph. D. Dissertation. Universidade do Porto.
[19] Niko Matsakis. 2015. RFC 1211: Mir. https://github.com/rust-lang/rfcs/blob/debadbae2c7fc6cf2d94aef61c08f60b2e6ed297/text/1211-mir.md
[20] Niko Matsakis. 2017. RFC 2025: Nested-Method-Calls. https://github.com/rust-lang/rfcs/blob/188cc17ad38b201867955fb4a51c306c0704b6cf/text/2025-nested-method-calls.md
[21] Niko Matsakis. 2017. RFC 2094: Nll. https://github.com/rust-lang/rfcs/blob/abc967a2c5ddd0af2d3506897be7ecfbc0e78e97/text/2094-nll.md
[22] Niko Matsakis. 2019. Polonius and Region Errors. https://smallcultfollowing.com/babysteps/blog/2019/01/17/polonius-and-region-errors/. (accessed 2023-09-20).
[23] Niko Matsakis. 2022. Non-Lexical Lifetimes (NLL) Fully Stable. https://blog.rust-lang.org/2022/08/05/nll-by-default.html. (accessed 2023-08-11).
[24] Steven S. Muchnick. 1997. Advanced Compiler Design and Implementation. Morgan Kaufmann.
[25] Santosh Nagarakatte, Jianzhou Zhao, Milo M K Martin, and Steve Zdancewic. 2009. SoftBound: Highly Compatible and Complete Spatial Memory Safety for c. Technical Report MS-CIS-09-01. University of Pennsylvania Department of Computer and Information Science Technical.
[26] George C Necula, Scott McPeak, and Westley Weimer. 2002. CCured Type-Safe Retrofitting of Legacy Code. POPL ’02: Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on Principles of programming languages (Jan. 2002), 128–139.
[27] Nikita Popov. 2018. Loop Unrolling Incorrectly Duplicates Noalias Metadata. RFC 9405. https://bugs.llvm.org/show_bug.cgi?id=39282
[28] Rust Community. 2014. The Rust Language Reference. https://github.com/rust-lang/reference/tree/effbdc1b059fde09027925e1bea90bb1860d5f27. (accessed 2023-09-05).
[29] Rust Community. 2015. The Rustonomicon. https://github.com/rust-lang/nomicon/tree/302b995bcb24b70fd883980fd174738c3a10b705. (accessed 2023-08-03).
[30] Rust Community. 2018. Polonius Book. https://github.com/rust-lang/polonius/tree/0a754a9e1916c0e7d9ba23668ea33249c7a7b59e. (accessed 2023-09-13).
[31] L. Szekeres, M. Payer, Tao Wei, and Dawn Song. 2013. SoK: Eternal War in Memory. In 2013 IEEE Symposium on Security and Privacy. IEEE, Berkeley, CA, 48–62. https://doi.org/10.1109/SP.2013.13
[32] Gavin Thomas. 2019. A Proactive Approach to More Secure Code.
[33] Jim Trevor, Greg Morrisett, James Cheney, Dan Grossman, Michael Hicks, and Yanling Wang. 2002. Cyclone: A Safe Dialect of C. In 2002 USENIX Annual Technical Conference (USENIX ATC 02). USENIX Association.
[34] Aaron Turon. 2014. RFC 0141: Lifetime-Elision. Technical Report 0738. Rust Foundation.

Received 2024-02-29; accepted 2024-04-01
