Native Implementation of Mutable Value Semantics

Google, United States

ABSTRACT

Unrestricted mutation of shared state is a source of many well-known problems. The predominant safe solutions are pure functional programming, which bans mutation outright, and flow-sensitive type systems, which depend on sophisticated typing rules. Mutable value semantics is a third approach that bans sharing instead of mutation, thereby supporting part-wise in-place mutation and local reasoning, while maintaining a simple type system. In the purest form of mutable value semantics, references are second-class: they are only created implicitly, at function boundaries, and cannot be stored in variables or object fields. Hence, variables can never share mutable state.

Because references are often regarded as an indispensable tool to write efficient programs, it is legitimate to wonder whether such a discipline can compete with other approaches. As a basis for answering that question, we demonstrate how a language featuring mutable value semantics can be compiled to efficient native code. This approach relies on stack allocation for static garbage collection and leverages runtime knowledge to sidestep unnecessary copies.

CCS CONCEPTS

• Software and its engineering → Source code generation; Runtime environments; Language features.

KEYWORDS

mutable value semantics, local reasoning, native compilation

ACM Reference Format:
Dimitri Racordon, Denys Shabalin, Daniel Zheng, Dave Abrahams, and Brennan Saeta. 2021. Native Implementation of Mutable Value Semantics. In ICOOOLPS ’21, July 13, 2021, Online. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
ICOOOLPS ’21, July 13, 2021, Online
© 2021 Association for Computing Machinery.
ACM ISBN 978-x-xxxx-xxxx-x/YY/MM. . . $15.00

1 INTRODUCTION

Software development continuously grows in complexity, as applications get larger and hardware more sophisticated. One well-established principle to tackle this challenge is local reasoning, described by O’Hearn et al. [4] as follows:

    “To understand how a program works, it should be possible for reasoning and specification to be confined to the cells that the program actually accesses. The value of any other cell will automatically remain unchanged.”

There are two common ways to uphold local reasoning. One takes inspiration from pure functional languages and immutability. Unfortunately, this paradigm may fail to capture the programmer’s mental model, or prove ill-suited to express and optimize some algorithms [5], due to the inability to express in-place mutation.

Another approach aims to tame aliasing. Newer programming languages have successfully blended ideas from ownership types [1], type capabilities [2], and region-based memory management [9] into flow-sensitive type systems, offering greater expressiveness and giving more freedom to write efficient implementations. Unfortunately, these approaches have complexity costs that significantly raise the entry barrier for inexperienced developers [10].

Mutable value semantics (MVS) offers a tradeoff that does not add the complexity inherent to flow-sensitive type systems, yet preserves the ability to express in-place, part-wise mutation. It does so by treating references as a “second-class” concept. References are only created at function boundaries by the language implementation, and only if the compiler can prove their uniqueness. Further, they can neither be assigned to a variable nor stored in object fields. Hence, all values form disjoint topological trees, whose roots are assigned to the program’s variables.

The reader may understandably worry about expressiveness and efficiency, as references are often held as indispensable for both aspects. We note that a large body of software projects, such as the Boost Graph Library [7], already addresses expressiveness concerns empirically, leveraging MVS to elucidate recurring questions surrounding equality, copies, and mutability, and to develop generic data structures and algorithms [8].

This paper focuses on the question of efficiency. We discuss an approach for compiling languages featuring MVS to native code, relying on stack allocation for static garbage collection and using runtime information to elide unnecessary copies. We present it in the context of a toy programming language, called MVSL, inspired
by Swift, for which we have written a compiler. Our implementation is available as an open-source project hosted on GitHub: https://github.com/kyouko-taiga/mvs-calculus.

2 A QUICK TOUR OF MVSL

MVSL is a statically typed, expression-oriented language, designed to illustrate the core principles of MVS. In MVSL, a program is a sequence of structure declarations, followed by a single expression denoting the entry point (akin to the main function in a C program).

A variable is declared with the keyword var followed by a name, a type annotation, an initial value, and the expression in which it is bound. A constant is declared similarly, with the keyword let.

    var foo: Int = 4 in
    let bar: Int = foo in bar

There are three built-in data types in MVSL: Int for signed integer values, Float for floating-point values, and a generic type [T] for arrays of elements of type T. In addition, the language supports two kinds of user-defined types: functions and structures. A structure is a heterogeneous data aggregate, composed of zero or more fields. Each field is typed explicitly and associated with a mutability qualifier (let or var) that denotes whether it is constant or mutable.

    struct Pair {
      var fs: Int; var sn: Int
    } in
    var p: Pair = Pair(4, 2) in p

Fields of a structure can be of any type, but type definitions cannot be mutually recursive. Hence, all values have a finite representation.

All types have value semantics. Thus, all values form disjoint topological trees rooted at variables or constants. Further, all assignments copy the right operand and never create aliases, departing from the way aggregate data types typically behave in popular object-oriented programming languages, such as Python or Java.

    struct Pair { ... } in
    var p: Pair = Pair(4, 2) in
    var q: Pair = p in
    q.sn = 8 in
    p // p is equal to Pair(4, 2)
      // q is equal to Pair(4, 8)

Immutability applies transitively. All fields of a data aggregate assigned to a constant are also treated as immutable by the type system, regardless of their declaration.

    struct Pair { ... } in
    let p: Pair = Pair(4, 2) in
    p.sn = 8 in p // <- type error

Likewise, all elements of an array are constant if the array itself is assigned to a constant.

    struct Pair { ... } in
    let a: [Pair] = [Pair(4, 2), Pair(5, 3)] in
    a[0].sn = 8 in a // <- type error

Functions are declared with the keyword func followed by a list of typed parameters, a codomain, and a body. Functions are anonymous but are first-class values that can be assigned, passed as an argument, or returned from other functions. Arguments are evaluated eagerly and passed by copy. Further, functions are allowed to capture identifiers from their declaration environment. Captured values are also copied and stored in the function’s closure, thus preserving value independence. In the example below, the closure increments its own copy of foo, so the outer foo is unaffected.

    var foo: Int = 42 in
    var fn: () -> Int = () -> Int {
      foo = foo + 1 in foo
    } in
    let bar = fn() in
    bar // foo is equal to 42
        // bar is equal to 43

To implement part-wise in-place mutation across function boundaries, values of parameters annotated inout can be mutated by the callee. At an abstract level, an inout argument is copied when the function is called and copied back to its original location when the function returns (the Fortran enthusiast may think of the so-called “call-by-value/return” policy). At a more operational level, an inout argument is simply passed by reference. Of course, inout extends to multiple arguments, with one important restriction: overlapping mutations are prohibited to prevent any writeback from being discarded.

    struct Pair { ... } in
    struct U {} in
    let swap: (inout Int, inout Int) -> U
      = (a: inout Int, b: inout Int) -> U {
        let tmp = a in
        a = b in
        b = tmp in U()
      } in
    var p = Pair(4, 2) in
    _ = swap(&p.fs, &p.sn)
    in p // p is equal to Pair(2, 4)

A more exhaustive specification of MVSL, as well as more elaborate program examples, is available in the GitHub repository.

3 NATIVE IMPLEMENTATION

This section describes the strategy we implemented to compile MVSL to native code.

3.1 Memory representation

Int and Float are built-in numeric types that typically have a 1-to-1 correspondence with machine types. Since struct definitions cannot be mutually recursive, all values of a structure have a finite memory representation (more on that later). Therefore, they can be represented as a passive data structure (PDS), where each field is laid out in a contiguous fashion.

In the absence of first-class references, it is fairly easy to identify the lifetime of a value: it begins when the value is assigned to a variable and ends when said variable is reassigned or goes out of scope. Following this observation, an obvious choice to handle
memory is to rely on stack allocation to automate memory management.

A type is trivial if it denotes a number or a composition of trivial types (e.g., a pair of Ints). A variable of a trivial type represents a single memory block allocated on the stack, which does not require any particular operation to be initialized or deallocated.

Non-trivial types require more attention. In MVSL, arrays and closures require dynamic allocation, because the compiler is in general incapable of determining the number of elements in an array or the size of a closure from their signatures. Nonetheless, they can be represented as fixed-size data aggregates that point to heap-allocated memory. Furthermore, the aforementioned observation about lifetimes still holds. Hence, the compiler can generate code to reclaim dynamically allocated memory when variables holding arrays or functions go out of scope or are reassigned.

An array is represented by a pointer σ to a contiguous block of heap-allocated memory. The block is structured as a tuple ⟨r, n, k, e⟩ where r is a reference counter, n denotes the number of elements in the array, k denotes the capacity of the array’s payload (i.e., the size of its actual contents) and e is a payload of k bytes. The counter r serves to implement the so-called copy-on-write optimization (see Section 3.4.2). Figure 1 depicts the in-memory representation of an array of elements of some type T.

[Figure 1: In-memory representation of an array of T. A pointer σ on the stack refers to a heap block laid out as r, n, k, e_1, e_2, ..., e_n, where k = n × sizeof(T).]

The capacity k of an array typically differs from its number of elements n, because the former depends on the size of an element in memory. For example, an array of 16-bit integer values [42, 1337] can be represented by a tuple ⟨1, 2, 4, 42, 0, 57, 5⟩ (assuming a little-endian system, where 1337 = 0x0539 is stored as the bytes 57, 5). The array contains two elements, thus n = 2, yet its capacity k = 4, since each element occupies two bytes.

Closures use a PDS ⟨φ, ε, c, d⟩ where φ is a pointer to a function implementing the closure, ε is a pointer to the closure’s environment, and c and d are pointers to synthesized routines that respectively copy and destroy the closure (see Section 3.2).

The function pointed to by φ is obtained by defunctionalizing [6] the closure. This process transforms the closure into a global function in which all captured identifiers are lifted into an additional parameter for the closure’s environment.

3.2 Copying and destroying values

Recall that assignments result in copies of their right operand. Since MVSL is a statically typed language, the compiler knows how to copy values of fixed size. For trivial types, the operation consists of a mere bitwise copy of the right operand.

The situation is a bit more delicate for non-trivial types. For arrays, a first issue is that the size of their heap-allocated storage cannot be determined statically. Instead, it depends on the value of k in the PDS that represents the array. A second issue is that copying may involve additional operations if the elements contained in the array are dynamically sized as well. In this case, a bitwise copy would improperly create aliases on the heap-allocated memory, breaking value independence. Instead, each element must be copied individually, allocating new memory as necessary.

One solution is to synthesize a function for each data type that is applied whenever a copy should occur. If the type is trivial (i.e., it does not involve any dynamic allocation), then this function is equivalent to a bitwise copy. Otherwise, it implements the appropriate logic, calling the copy function of each contained element.

Similarly, the logic implementing the destruction of a value can be synthesized into a destructor. If the type is trivial, then this destructor is a no-op. Otherwise, it recursively calls the destructor of each contained element and frees the memory allocated for all values being destroyed.

3.3 Crossing function boundaries

At function boundaries, PDSs are exploded into scalar arguments and passed directly through registers, provided the machine has enough of them. If the structure is too large, it is passed as a pointer to a stack cell in the caller’s context, in which a copy of the argument will have been stored before the call.

An inout argument is passed as a (possibly interior) pointer. If it refers to a local variable or one of its fields, then it is passed as a pointer to the stack. If it refers to the element of an array, then it is passed as a pointer within the array’s storage.

Note: the compiler can guarantee that the pointer never outlives its pointee, because the language does not allow the pointer to escape in any way. In fact, the value of the pointer itself is not accessible. The callee can only dereference it, either to store or load a value. Further, recall that the type system guarantees exclusive access to any memory location. Hence, pointers representing inout arguments are known to be unique.

3.4 Avoiding unnecessary copies

The implementation we have described so far generates a fair amount of memory traffic, as copies are created every time a value is assigned to a variable or passed as an argument. Much of this traffic is unnecessary, though, because most original values are destroyed immediately after being copied, or because copied values might never be mutated and could have been shared. We now briefly discuss three techniques to eliminate unnecessary copies.

3.4.1 Move semantics. A recurring pattern is to assign values just after they have been created. For example, consider the expression let x: [Int] = [1, 2] in f(x). The value of the array is assigned directly after its creation.

A naive implementation will evaluate the right operand, resulting in the creation of a new array value, copy this value to assign x, and then destroy the original. Clearly, the copy is useless, since the original value will never be used. Hence, one can move the temporary value into the variable rather than copying it.

Moving a value boils down to a bitwise copy. We said earlier that such a strategy was incorrect in the case of an array because it would create aliases. In this particular case, however, the other alias is discarded immediately and therefore the variable remains independent once the assignment is completed.
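The difference between the naive assignment and the move can be sketched with a small model. The snippet below is only an illustration written in Python rather than MVSL or native code; the names Storage, ArrayValue, assign_naive, and assign_moving are hypothetical and do not come from the MVSL compiler. It counts payload allocations to show that moving a temporary avoids the extra one.

```python
class Storage:
    """Hypothetical stand-in for a heap-allocated array payload."""
    allocations = 0  # number of payloads allocated so far

    def __init__(self, elements):
        Storage.allocations += 1
        self.elements = list(elements)


class ArrayValue:
    """Fixed-size handle (the 'pointer') to a heap payload."""

    def __init__(self, storage):
        self.storage = storage

    def copy(self):
        # Deep copy: allocate a fresh payload so both values stay independent.
        return ArrayValue(Storage(self.storage.elements))

    def move(self):
        # Move: transfer the handle as-is and invalidate the source.
        moved = ArrayValue(self.storage)
        self.storage = None
        return moved


def assign_naive(temporary):
    """Copy the temporary into the variable, then discard the temporary."""
    variable = temporary.copy()
    temporary.storage = None  # models destroying the original
    return variable


def assign_moving(temporary):
    """Move the temporary into the variable; no new payload is allocated."""
    return temporary.move()


Storage.allocations = 0
x = assign_naive(ArrayValue(Storage([1, 2])))
naive_allocations = Storage.allocations

Storage.allocations = 0
y = assign_moving(ArrayValue(Storage([1, 2])))
moving_allocations = Storage.allocations
```

In this model the naive path allocates two payloads for one array, while the moving path allocates one. Both variables end up as the sole owner of their storage, which is the independence property the text relies on.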
A similar situation occurs when arguments are being copied. In the expression let x: [Int] = [1, 2] in f(x), x must be copied before it is passed as an argument to the function f. However, because the remainder of the expression does not mention x anymore, this copy can be elided and the value of x can be moved into the function.

3.4.2 Copy-on-write. Copies of immutable values to immutable bindings can obviously be elided. Indeed, aliasing is harmless in the absence of mutation, and we can simulate value semantics on top of shared immutable states. In contrast, assigning a mutable value to an immutable binding or vice versa typically requires a copy, because the value might be mutated later. Similarly, assigning a mutable value to a mutable binding also requires a copy.

Nonetheless, it is possible that neither the original nor the copy ends up being mutated, perhaps because the mutation depends on a condition that is evaluated at runtime. In this case, unfortunately, the compiler must conservatively assume that a mutation will occur and perform a copy to preserve value independence.

One simple mechanism can be used to work around this apparent shortcoming: copy-on-write. Copy-on-write leverages runtime knowledge to delay copies until they are actually needed. Heap-allocated storage is associated with a counter that keeps track of the number of pointers to that storage. Every time a value is copied, an alias is created and the counter is incremented. The value of this counter is checked when mutation actually occurs, at runtime. If it is greater than one, the counter is decremented, the storage is duplicated, and the mutation is performed on the copy. Otherwise, the mutation is performed on the original.

The counter is decremented whenever the destructor of a value referring to the associated storage is called. If it reaches zero, then the contents of the storage are destroyed and deallocated.

3.4.3 Leveraging local reasoning. We cited O’Hearn et al. [4] in the introduction to emphasize the importance of local reasoning for human developers. We add that local reasoning is also an invaluable tool for automated program optimizations, as it eliminates the need for conservative assumptions about the use of memory. In particular, one can easily identify and discard irrelevant mutations, because one can assume those cannot be observed elsewhere.

    1 struct Pair { ... } in
    2 var p: Pair = Pair(4, 2) in
    3 let q: Pair = p in
    4 p.fs = 8 in
    5 Pair(p.sn, q.fs)

Consider the above program. Thanks to local reasoning, an optimizer can safely discard the assignment to p.fs at line 4, because its effect is never observed. Without this assignment, it becomes clear that p and q are the exact same value, and the former’s copy can be elided. Eventually, constant propagation will deduce that the program is equivalent to the expression Pair(2, 4).

Note: such optimizations are fairly standard in off-the-shelf optimizers. Our own implementation simply relies on the default optimization passes of LLVM [3].

4 MANAGED ENVIRONMENTS

We did not discuss any strategy to execute MVSL in managed runtime environments, where stack allocation and interior pointers are typically unavailable. We note that the strategies we have presented in Section 3.4 are applicable nonetheless. Move assignments can be replaced by merely copying references, copy-on-write can operate similarly in a managed environment, and local reasoning enables the same kind of optimizations.

In addition, copies of large immutable structures can be avoided by memoizing them in a uniqueness table, segregated by data types for efficient lookup. Intuitively, memoization should be particularly beneficial for programs that often test for equality.

One important challenge relates to inout arguments. A naive solution consists of boxing every field and every element into a distinct object. Unfortunately, this approach is likely to be inefficient, due to the loss of cache locality. A cleverer strategy could represent inout parameters as writeable keypaths (i.e., closures allowing write access to a specific path in a data structure). We leave further investigation on that front to future work.

5 CONCLUSION

We present an approach to compile programming languages featuring mutable value semantics into native code. We rely heavily on stack allocation to implement static garbage collection, and insert calls to synthesized destructors to deallocate dynamically sized values automatically. Furthermore, we leverage copy-on-write to elide unnecessary memory traffic at runtime.

REFERENCES

[1] Dave Clarke, Johan Östlund, Ilya Sergey, and Tobias Wrigstad. 2013. Ownership Types: A Survey. In Aliasing in Object-Oriented Programming. Types, Analysis and Verification, Dave Clarke, James Noble, and Tobias Wrigstad (Eds.). Lecture Notes in Computer Science, Vol. 7850. Springer, New York, NY, 15–58. https://doi.org/10.1007/978-3-642-36946-9_3
[2] Philipp Haller and Martin Odersky. 2010. Capabilities for Uniqueness and Borrowing. In European Conference on Object-Oriented Programming (Lecture Notes in Computer Science, Vol. 6183), Theo D’Hondt (Ed.). Springer, New York, NY, 354–378. https://doi.org/10.1007/978-3-642-14107-2_17
[3] Chris Lattner and Vikram S. Adve. 2004. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In International Symposium on Code Generation and Optimization. IEEE, San Jose, CA, USA, 75–88. https://doi.org/10.1109/CGO.2004.1281665
[4] Peter W. O’Hearn, John C. Reynolds, and Hongseok Yang. 2001. Local Reasoning about Programs that Alter Data Structures. In Computer Science Logic (Lecture Notes in Computer Science, Vol. 2142), Laurent Fribourg (Ed.). Springer, New York, NY, 1–19. https://doi.org/10.1007/3-540-44802-0_1
[5] Melissa E. O’Neill. 2009. The Genuine Sieve of Eratosthenes. Journal of Functional Programming 19, 1 (2009), 95–106. https://doi.org/10.1017/S0956796808007004
[6] John C. Reynolds. 1998. Definitional Interpreters for Higher-Order Programming Languages. Higher-Order and Symbolic Computation 11, 4 (1998), 363–397. https://doi.org/10.1023/A:1010027404223
[7] Jeremy Siek, Lie-Quan Lee, and Andrew Lumsdaine. 2002. The Boost Graph Library: User Guide and Reference Manual. Addison-Wesley Longman Publishing Co., Inc., USA.
[8] Alexander A. Stepanov and Daniel E. Rose. 2014. From Mathematics to Generic Programming (1st ed.). Addison-Wesley Professional, Boston, MA.
[9] Mads Tofte, Lars Birkedal, Martin Elsman, and Niels Hallenberg. 2004. A Retrospective on Region-Based Memory Management. Higher-Order and Symbolic Computation 17, 3 (2004), 245–265. https://doi.org/10.1023/B:LISP.0000029446.78563.a4
[10] Jonathan Turner. 2017. Rust 2017 Survey Results. https://blog.rust-lang.org/2017/09/05/Rust-2017-Survey-Results.html. [Online; accessed 08-April-2021].
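As a supplementary illustration of the copy-on-write mechanism described in Section 3.4.2, the following minimal sketch models reference-counted storage in Python rather than in MVSL or native code. The class and method names (Storage, CowArray, set, destroy, shares_storage_with) are hypothetical and are not taken from the MVSL compiler; the sketch only mirrors the counter rules stated in the text.

```python
class Storage:
    """Hypothetical heap block: a reference counter plus the payload."""

    def __init__(self, elements):
        self.refcount = 1
        self.elements = list(elements)


class CowArray:
    """Array value whose copies share storage until one of them is mutated."""

    def __init__(self, elements=(), _storage=None):
        self._storage = _storage if _storage is not None else Storage(elements)

    def copy(self):
        # Copying only creates an alias and increments the counter.
        self._storage.refcount += 1
        return CowArray(_storage=self._storage)

    def set(self, index, value):
        # The counter is checked at runtime, when mutation actually occurs.
        if self._storage.refcount > 1:
            # Shared: release our alias, duplicate the storage, mutate the duplicate.
            self._storage.refcount -= 1
            self._storage = Storage(self._storage.elements)
        self._storage.elements[index] = value

    def get(self, index):
        return self._storage.elements[index]

    def destroy(self):
        # Models the synthesized destructor: drop one alias, free at zero.
        self._storage.refcount -= 1
        if self._storage.refcount == 0:
            self._storage.elements = None
        self._storage = None

    def shares_storage_with(self, other):
        return self._storage is other._storage


p = CowArray([4, 2])
q = p.copy()                               # no duplication yet
shared_before = p.shares_storage_with(q)
q.set(1, 8)                                # first mutation triggers the real copy
shared_after = p.shares_storage_with(q)
```

Here the copy is deferred until q is mutated, after which p and q no longer share storage and p still holds its original elements. A value that is copied but never mutated would never pay for the duplication, which is the saving the text attributes to runtime knowledge.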