
Lecture24 26 Runtime Environment 2

The document discusses the runtime environment in compiler design, focusing on memory management, storage allocation, and the interaction between the compiler, operating system, and hardware. It covers various aspects such as stack allocation, heap management, garbage collection, and the structure of activation records. Additionally, it addresses how high-level programming constructs are represented in memory and the challenges associated with encoding different data types and functions.

CMPE 152

Compiler Design

Lecture 23 - 25
Ahmed Ezzat

Runtime Environment

Outline

⚫ Runtime Environment
⚫ Implementing Objects
⚫ Implementing Dynamic Type Checking
⚫ Summary


Runtime Environment



Runtime Environment Overview

⚫ What are the issues:
❑ What do your code and data look like during execution?
❑ Interaction among the compiler, the OS, and the target machine H/W
❑ The two main themes:
– Allocation of storage locations
– Memory management and access to variables and data

Runtime Environment Overview

⚫ Memory Management:
❑ Stack allocation
❑ Heap management: the heap is the portion of the store used for data that lives indefinitely
❑ Garbage collection: the process of finding space within the heap that is no longer used and reallocating it to other data items




Runtime Environment Overview:
How Does a Program Run?

⚫ Where to put code and data?
⚫ How to access code and data?
⚫ How to pass parameters to a procedure or function?
⚫ How to return results?


Runtime Environment Overview

Typical Storage Allocation:

Figure: memory layout showing, in order, Code, Static Data, Heap, Free Memory, and Stack.

Stack Allocation:
❑ For managing procedure calls and interrupt handling
❑ The stack grows with each call and shrinks with each procedure return/termination
❑ Each procedure call pushes an activation record (stack frame) onto the stack. Each return from a procedure removes its record (stack frame) from the stack

Runtime Environment Overview

⚫ Code Generation:
❑ Calling sequence
– Code that allocates activation record / stack frame
– Code for entering information in it
❑ Return sequence
– Code to restore the state of the machine



Runtime Environment Overview

⚫ Program-related definitions:
❑ The entire program is a procedure (main)
❑ A function is a procedure that returns a value; a plain procedure takes input but returns no value
❑ Procedure calls can be nested
❑ A procedure has a name, a body, and some parameters:
❖ Formal parameters: those in the definition
❖ Actual parameters: those passed at the call (the arguments)


Runtime Environment Overview

❑ A procedure activation: one execution of the procedure
❑ Lifetime of an activation: the sequence of steps from the first statement of the procedure body to the last
❑ Recursive: a new activation begins before an earlier activation of the same procedure has ended
❑ Activation tree: each node is an activation of a procedure; the root is main




Runtime Environment:
Control Stack and Scope

⚫ Control stack: one way to implement procedure activations
⚫ Scope: the portion of a program to which a name declaration applies:
– Local name: declared within the procedure
– Non-local name: declared outside the procedure but usable inside it


Runtime Environment:
Storage Organization

⚫ How a program is laid out in memory
– Target code
– Data objects {static, dynamic (Heap)}
– Control stack
⚫ Typical layout
– Code
– Data (static)
– Stack
– …
– Heap (dynamic)

Figure: memory layout showing, in order, Code, Static Data, Heap, Free Memory, and Stack.


Runtime Environment:
Code Segment

⚫ Actual machine instructions
– Arithmetic / logical instructions
– Comparison instructions
– Branches (short distance)
– Jumps (long distance)
– Load/store registers
– Data movements
– Constant manipulations
⚫ Code segment is often write-protected (R-X), so running code cannot overwrite itself


Runtime Environment:
Data Segment

⚫ The data objects whose size is known at compile time
⚫ The data objects whose lifetime is the full run of the program, not just a single function invocation


Runtime Environment:
Two Types of Data in the Data Segment

⚫ Static Data: things that won't change, so they can be write-protected
– E.g., string literals, arithmetic literals
#define MYNAME "Ahmed Ezzat"
⚫ Global Data: global variables
– E.g., "static variables" in C, whose value is retained from one function invocation to the next


Runtime Environment:
Heap: Dynamic Data

⚫ Data created by malloc() or new
⚫ Size not known at compile time
⚫ Heap data lives until it is de-allocated or until the program exits
⚫ Garbage collection: automatically de-allocates data that is no longer used


Runtime Environment:
Runtime Stack

⚫ Data used for function invocation is saved on the stack (in a stack frame)
⚫ Basic idea: when a call occurs, everything about the current activation is saved on the stack
– Register values, program counter, data
⚫ When a function returns, the data saved on the stack is restored
Runtime Environment Linking

⚫ From re-locatable code to executable
⚫ One or more .o files get linked into a single .exe file
⚫ All external symbols get resolved
⚫ Code and data of each .o get relocated into the one big address space of the .exe file, starting at 0


Runtime Environment Linking

⚫ Static linking: all code gets copied into one big .exe file (including all library files)
⚫ Dynamic linking: some external routines (mostly library routines) are not copied into the .exe; they are loaded only when called




Runtime Environment Loading

⚫ Loads the .exe into memory
⚫ Code and data get relocated relative to the address at which the lowest address of the code segment is loaded into memory
⚫ Operating System: handles memory management
⚫ Computer Architecture: actual execution of instructions


Runtime Environment Profilers

⚫ A profiler gathers information about the execution of a program:
– Implemented similarly to a debugger
– Adds instructions to the target program that call counters
– May use statistical sampling
⚫ Stop execution periodically and see which function is running


An Important Duality

● Programming languages contain high-level structures:
● Functions
● Objects
● Exceptions
● Dynamic typing
● Lazy evaluation
● (etc.)
● The physical computer only operates in terms of several primitive operations:
● Arithmetic
● Data movement
● Control jumps
Runtime Environment
● We need to come up with a representation of these high-
level structures using the low-level structures of the
machine.
● A runtime environment is a set of data structures
maintained at runtime to implement these high-level
structures.
● e.g., the stack, the heap, static area, virtual function
tables, etc.
● Strongly depends on the features of both the source and target language (e.g., compiler vs. cross-compiler).
● Our IR generator will depend on how we set up our runtime environment.
The Decaf* Runtime Environment
● Need to consider:
● What do objects look like in memory?
● What do functions look like in memory?
● Where in memory should they be placed?
● There are no right answers to these questions:
● Many different options and tradeoffs.
● We will see several approaches.

* Decaf: a small, Java-like object-oriented teaching language used in compiler courses

Data Representations

● What do different data types look like in memory?
● Machine typically supports only limited types:
● Fixed-width integers: 8-bit, 16-bit, 32-bit, signed, unsigned, etc.
● Floating point values: 32-bit, 64-bit, 80-bit IEEE 754.
● How do we encode (define) our object types using these types?


Encoding Primitive Types

● Primitive integral types (byte, char, short, int, long, unsigned, uint16_t, etc.) typically map directly to the underlying machine type.
● Primitive real-valued types (float, double, long double) typically map directly to the underlying machine type.
● Pointers are typically implemented as integers holding memory addresses.
● Size of integer depends on machine architecture; hence 32-bit compatibility mode on 64-bit machines.


Encoding Arrays

• Dynamic Arrays: C++ vector, ArrayList in Java, etc.




Encoding Multidimensional Arrays

⚫ Often represented as an array of arrays. Shape depends on the array type used.
⚫ C-style arrays:
int a[3][2];

a[0][0] a[0][1] a[1][0] a[1][1] a[2][0] a[2][1]

Array of size 2 Array of size 2 Array of size 2

How do you know where to look for an element in an array like this?
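One way to answer the question on this slide is row-major addressing, which is how C lays out `int a[3][2]`. The sketch below is only an illustration: the helper name `row_major_offset` and the 4-byte element size are assumptions, not part of the lecture.

```python
def row_major_offset(i, j, num_cols, elem_size=4):
    """Byte offset of a[i][j] from the start of a row-major 2-D array.

    elem_size=4 assumes a 4-byte int, which is typical but not guaranteed.
    """
    return (i * num_cols + j) * elem_size


# int a[3][2]: 3 rows of 2 columns each
for i in range(3):
    for j in range(2):
        print(f"a[{i}][{j}] is at byte offset {row_major_offset(i, j, num_cols=2)}")
```

Because each row is a contiguous "array of size 2", skipping `i` rows means skipping `i * 2` elements; the column index is then added within the row.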




Encoding Multidimensional Arrays

⚫ Often represented as an array of arrays.
⚫ Shape depends on the array type used.
⚫ Java-style arrays:
int[][] a = new int[3][2];

Figure: the outer array (length 3) holds references a[0], a[1], and a[2]; each refers to a separate inner array (length 2) holding a[i][0] and a[i][1].
Encoding Functions

● Many questions to answer:
● What does the dynamic execution of functions look like?
● Where is the executable code for functions located?
● How are parameters passed in and out of functions?
● Where are local variables stored?
● The answers strongly depend on what the language supports.


Review: The Stack

● Function calls are often implemented using a stack of activation records (or stack frames).
● Calling a function pushes a new activation record onto the stack.
● Returning from a function pops the current activation record from the stack.
● Questions:
● Why does this work?
● Does this always work?


Activation Trees

● An activation tree is a tree structure representing all of the function calls (the flow of execution) made by a program on a particular execution.
● Depends on the runtime behavior of a program; can't always be determined at compile-time.
● (The static equivalent is the call graph).
● Each node in the tree is an activation record.
● Each activation record stores a control link to the activation record of the function that invoked it.


Activation Trees

⚫ Activation Tree:
❑ Model procedure activations
❑ The main() is the root
❑ Children of the same parent are executed in
sequence from left to right
❑ Sequence of procedure calls → preorder (Root →
left subtree → right subtree) traversal of activation
tree
❑ Sequence of procedure returns → postorder (left
subtree → right subtree → Root) traversal of
activation tree
Activation Records

⚫ Activation Records:
❑ What is pushed onto the stack for each procedure activation / call
❑ Contents vary with the language being implemented

* Control Link: a pointer to the activation record of the function that invoked it



Activation Trees

int main() {
    Fib(3);
}

int Fib(int n) {
    if (n <= 1) return n;
    return Fib(n - 1) + Fib(n - 2);
}

Activation tree:
main
  Fib (n = 3)
    Fib (n = 2)
      Fib (n = 1)
      Fib (n = 0)
    Fib (n = 1)
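The activation tree for Fib(3) can be observed by logging each call and each return. The sketch below is a Python re-creation of the slide's Fib; the `calls` and `returns` lists are illustrative additions used only to record the order of events.

```python
calls, returns = [], []


def fib(n):
    calls.append(n)      # a call starts a new activation (preorder visit)
    result = n if n <= 1 else fib(n - 1) + fib(n - 2)
    returns.append(n)    # a return ends that activation (postorder visit)
    return result


value = fib(3)
print("value:", value)
print("call order (n):", calls)
print("return order (n):", returns)
```

The call order is a preorder traversal of the tree above and the return order is a postorder traversal, matching the traversal rules stated on the earlier Activation Trees slide.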


Activation Trees

An activation tree is a spaghetti stack.

The runtime stack is an optimization of this spaghetti stack.


Assumption

● Once a function returns, its activation record cannot be referenced again.
● We don't need to store old nodes in the activation tree.
● Every activation record has either finished executing or is an ancestor of the current activation record.


Breaking an Assumption

⚫ Assumption: "Once a function returns, its activation record cannot be referenced again."
⚫ Any ideas on how to break this?
⚫ Closure: a block of code that may capture variables from its surrounding scope
⚫ Closures break the above assumption

* Closure: the combination of a function bundled together (enclosed) with references to its surrounding state. That surrounding state can still be accessed after the enclosing function has returned!
Breaking the Assumption: Example

● "Once a function returns, its activation record cannot be referenced again."
● Closure
function CreateCounter() {
    var counter = 0;
    // The return statement hands the inner function back to the caller of
    // CreateCounter(), which can then call it and get back the counter value.
    return function() {
        counter++;
        return counter;
    }
}
Closures

function CreateCounter() {
    var counter = 0;
    return function() {
        counter++;
        return counter;
    }
}
function MyFunction() {
    f = CreateCounter(); // returns a function
    print(f());          // prints the counter value
    print(f());
}
* On return from CreateCounter(), its activation record, which holds the counter value, would normally be gone, yet the code above can still access the counter variable across calls to f()!
Closures

function CreateCounter() {
    var counter = 0;
    return function() {
        counter++; return counter;
    }
}
function MyFunction() {
    f = CreateCounter();
    print(f()); // counter = 1
    print(f()); // counter = 2
}

Output:
> 1
> 2

Figure: the activation record of MyFunction holds f = <fn>; the activation record of CreateCounter, where counter has gone from 0 to 2, stays alive because <fn> still refers to it.
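The same counter can be written in a language that supports closures directly. The following is a minimal Python sketch of the slide's CreateCounter; the snake_case names are adaptations. Python keeps captured variables in heap-allocated cells rather than in the stack frame, which is one way a runtime can let them outlive the call that created them.

```python
def create_counter():
    counter = 0  # local to create_counter, but captured by the inner function

    def increment():
        nonlocal counter  # refer to the captured variable, not a new local
        counter += 1
        return counter

    return increment  # the caller receives the function plus its captured state


f = create_counter()
print(f())
print(f())
```

Each call to `create_counter` produces an independent counter, because each activation captures its own `counter` variable.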
Control and Access Links

● The control link of a function is a pointer to the activation record of the function that called it.
● Used to determine where to resume execution after the function returns.
● The access link of a function is a pointer to the activation record in which the function was created.
● Used by nested functions to determine the location of variables from the outer scope.


So What?

● Even a concept as fundamental as "the stack" is actually quite complex.
● When designing a compiler or programming language, you must keep in mind how your language features influence the runtime environment.
● Always be critical of the languages you use!


Functions in Decaf

● We use an explicit runtime stack.
● Each activation record needs to hold:
● All of its parameters.
● All of its local variables.
● All temporary variables introduced by the IR generator (more on that later).


Parameters Passing Approaches

● Two common approaches.
● Call-by-value:
● Parameters are copies of the values specified as arguments.
● Call-by-reference:
● Parameters are pointers to the values specified as arguments.


Summary of Function Calls

● The runtime stack is an optimization of the activation tree spaghetti stack.
● Most languages use a runtime stack, though certain language features prohibit this optimization.
● Activation records logically store a control link to the calling function and an access link to the function in which the activation record was created.
● Call-by-value and call-by-reference can be implemented using copying and pointers, respectively.
Implementing Objects



Objects are Hard

● It is difficult to build an expressive and efficient object-oriented language.
● Certain concepts are difficult to implement efficiently:
● Dynamic dispatch (virtual functions)
● Interfaces
● Multiple Inheritance
● Dynamic type checking (i.e., instanceof)
● Interfaces are especially tricky to get right.


Encoding C-Style structs



Accessing Fields

● Once an object is laid out in memory, it's just a series of bytes.
● How do we know where to look to find a particular field?

Figure: object layout of 4 bytes, 1 byte, 3 bytes (padding), 8 bytes.

● Idea: Keep an internal table inside the compiler containing the offsets of each field.
● To look up a field, start at the base address of the object and advance forward by the appropriate offset.
Field Lookup

struct MyStruct {
    int x;
    char y;
    double z;
};

Layout: x (4 bytes) | y (1 byte) | padding (3 bytes) | z (8 bytes)

MyStruct* ms = new MyStruct;
ms->x = 137;    store 137 0 bytes after ms
ms->y = 'A';    store 'A' 4 bytes after ms
ms->z = 2.71;   store 2.71 8 bytes after ms
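The compiler's internal offset table can be sketched as follows. This is an illustrative Python model, not how any particular compiler is written: the `layout` helper is hypothetical, and the sizes and alignments (int 4/4, char 1/1, double 8/8) are assumptions that match the slide's figure and many common platforms.

```python
def layout(fields):
    """Give each (name, size, alignment) field the next suitably aligned offset.

    Returns the offset table and the total size, padded to the largest alignment.
    """
    offsets, offset, max_align = {}, 0, 1
    for name, size, align in fields:
        offset = (offset + align - 1) // align * align  # round up to alignment
        offsets[name] = offset
        offset += size
        max_align = max(max_align, align)
    total = (offset + max_align - 1) // max_align * max_align
    return offsets, total


# struct MyStruct { int x; char y; double z; } under the assumed sizes
MY_STRUCT = [("x", 4, 4), ("y", 1, 1), ("z", 8, 8)]
offsets, size = layout(MY_STRUCT)
print(offsets, size)
```

The 3 padding bytes in the figure appear because `z` must start at a multiple of its assumed 8-byte alignment.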
OOP Without Methods

● Consider the following Decaf code:


class Base {
int x;
int y;
}
class Derived extends Base {
int z;
}
● What will Derived look like in memory?



Memory Layout with Inheritance

Field Lookup With Inheritance
class Base {
    int x;
    int y;
};
class Derived extends Base {
    int z;
};

Layout of Base: x (4 bytes) | y (4 bytes)
Layout of Derived: x (4 bytes) | y (4 bytes) | z (4 bytes)

Base ms = new Base;
ms.x = 137;   store 137 0 bytes after ms
ms.y = 42;    store 42 4 bytes after ms
Field Lookup With Inheritance

class Base {
    int x;
    int y;
};
class Derived extends Base {
    int z;
};

Layout of Base: x (4 bytes) | y (4 bytes)
Layout of Derived: x (4 bytes) | y (4 bytes) | z (4 bytes)

Base ms = new Derived;
ms.x = 137;   store 137 0 bytes after ms
ms.y = 42;    store 42 4 bytes after ms
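The reason the same store offsets work for both `new Base` and `new Derived` is that the derived layout starts with an exact copy of the base layout. A small Python sketch of that rule, assuming 4-byte fields with no padding as in the figure (the helper `extend_layout` is hypothetical):

```python
def extend_layout(base_offsets, base_size, new_fields, field_size=4):
    """Append new fields after the inherited ones; inherited offsets never move."""
    offsets = dict(base_offsets)
    size = base_size
    for name in new_fields:
        offsets[name] = size
        size += field_size
    return offsets, size


base_offsets, base_size = extend_layout({}, 0, ["x", "y"])
derived_offsets, derived_size = extend_layout(base_offsets, base_size, ["z"])
print("Base:   ", base_offsets, base_size)
print("Derived:", derived_offsets, derived_size)
```

Code compiled against the static type Base only ever uses the offsets of `x` and `y`, so it does not need to know whether the object is actually a Derived.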


this is tricky

● Inside a member function, the name this refers to the current receiver object.
● This information (pun intended) needs to be communicated into the function.
● Idea: Treat this as an implicit first parameter.
● Every n-argument member function is really an (n+1)-argument function whose first parameter is the this pointer.


this is Clever

class MyClass {
    int x;
    void myFunction(int arg) {
        this.x = arg; // x belongs to MyClass
    }
}

MyClass m = new MyClass;
m.myFunction(137);


this is Clever

class MyClass {
    int x;
}
// myFunction() belongs to MyClass but is implemented outside the class declaration
void MyClass::myFunction(MyClass this, int arg) {
    this.x = arg; // x belongs to MyClass
}

MyClass m = new MyClass;
MyClass::myFunction(m, 137);


this Rules

● When generating code to call a member function, remember to pass some object as the this parameter representing the receiver object.
● Inside of a member function, treat this as just another parameter to the member function.
● When implicitly referring to a field of this, use this extra parameter as the object in which the field should be looked up.
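Python happens to make the receiver parameter explicit, so it can serve as a quick illustration of these rules. In the sketch below, `self` plays the role of the slide's `this`, and the snake_case names are adaptations of the slide's MyClass example.

```python
class MyClass:
    def __init__(self):
        self.x = 0

    def my_function(self, arg):  # 'self' is the explicit receiver parameter
        self.x = arg


m = MyClass()
m.my_function(137)            # method-call syntax: the receiver is passed for us

n = MyClass()
MyClass.my_function(n, 137)   # same function, receiver passed explicitly
print(m.x, n.x)
```

Both calls run the same two-parameter function; the method-call syntax is only a shorthand for passing the receiver as the first argument.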


Implementing Dynamic Dispatch

● A class X can be a subtype of a chain of classes, each of which implements its own version of a function func(). When func() is invoked on an object of class X, which version executes?
● Dynamic dispatch means calling a function at runtime based on the dynamic type of an object, rather than its static type.
● How do we set up our runtime environment so that we can efficiently support this?
An Initial Idea

● At compile-time, get a list of every defined class.
● To compile a dynamic dispatch, emit IR code for the following logic:

if (the object is type A)
    call A's version of the function
else if (the object is type B)
    call B's version of the function
...
else if (the object is type N)
    call N's version of the function.
Analyzing our Approach


This previous idea has several serious problems.

What are they?
● It's slow.
● Number of checks is O(C), where C is the number of classes the dispatch might refer to.
● Gets slower the more classes there are.
● It's infeasible in most languages:
● What if we link across multiple source files?
● What if we support dynamic class loading?
An Observation

● When laying out fields in an object, we gave every field an offset.
● Derived classes have the base class fields in the same order at the beginning.

Layout of Base: Base.x | Base.y
Layout of Derived: Base.x | Base.y | Derived.z

● Can we do something similar with functions?


Virtual Function Tables

Figure: vtable diagrams in which entries point to the code for Base.sayHi and the code for Derived.sayHi.

● A virtual function table (or vtable) is an array of pointers to the member function implementations for a particular class.
● To invoke a member function:
● Determine (statically) its index in the vtable.
● Follow the pointer at that index in the object's vtable to the code for the function.
● Invoke that function.
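The three steps above can be sketched with plain lists standing in for vtables. This is a simplified Python model under stated assumptions: `sayHi` comes from the slides, while the `clone` method, the slot constants, and the dictionary-based objects are hypothetical additions for illustration.

```python
def base_say_hi(this):
    return "Hi from Base"


def derived_say_hi(this):
    return "Hi from Derived"


def base_clone(this):
    return "Base clone"


# Step 1: each method's index in the vtable is fixed statically.
SAY_HI, CLONE = 0, 1

BASE_VTABLE = [base_say_hi, base_clone]
# Derived overrides sayHi (slot 0) and inherits clone (slot 1) unchanged.
DERIVED_VTABLE = [derived_say_hi, base_clone]


def new_object(vtable):
    return {"vtable": vtable}


def invoke(obj, slot):
    """Steps 2 and 3: follow the pointer at that index, then call it."""
    return obj["vtable"][slot](obj)


b = new_object(BASE_VTABLE)
d = new_object(DERIVED_VTABLE)
print(invoke(b, SAY_HI))
print(invoke(d, SAY_HI))
print(invoke(d, CLONE))
```

The caller never tests the object's type; it uses the same slot number for every object, and the table chosen at object creation decides which code runs.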


Analyzing our Approach

● Advantages:
● Time to determine function to call is O(1)
(which is good!)
● What are the disadvantages?
● Object sizes are larger:
Each object needs to have space for O(M) pointers.
● Object creation is slower:
Each new object needs to have O(M) pointers set,
where M is the number of member functions.
A Common Optimization

Figure: the class descriptor holds one entry for class Base and one for class Derived; each entry contains that class's vtable.


Objects in Memory

Figure: Obj1 and Obj2 of Base each hold their own state plus a pointer to the class descriptor of Base; Obj1 and Obj2 of Derived likewise each point to the class descriptor of Derived.


A General Inheritance Framework

● Each object stores a pointer to a class descriptor for its class.
● Each class descriptor stores:
● A pointer to the base class state.
● A pointer to a method lookup table.
● To invoke a method:
● Follow the pointer to the method table.
● If the method exists, call it.
● Otherwise, navigate to the base class and repeat.
● This is slow but can be optimized in many cases; we'll see this later.
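The lookup loop described above might look like the following Python sketch. It assumes the base-class pointer refers to the base class's descriptor; the `ClassDescriptor` type, the `find_method` helper, and the `describe` method are all hypothetical names introduced for the example.

```python
class ClassDescriptor:
    def __init__(self, name, base=None, methods=None):
        self.name = name
        self.base = base              # base class descriptor, or None at the root
        self.methods = methods or {}  # method lookup table


def find_method(descriptor, method_name):
    """Walk from the object's class toward the root until the method is found."""
    while descriptor is not None:
        if method_name in descriptor.methods:
            return descriptor.methods[method_name]
        descriptor = descriptor.base
    raise AttributeError(method_name)


BASE = ClassDescriptor("Base", methods={
    "sayHi": lambda this: "Hi from Base",
    "describe": lambda this: "I am an object",
})
DERIVED = ClassDescriptor("Derived", base=BASE, methods={
    "sayHi": lambda this: "Hi from Derived",
})

obj = {"cls": DERIVED}  # every object points to its class descriptor
print(find_method(obj["cls"], "sayHi")(obj))
print(find_method(obj["cls"], "describe")(obj))
```

The cost of a call grows with the distance to the class that defines the method, which is the slowness the slide mentions.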
Vtables and Interfaces

Figure: example vtables for classes that implement interfaces.


Interfaces with Vtables

● Interfaces complicate vtable layouts because they require interface methods to have consistent positions across all vtables.
● This can fill vtables with useless entries (refer to the Null entry in the Paint vtable).
● For this reason, interfaces are typically not implemented using pure vtables.
● We'll see two approaches for implementing interfaces efficiently.
Implementing Dynamic
Type Checking



Dynamic Type Checks

● Many languages require some sort of dynamic type checking.
● Java's instanceof, C++'s dynamic_cast, any dynamically-typed language.
● May want to determine whether the dynamic type is convertible to some other type, not just whether the types are equal.
● How can we implement this?


A Pretty Good Approach

Figure: class hierarchy with A at the root, B and C below it, and D and E at the bottom level.


Simple Dynamic Type Checking

Figure: class hierarchy with A at the root, B and C below it, and D and E at the bottom level.

● Have each object's vtable store a pointer to its parent class (its immediate base class). For an object of class E, following these pointers eventually reaches the root class A.
● To check if an object is convertible to type S at runtime, follow the parent pointers embedded in the object's vtable upward until we find S or reach a type with no parent.
● Runtime is O(d), where d is the depth of the class in the hierarchy.
● Can we make this faster?
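The O(d) check can be sketched as a loop over parent pointers. In this Python model the hierarchy is a dictionary; placing D and E under B is an assumption (the figure's edges did not survive extraction), chosen to agree with the keys shown on the later prime-number slide.

```python
# Parent pointer of each class; None marks the root (assumed hierarchy).
PARENT = {"A": None, "B": "A", "C": "A", "D": "B", "E": "B"}


def is_convertible(dynamic_type, target):
    """Follow parent pointers upward until we find target or run out of parents."""
    current = dynamic_type
    while current is not None:
        if current == target:
            return True
        current = PARENT[current]
    return False


print(is_convertible("E", "A"))
print(is_convertible("E", "C"))
```

The loop runs at most once per level of the hierarchy, which is where the O(d) bound comes from.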
A Marvelous Idea

● There is a fantastically clever way of checking convertibility at runtime in O(1), assuming there are O(1) classes in a hierarchy.
● Assume:
● There aren't "too many" classes derived from any one class (say, 10).
● A runtime check of whether an object that is statically of type E is dynamically of type C (or A) is only possible if E ≤ C.
● All types are known statically.


A Marvelous Idea

Class hierarchy: A is the root; B and C derive from A; D and E derive from B; F and G derive from C.

Key = own prime * parents' primes

A: prime = 1, key = 1
B: prime = 2, key = 2 x 1 = 2
C: prime = 3, key = 3 x 1 = 3
D: prime = 5, key = 5 x 2 = 10
E: prime = 7, key = 7 x 2 = 14
F: prime = 11, key = 11 x 3 = 33
G: prime = 13, key = 13 x 3 = 39


A Marvelous Idea

C myObject = /* … */
if (myObject instanceof A) {
    /* … */
}
A Marvelous Idea

Object F is an instance of C
Object G is an instance of C

F myObject = new F; /* Check if myObject is of type C */
if (myObject->vtable.key % 3 == 0) {
    /* … */
}
Dynamic Typing through Primes

● Assign each class a unique prime number (increasing as you go down the hierarchy and from left to right):
● (Can reuse primes across unrelated type hierarchies.)
● Set the key for that class to be the product of its prime and all the primes of its superclasses.
● To check at runtime if an object is convertible to type T:
● Look up the object's key.
● If T's key divides the object's key, the object is convertible to T. Otherwise, it is not.
● Assuming the product of primes fits into an integer, this check can be done in O(1).
● Also works with multiple inheritance; prototype C++ implementations using this technique exist.
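The prime-key scheme can be tried out on the seven-class hierarchy from the earlier slide. The Python sketch below uses the slide's prime assignments (with 1 for the root A, as in the slides); the dictionary representation and the helper names `key` and `is_convertible` are assumptions made for the example.

```python
# Primes and parents as shown on the slides (A is the root and uses 1).
PRIME = {"A": 1, "B": 2, "C": 3, "D": 5, "E": 7, "F": 11, "G": 13}
PARENT = {"A": None, "B": "A", "C": "A", "D": "B", "E": "B", "F": "C", "G": "C"}


def key(cls):
    """Product of the class's own prime and the primes of all its superclasses."""
    product = 1
    while cls is not None:
        product *= PRIME[cls]
        cls = PARENT[cls]
    return product


# Keys are computed once, ahead of time, and stored with each class.
KEY = {cls: key(cls) for cls in PRIME}


def is_convertible(dynamic_type, target):
    """A single divisibility test replaces the walk up the hierarchy."""
    return KEY[dynamic_type] % KEY[target] == 0


print(KEY)
print(is_convertible("F", "C"))
print(is_convertible("F", "B"))
```

At runtime only the modulo operation remains, which is the constant-time check the slide describes, provided the keys fit in a machine integer.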
Summary

⚫ We covered the main tasks in the Runtime Environment
⚫ We covered Implementing Objects, including encoding structs, accessing fields, member functions, and virtual function tables
⚫ Finally, we covered Implementing Dynamic Type Checking


END
