MIT6 035S10 Lec05

The document describes how intermediate representations (IRs) are used in program analysis and compilation. It discusses how high-level and low-level IRs preserve different aspects of a program's structure and semantics to enable various analysis and optimization tasks. It then provides an example of how an object-oriented vector class and its methods would be represented and executed at a low-level IR.

MIT 6.035
Intermediate Formats

Martin Rinard
Laboratory for Computer Science
Massachusetts Institute of Technology
Program Representation Goals
• Enable Program Analysis and Transformation
– Semantic Checks, Correctness Checks, Optimizations
• Structure Translation to Machine Code
– Sequence of Steps

[Figure: Parse Tree → (Semantic Analysis) → High Level Intermediate Representation → Low Level Intermediate Representation → Machine Code]
High Level IR
• Preserves Object Structure
• Preserves Structured Flow of Control
• Primary Goal: Analyze Program
Low Level IR
• Moves Data Model to Flat Address Space
• Eliminates Structured Control Flow
• Suitable for Low Level Compilation Tasks
– Register Allocation
– Instruction Selection
Examples of Object Representation
and Program Execution
(This happens when program runs)
Example Vector Class
class vector {
int v[];
void add(int x) {
int i;
i = 0;
while (i < v.length) { v[i] = v[i]+x; i = i+1; }
}
}
Representing Arrays
• Items Stored Contiguously In Memory
• Length Stored In First Word

3 7 4 8

• Color Code
– Red - generated by compiler automatically
– Blue, Yellow, Lavender - program data or code
– Magenta - executing code or data
Representing Vector Objects
• First Word Points to Class Information
– Method Table, Garbage Collector Data
• Next Words Have Object Fields
– For vectors, Next Word is Reference to Array

Class Info

3 7 4 8
Invoking Vector Add Method
vect.add(1);
• Create Activation Record
– this onto stack
– parameters onto stack
– space for locals on stack

Stack: this | x = 1 | i
Heap: vector object (Class Info, reference to array), array 3 7 4 8
Executing Vector Add Method
void add(int x) {
int i;
i = 0;
while (i < v.length) {
v[i] = v[i]+x;
i = i+1;
}
}

State during vect.add(1) (activation record: this, x = 1, i; the first word of the array is its length):
• after i = 0: i is 0, array is 3 7 4 8
• after v[0] = v[0]+x: array is 3 8 4 8
• after i = i+1: i is 1
• at loop exit: i is 3, array is 3 8 5 9
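
The trace above can be replayed with a small simulation. This is only an illustrative sketch, not the course's implementation: the class name FlatMemoryDemo, the addresses, the frame layout (this, x, i in consecutive words) and the CLASS_INFO placeholder are all assumptions made for the example. It uses one int array as a flat memory holding the vector object, the array with its length in the first word, and the activation record.

```java
import java.util.Arrays;

// Hypothetical sketch: a single int[] stands in for flat, word-addressed memory.
final class FlatMemoryDemo {
    static final int CLASS_INFO = -1; // placeholder for the class-information pointer

    // Executes vector.add using only loads and stores on 'mem'.
    // Assumed frame layout: fp+0 = this, fp+1 = parameter x, fp+2 = local i.
    static void add(int[] mem, int fp) {
        mem[fp + 2] = 0;                          // i = 0
        int arr = mem[mem[fp] + 1];               // this.v is the word after class info
        while (mem[fp + 2] < mem[arr]) {          // i < v.length (length is word 0 of the array)
            int elem = arr + 1 + mem[fp + 2];     // address of v[i]
            mem[elem] = mem[elem] + mem[fp + 1];  // v[i] = v[i] + x
            mem[fp + 2] = mem[fp + 2] + 1;        // i = i + 1
        }
    }

    static int[] run() {
        int[] mem = new int[12];
        mem[0] = CLASS_INFO; mem[1] = 2;                 // vector object at address 0; v refers to address 2
        mem[2] = 3; mem[3] = 7; mem[4] = 4; mem[5] = 8;  // array: length 3, then 7 4 8
        int fp = 8;                                      // activation record for vect.add(1)
        mem[fp] = 0;                                     // this
        mem[fp + 1] = 1;                                 // x
        add(mem, fp);
        return mem;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(Arrays.copyOfRange(run(), 2, 6))); // [3, 8, 5, 9]
    }
}
```

Every access in add is an offset from either the frame pointer or an object address, which is the kind of code the compiler has to generate for this, parameters, locals, fields and array elements.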
What does the compiler have to do
to make all of this work?
Compilation Tasks
• Determine Format of Objects and Arrays
• Determine Format of Call Stack
• Generate Code to Read Values
– this, parameters, locals, array elements, object fields
• Generate Code to Evaluate Expressions
• Generate Code to Write Values
• Generate Code for Control Constructs
Further Complication - Inheritance

Object Extension
Inheritance Example - Point Class
class point {
int c;
int getColor() { return(c); }
int distance() { return(0); }
}
Point Subclasses
class cartesianPoint extends point {
int x, y;
int distance() { return(x*x + y*y); }
}
class polarPoint extends point {
int r, t;
int distance() { return(r*r); }
int angle() { return(t); }
}
Implementing Object Fields
• Each object is a contiguous piece of memory
• Fields from inheritance hierarchy allocated sequentially in piece of memory
• Example: polarPoint object

polarPoint object:
Class Info
c 2
r 1
t 2
Point Objects
point object:
Class Info
c 2

cartesianPoint object:
Class Info
c 1
x 4
y 6

polarPoint object:
Class Info
c 4
r 1
t 3
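
One way to compute such layouts is sketched below. It is a minimal illustration under stated assumptions: the class ObjectLayout and its methods are hypothetical names, word 0 is reserved for the class-info pointer, each field takes one word, and a subclass never redeclares a parent's field name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: assigns word offsets to fields, parent fields first.
final class ObjectLayout {
    final Map<String, Integer> offsets = new LinkedHashMap<>();

    ObjectLayout(ObjectLayout parent, String... fields) {
        // Inherited fields keep the offsets they have in the parent class.
        if (parent != null) offsets.putAll(parent.offsets);
        int next = 1 + offsets.size();            // word 0 holds the class-info pointer
        for (String f : fields) offsets.put(f, next++);
    }

    int offsetOf(String field) { return offsets.get(field); }

    int sizeInWords() { return 1 + offsets.size(); }

    public static void main(String[] args) {
        ObjectLayout point = new ObjectLayout(null, "c");
        ObjectLayout polar = new ObjectLayout(point, "r", "t");
        System.out.println(polar.offsets); // {c=1, r=2, t=3}
    }
}
```

Because c sits at the same offset in point, cartesianPoint and polarPoint objects, code compiled for getColor in point works unchanged on any subclass instance.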
Compilation Tasks
• Determine Object Format in Memory
– Fields from Parent Classes
– Fields from Current Class
• Generate Code for Methods
– Field, Local Variable and Parameter Accesses
– Method Invocations
Symbol Tables - Key Concept in Compilation
• Compiler Uses Symbol Tables to Produce
– Object Layout in Memory
– Code to
• Access Object Fields
• Access Local Variables
• Access Parameters
• Invoke Methods
Symbol Tables During Translation
From Parse Tree to IR
• Symbol Tables Map Identifiers (strings) to Descriptors (information about identifiers)
• Basic Operation: Lookup
– Given A String, find Descriptor
– Typical Implementation: Hash Table
• Examples
– Given a class name, find class descriptor
– Given variable name, find descriptor
• local descriptor, parameter descriptor, field descriptor
Hierarchy In Symbol Tables
• Hierarchy Comes From
– Nested Scopes - Local Scope Inside Field Scope
– Inheritance - Child Class Inside Parent Class
• Symbol Table Hierarchy Reflects These Hierarchies
• Lookup Proceeds Up Hierarchy Until Descriptor is Found
Hierarchy in vector add Method
Symbol Table for Fields of vector Class
v   descriptor for field v
Symbol Table for Parameters of add (parent: field symbol table)
x   descriptor for parameter x
this   descriptor for this
Symbol Table for Locals of add (parent: parameter symbol table)
i   descriptor for local i


Lookup In vector Example
• v[i] = v[i]+x;

Field symbol table:
v   descriptor for field v
Parameter symbol table:
x   descriptor for parameter x
this   descriptor for this
Local symbol table:
i   descriptor for local i

• Lookup i: search starts in the local symbol table and finds the descriptor for local i
• Lookup x: not in the local symbol table, so search moves up to the parameter symbol table and finds the descriptor for parameter x
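
The lookup rule can be sketched as a chain of hash tables. This is an illustrative sketch only: the SymbolTable class, its method names and the use of plain strings in place of descriptor objects are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: each table holds its own entries plus a link to its parent table.
final class SymbolTable {
    private final Map<String, String> entries = new HashMap<>();
    private final SymbolTable parent;

    SymbolTable(SymbolTable parent) { this.parent = parent; }

    void put(String name, String descriptor) { entries.put(name, descriptor); }

    // Search this table first, then each enclosing table; null if the name is undeclared.
    String lookup(String name) {
        for (SymbolTable t = this; t != null; t = t.parent) {
            String d = t.entries.get(name);
            if (d != null) return d;
        }
        return null;
    }

    // Builds the three tables for vector.add and returns the innermost (local) one.
    static SymbolTable vectorAddLocals() {
        SymbolTable fields = new SymbolTable(null);
        fields.put("v", "field descriptor for v");
        SymbolTable params = new SymbolTable(fields);
        params.put("x", "parameter descriptor for x");
        params.put("this", "this descriptor");
        SymbolTable locals = new SymbolTable(params);
        locals.put("i", "local descriptor for i");
        return locals;
    }

    public static void main(String[] args) {
        SymbolTable locals = vectorAddLocals();
        System.out.println(locals.lookup("i")); // local descriptor for i
        System.out.println(locals.lookup("x")); // parameter descriptor for x
        System.out.println(locals.lookup("v")); // field descriptor for v
    }
}
```

Starting every lookup at the innermost table is also what makes a local variable hide a field of the same name.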


Descriptors
• What do descriptors contain?
• Information used for code generation and semantic analysis
– local descriptors - name, type, stack offset
– field descriptors - name, type, object offset
– method descriptors
• signature (type of return value, receiver, and parameters)
• reference to local symbol table
• reference to code for method
Program Symbol Table
• Maps class names to class descriptors
• Typical Implementation: Hash Table

vector class descriptor for vector
point class descriptor for point
cartesianPoint class descriptor for cartesianPoint
polarPoint class descriptor for polarPoint
Class Descriptor
• Has Two Symbol Tables
– Symbol Table for Methods
• Parent Symbol Table is Symbol Table for Methods of Parent Class
– Symbol Table for Fields
• Parent Symbol Table is Symbol Table for Fields of Parent Class
• Reference to Descriptor of Parent Class
Class Descriptors for point and cartesianPoint
class descriptor for point
– field symbol table
• c   field descriptor for c
– method symbol table
• getColor   method descriptor for getColor
• distance   method descriptor for distance

class descriptor for cartesianPoint
– field symbol table
• x   field descriptor for x
• y   field descriptor for y
– method symbol table
• distance   method descriptor for distance
Field, Parameter and Local and Type Descriptors
• Field, Parameter and Local Descriptors Refer to Type Descriptors
– Base type descriptor: int, boolean
– Array type descriptor, which contains reference to type descriptor for array elements
– Class descriptor
• Relatively Simple Type Descriptors
• Base Type Descriptors and Array Descriptors Stored in Type Symbol Table
Example Type Symbol Table

int   int descriptor
int []   array descriptor
boolean   boolean descriptor
boolean []   array descriptor
vector []   array descriptor (refers to class descriptor for vector)
Method Descriptors
• Contain Reference to Code for Method
• Contain Reference to Local Symbol Table for
Local Variables of Method
• Parent Symbol Table of Local Symbol Table is
Parameter Symbol Table for Parameters of
Method
Method Descriptor for add Method
Method descriptor for add
– parameter symbol table (parent: field symbol table for vector class)
• x   parameter descriptor
• this   this descriptor
– local variable symbol table
• i   local descriptor
– code for add method


Symbol Table Summary
• Program Symbol Table (Class Descriptors)
• Class Descriptors
– Field Symbol Table (Field Descriptors)
• Field Symbol Table for SuperClass
– Method Symbol Table (Method Descriptors)
• Method Symbol Table for Superclass
• Method Descriptors
– Local Variable Symbol Table (Local Variable Descriptors)
• Parameter Symbol Table (Parameter Descriptors)
– Field Symbol Table of Receiver Class
• Local, Parameter and Field Descriptors
– Type Descriptors in Type Symbol Table or Class Descriptors
[Figure: complete symbol table structure for the vector class. The class descriptor for vector has a field symbol table (v: field descriptor) and a method symbol table (add: method descriptor for add). The method descriptor for add has a parameter symbol table (x: parameter descriptor, this: this descriptor), a local symbol table (i: local descriptor) and the code for the add method. The type symbol table holds int (int descriptor), int [] (array descriptor), boolean (boolean descriptor), boolean [] (array descriptor) and vector [] (array descriptor). The parse tree fragment class_decl, vector, field_decl, int v [] is shown alongside.]
Translating from Abstract Syntax
Trees to Symbol Tables
Example Abstract Syntax Tree
class vector {
int v[];
void add(int x) {
int i; i = 0;
while (i < v.length) { v[i] = v[i]+x; i = i+1; }
}
}

class_decl
  vector
  field_decl
    int v
  method_decl
    add
    param_decl
      int x
    var_decl
      int i
    statements
The traversal of this tree builds the symbol tables step by step:
• Start with an empty class symbol table
• Enter vector with a class descriptor for vector
• Enter v with a field descriptor
• Enter add with a method descriptor for add, and this with a this descriptor
• Enter x with a parameter descriptor
• Enter i with a local descriptor
Representing Code in High-Level
Intermediate Representation
Basic Idea
• Move towards assembly language
• Preserve high-level structure
– object format
– structured control flow
– distinction between parameters, locals and fields
• High-level abstractions of assembly language
– load and store nodes
– access abstract locals, parameters and fields, not memory locations directly
Representing Expressions
• Expression Trees Represent Expressions
– Internal Nodes - Operations like +, -, etc.
– Leaves - Load Nodes Represent Variable Accesses
• Load Nodes
– ldf node for field accesses - field descriptor
• (implicitly accesses this - could add a reference to accessed object)
– ldl node for local variable accesses - local descriptor
– ldp node for parameter accesses - parameter descriptor
– lda node for array accesses
• expression tree for array
• expression tree for index
Example
x*x + y*y

+
  *
    ldf   field descriptor for x in field symbol table for cartesianPoint class
    ldf   field descriptor for x in field symbol table for cartesianPoint class
  *
    ldf   field descriptor for y in field symbol table for cartesianPoint class
    ldf   field descriptor for y in field symbol table for cartesianPoint class
Example
v[i]+x

+
  lda
    ldf   field descriptor for v in field symbol table for vector class
    ldl   local descriptor for i in local symbol table of vector add
  ldp   parameter descriptor for x in parameter symbol table of vector add
Special Case: Array Length Operator
• len node represents length of array
– expression tree for array
• Example: v.length

len
  ldf   field descriptor for v in field symbol table for vector class
Representing Assignment Statements
• Store Nodes
– stf for stores to fields
• field descriptor
• expression tree for stored value
– stl for stores to local variables
• local descriptor
• expression tree for stored value
– sta for stores to array elements
• expression tree for array
• expression tree for index
• expression tree for stored value
Representing Procedure Calls
• Call statement
• Refers to method descriptor for invoked method
• Has list of parameters (this is first parameter)

vect.add(1)

call   method descriptor for add in method symbol table for vector class
  ldl   local descriptor for vect in local symbol table of method containing the call statement vect.add(1)
  constant 1
Example
v[i]=v[i]+x

sta
  ldf   field descriptor for v in field symbol table for vector class
  ldl   local descriptor for i in local symbol table of vector add
  +
    lda
      ldf   field descriptor for v
      ldl   local descriptor for i
    ldp   parameter descriptor for x in parameter symbol table of vector add
Representing Flow of Control
• Statement Nodes
– sequence node - first statement, next statement
– if node
• expression tree for condition
• then statement node and else statement node
– while node
• expression tree for condition
• statement node for loop body
– return node
• expression tree for return value
Example
while (i < v.length)
v[i] = v[i]+x;

while
  <
    ldl   local descriptor for i
    len
      ldf   field descriptor for v
  sta
    ldf   field descriptor for v
    ldl   local descriptor for i
    +
      lda
        ldf   field descriptor for v
        ldl   local descriptor for i
      ldp   parameter descriptor for x
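
To make the node kinds concrete, here is a small tree-walking sketch. It is not the course's IR implementation. The class HighLevelIr, the Env class, the helper names and the con constant node are hypothetical, and variable names stand in for descriptors. It also adds an stl node for i = i+1, taken from the full add method, so that the loop terminates.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: IR nodes are closures, and names stand in for descriptors.
final class HighLevelIr {
    // Fields, parameters and locals stay separate, as in the high-level IR.
    static final class Env {
        final Map<String, Object> fields = new HashMap<>();
        final Map<String, Object> params = new HashMap<>();
        final Map<String, Object> locals = new HashMap<>();
    }

    interface Expr { Object eval(Env e); }
    interface Stmt { void exec(Env e); }

    static int intOf(Expr x, Env e) { return (Integer) x.eval(e); }
    static int[] arrOf(Expr x, Env e) { return (int[]) x.eval(e); }

    // Load nodes and operators.
    static Expr ldf(String n) { return e -> e.fields.get(n); }
    static Expr ldl(String n) { return e -> e.locals.get(n); }
    static Expr ldp(String n) { return e -> e.params.get(n); }
    static Expr lda(Expr a, Expr i) { return e -> arrOf(a, e)[intOf(i, e)]; }
    static Expr len(Expr a) { return e -> arrOf(a, e).length; }
    static Expr add(Expr l, Expr r) { return e -> intOf(l, e) + intOf(r, e); }
    static Expr lt(Expr l, Expr r) { return e -> intOf(l, e) < intOf(r, e); }
    static Expr con(int c) { return e -> c; }

    // Store nodes and control-flow nodes.
    static Stmt stl(String n, Expr v) { return e -> { e.locals.put(n, v.eval(e)); }; }
    static Stmt sta(Expr a, Expr i, Expr v) { return e -> { arrOf(a, e)[intOf(i, e)] = intOf(v, e); }; }
    static Stmt seq(Stmt first, Stmt next) { return e -> { first.exec(e); next.exec(e); }; }
    static Stmt whileNode(Expr c, Stmt body) { return e -> { while ((Boolean) c.eval(e)) body.exec(e); }; }

    // Builds and runs the IR for the body of vector.add on the given array.
    static int[] runAdd(int[] v, int x) {
        Env env = new Env();
        env.fields.put("v", v);
        env.params.put("x", x);
        Stmt body = seq(
            sta(ldf("v"), ldl("i"), add(lda(ldf("v"), ldl("i")), ldp("x"))),
            stl("i", add(ldl("i"), con(1))));
        Stmt program = seq(stl("i", con(0)),
                           whileNode(lt(ldl("i"), len(ldf("v"))), body));
        program.exec(env);
        return v;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(runAdd(new int[] {7, 4, 8}, 1))); // [8, 5, 9]
    }
}
```

The factory calls in runAdd mirror the shape of the tree above: a while node over a < expression and a statement whose store target and stored value are themselves expression trees.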
From Abstract Syntax Trees to
Intermediate Representation
while (i < v.length)
v[i] = v[i]+x;

The tree is built bottom-up, one node at a time, with each load node referring to the field descriptor for v, the local descriptor for i or the parameter descriptor for x:
• Condition: ldl for i, ldf for v, len over that ldf, then < over the ldl and the len
• Stored value: ldf for v, ldl for i, lda over them, ldp for x, then + over the lda and the ldp
• Store: ldf for v and ldl for i as array and index, then sta over them and the +
• Loop: while over the < and the sta
Abbreviated Notation
while (i < v.length)
v[i] = v[i]+x;

while
  <
    ldl i
    len
      ldf v
  sta
    ldf v
    ldl i
    +
      lda
        ldf v
        ldl i
      ldp x
From Abstract Syntax Trees to IR
• Recursively Traverse Abstract Syntax Tree
• Build Up Representation in Bottom-Up Manner
– Look Up Variable Identifiers in Symbol Tables
– Build Load Nodes to Access Variables
– Build Expressions Out of Load Nodes and Operator Nodes
– Build Store Nodes for Assignment Statements
– Combine Store Nodes with Flow of Control Nodes
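
A minimal sketch of the expression part of this traversal follows. The names AstToIr, Kind, Id, Index and Bin are hypothetical, a single flat map stands in for the symbol table hierarchy, and the IR is emitted as text in the abbreviated notation rather than as node objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: translates a tiny expression AST into abbreviated IR text.
final class AstToIr {
    enum Kind { FIELD, PARAM, LOCAL }

    static abstract class Ast {}
    static final class Id extends Ast {
        final String name;
        Id(String name) { this.name = name; }
    }
    static final class Index extends Ast {
        final Ast array, index;
        Index(Ast array, Ast index) { this.array = array; this.index = index; }
    }
    static final class Bin extends Ast {
        final String op;
        final Ast left, right;
        Bin(String op, Ast left, Ast right) { this.op = op; this.left = left; this.right = right; }
    }

    // Children are translated first, so the IR is assembled bottom-up.
    static String translate(Ast node, Map<String, Kind> symbols) {
        if (node instanceof Id) {
            String name = ((Id) node).name;
            Kind kind = symbols.get(name);              // symbol table lookup
            if (kind == null) throw new IllegalArgumentException("undeclared: " + name);
            switch (kind) {                             // the descriptor kind picks the load node
                case FIELD: return "ldf " + name;
                case PARAM: return "ldp " + name;
                default:    return "ldl " + name;
            }
        }
        if (node instanceof Index) {
            Index ix = (Index) node;
            return "lda(" + translate(ix.array, symbols) + ", " + translate(ix.index, symbols) + ")";
        }
        Bin b = (Bin) node;
        return b.op + "(" + translate(b.left, symbols) + ", " + translate(b.right, symbols) + ")";
    }

    // Translates v[i]+x with the symbols of vector.add.
    static String example() {
        Map<String, Kind> symbols = new HashMap<>();
        symbols.put("v", Kind.FIELD);
        symbols.put("x", Kind.PARAM);
        symbols.put("i", Kind.LOCAL);
        Ast expr = new Bin("+", new Index(new Id("v"), new Id("i")), new Id("x"));
        return translate(expr, symbols);
    }

    public static void main(String[] args) {
        System.out.println(example()); // +(lda(ldf v, ldl i), ldp x)
    }
}
```

Statements would follow the same pattern: translate the sub-expressions, then wrap them in a store node or a flow of control node.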
Summary
High-Level Intermediate Representation
• Goal: represent program in an intuitive way that
supports future compilation tasks
• Representing program data
– Symbol tables
– Hierarchical organization
• Representing computation
– Expression trees
– Various types of load and store nodes
– Structured flow of control
• Traverse abstract syntax tree to build IR
Dynamic Dispatch
if (x == 0) {
p = new point();
} else if (x < 0) {
p = new cartesianPoint();
} else if (x > 0) {
p = new polarPoint();
}
y = p.distance();

Which distance method is invoked?
• if p is a point: return(0)
• if p is a cartesianPoint: return(x*x + y*y)
• if p is a polarPoint: return(r*r)
• Invoked Method Depends on Type of Receiver!
Implementing Dynamic Dispatch
• Basic Mechanism: Method Table

method table for point objects
– getColor method for point
– distance method for point

method table for cartesianPoint objects
– getColor method for point
– distance method for cartesianPoint

method table for polarPoint objects
– getColor method for point
– distance method for polarPoint
– angle method for polarPoint
Invoking Methods
• Compiler Numbers Methods In Each Inheritance Hierarchy
– getColor is Method 0, distance is Method 1, angle is Method 2
• Method Invocation Sites Access Corresponding Entry in Method Table
• Works For Single Inheritance Only
– not for multiple inheritance, multiple dispatch, or
interfaces
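
The method-table mechanism can be sketched as follows. This is an illustration under assumptions, not the generated code of any real compiler: MethodTables, Method, Obj and invoke are hypothetical names, and an object is modeled as a method table reference plus an int array of fields with c first.

```java
// Hypothetical sketch: one method table per class, indexed by method number.
final class MethodTables {
    static final int GET_COLOR = 0, DISTANCE = 1, ANGLE = 2;

    interface Method { int invoke(Obj self); }

    static final class Obj {
        final Method[] methodTable;   // stands in for the class-info word
        final int[] fields;           // c first, then the subclass fields
        Obj(Method[] methodTable, int... fields) {
            this.methodTable = methodTable;
            this.fields = fields;
        }
    }

    static final Method GET_COLOR_POINT = o -> o.fields[0];

    // Inherited entries are shared; overridden entries replace the parent's code.
    static final Method[] POINT = { GET_COLOR_POINT, o -> 0 };
    static final Method[] CARTESIAN = {
        GET_COLOR_POINT,
        o -> o.fields[1] * o.fields[1] + o.fields[2] * o.fields[2]
    };
    static final Method[] POLAR = {
        GET_COLOR_POINT,
        o -> o.fields[1] * o.fields[1],
        o -> o.fields[2]
    };

    // A call site knows only the method number; the receiver's table selects the code.
    static int invoke(Obj receiver, int methodNumber) {
        return receiver.methodTable[methodNumber].invoke(receiver);
    }

    public static void main(String[] args) {
        Obj p = new Obj(CARTESIAN, 1, 4, 6);        // c = 1, x = 4, y = 6
        System.out.println(invoke(p, DISTANCE));    // 52
    }
}
```

Because distance has the same number in every table, the call site compiles to the same indexed load and call no matter which class the receiver turns out to have.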
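Hierarchy in Method Symbol Tables for Points
method symbol table for point
getColor   method descriptor for getColor
distance   method descriptor for distance

method symbol table for cartesianPoint (parent: method symbol table for point)
distance   method descriptor for distance

method symbol table for polarPoint (parent: method symbol table for point)
distance   method descriptor for distance
angle   method descriptor for angle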
Lookup In Method Symbol Tables
• Starts with method table of declared class of receiver object
• Goes up class hierarchy until method found
– point p; p = new point(); p.distance();
• finds distance in point method symbol table
– point p; p = new cartesianPoint(); p.distance();
• finds distance in point method symbol table
– cartesianPoint p; p = new cartesianPoint(); p.getColor();
• finds getColor in point method symbol table
Static Versus Dynamic Lookup
• Static lookup done at compile time for type checking and code generation
• Dynamic lookup done when program runs to dispatch method call
• Static and dynamic lookup results may differ!
– point p; p = new cartesianPoint(); p.distance();
• Static lookup finds distance in point method table
• Dynamic lookup invokes distance in cartesianPoint class
• Dynamic dispatch mechanism used to make this happen
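
Java itself behaves this way, which gives a quick way to observe the difference. The sketch below restates the point classes with Java naming conventions (Point, CartesianPoint). The StaticVsDynamic wrapper and the field values are assumptions chosen for the example.

```java
class Point {
    int c;
    int getColor() { return c; }
    int distance() { return 0; }
}

class CartesianPoint extends Point {
    int x, y;
    @Override
    int distance() { return x * x + y * y; }
}

final class StaticVsDynamic {
    static int demo() {
        CartesianPoint cp = new CartesianPoint();
        cp.x = 4;
        cp.y = 6;
        Point p = cp;          // declared type Point, run-time class CartesianPoint
        // The compiler type-checks this call against distance in Point,
        // but the method that runs is the one in CartesianPoint.
        return p.distance();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 52
    }
}
```

The declared type of p only determines which calls are legal, so a call such as p.angle() would be rejected at compile time even if p happened to refer to a polar point at run time.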
Static and Dynamic Tables
• Static Method Symbol Table
– Used to look up method definitions at compile time
– Index is method name
– Lookup starts at method symbol table determined by declared type of receiver object
– Lookup may traverse multiple symbol tables
• Dynamic Method Table
– Used to look up method to invoke at run time
– Index is method number
– Lookup simply accesses a single table element
[Figure: dynamic method tables beside static method symbol tables. Method tables: point has getColor method for point and distance method for point; cartesianPoint has getColor method for point and distance method for cartesianPoint; polarPoint has getColor method for point, distance method for polarPoint and angle method for polarPoint. Method symbol tables: point has getColor and distance; cartesianPoint has distance; polarPoint has distance and angle, each entry referring to its method descriptor.]

[Figure: abstract syntax tree for the vector class alongside its completed symbol tables: class symbol table, class descriptor for vector, v field descriptor, method descriptor for add with x parameter descriptor, this descriptor, i local descriptor and code for add method.]
Eliminating Parse Tree Construction
• Parser actions build symbol tables
– Reduce actions build tables in bottom-up fashion
– Actions correspond to activities that take place in top-down fashion in parse tree traversal
• Eliminates intermediate construction of parse tree - improves performance
• Also less code to write (but code may be harder to write than if just traverse parse tree)
class vector { int v[]; void add(int x) { int i; ... }}

As the parser reduces each construct, its action builds the matching piece, starting from an empty class symbol table:
• field_decl (int v): field descriptor
• param_decl (int x): parameter descriptor
• var_decl (int i): local descriptor
• statements: code for add method
• method_decl (add): method descriptor for add, linking the x parameter descriptor, the this descriptor, the i local descriptor and the code for add method
• class declaration: class descriptor for vector, entered in the class symbol table, with v mapped to its field descriptor and add mapped to its method descriptor
Nested Scopes
• So far, have seen several kinds of nesting
– Method symbol tables nested inside class symbol tables
– Local symbol tables nesting inside method symbol tables
• Nesting disambiguates potential name clashes
– Same name used for class field and local variable
– Name refers to local variable inside method
Nested Code Scopes
• Symbol tables can be nested arbitrarily deeply with code nesting:
class bar {
baz x;
int foo(int x) {
double x = 5.0;
{ float x = 10.0;
{ int x = 1; ... x ...}
... x ...
}
... x ...
}
}

Note: Name clashes with nesting can reflect programming error. Compilers often generate warning messages if it occurs.
What is a Parse Tree?
• Parse Tree Records Results of Parse
• External nodes are terminals/tokens
• Internal nodes are non-terminals

class_decl ::= ‘class’ name ‘{’ field_decl method_decl ‘}’
field_decl ::= ‘int’ name ‘[];’
method_decl ::= ‘void’ name ‘(’ param_decl ‘)’ ‘{’ var_decl stats ‘}’
Abstract Versus Concrete Trees
• Remember grammar hacks
– left factoring, ambiguity elimination, precedence of binary operators
• Hacks lead to a tree that may not reflect cleanest interpretation of program
• May be more convenient to work with abstract syntax tree (roughly, parse tree from grammar before hacks)
Building IR Alternatives
• Build concrete parse tree in parser, translate to abstract syntax tree, translate to IR
• Build abstract syntax tree in parser, translate to IR
• Roll IR construction into parsing
From Abstract Syntax Trees to
Symbol Tables
• Recursively Traverse Tree
• Build Up Symbol Tables As Traversal Visits Nodes
Traversing Class Declarations
• Extract Class Name and Superclass Name
• Create Class Descriptor (field and method symbol tables), Put Descriptor Into Class Symbol Table
• Put Array Descriptor Into Type Symbol Table
• Lookup Superclass Name in Class Symbol Table, Make Superclass Link in Class Descriptor Point to Retrieved Class Descriptor
• Traverse Field Declarations to Fill Up Field Symbol Table
• Traverse Method Declarations to Fill Up Method Symbol Table
MIT OpenCourseWare
http://ocw.mit.edu

6.035 Computer Language Engineering
Spring 2010

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
