Cs 3007 Inter Code Gen
Assistant Professor1,2
Computer Science and Engineering Department
National Institute of Technology Rourkela
{pynes,mishrat}@nitrkl.ac.in
S. Pyne1 & T. K. Mishra2 (NITRKL) Aut’20, Compiler Design (CS 3007) September 22, 2020 1 / 42
Front and back ends
Intermediate code
Machine independent representation of source code
Simplifies conversion from source code to target code
Analysis-synthesis model - the front end analyzes the source code (syntax and semantic analysis); the back end synthesizes the target code
Front end - intermediate code from source code
Back end - target code from intermediate code
Intermediate representation
Combines the front end for language i with the back end for machine j
m × n compilers can be built by writing just m front ends and n back ends
Implementation
Syntax trees
DAG construction
DAG construction - Value-Number Method
INPUT: Label op, node l, and node r.
OUTPUT: The value number of a node in the array with signature
(op, l, r ).
METHOD: Search the array for a node M with label op, left child l, and
right child r.
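1 If there is such a node, return the value number of M.
2 If not, create in the array a new node N with label op, left child l, and
right child r, and return its value number.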
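The search-or-create method can be sketched in Python as follows. This is a minimal sketch under assumptions, not the lecture's implementation: the class name `ValueNumberTable`, the dictionary that stands in for the search of the array, 0-based value numbers, and the convention that leaves have `None` children are all chosen for illustration.

```python
class ValueNumberTable:
    """Array of DAG nodes; a node's value number is its index in the array."""

    def __init__(self):
        self.nodes = []   # nodes[i] holds the signature (op, l, r) of node i
        self.index = {}   # signature -> value number (stands in for the search)

    def node(self, op, l=None, r=None):
        """Return the value number of the node with signature (op, l, r)."""
        sig = (op, l, r)
        if sig in self.index:            # such a node exists: reuse it
            return self.index[sig]
        self.nodes.append(sig)           # otherwise create a new node
        self.index[sig] = len(self.nodes) - 1
        return self.index[sig]


# DAG for i = i + 10: both uses of i map to the same node.
table = ValueNumberTable()
i_node = table.node("i")
ten = table.node("10")
plus = table.node("+", i_node, ten)
assign = table.node("=", table.node("i"), plus)
```

Here the second call `table.node("i")` returns the existing value number instead of creating a duplicate leaf, so the array ends up with four nodes.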
Data Structure - Value-Number Method
Three-address Code
A linearized representation of a syntax tree or a DAG in which explicit
names correspond to the interior nodes of the graph
There is at most one operator on the right side of an instruction; that
is, no built-up arithmetic expressions are permitted
Three-address code for the expression: a + a ∗ (b − c) + (b − c) ∗ d
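As an illustration, the following Python sketch linearizes that expression into three-address code, reusing one temporary for the common subexpression b − c, as a DAG would. The nested-tuple encoding of the expression and the names `gen_tac`, `walk`, and `t1`, `t2`, ... are assumptions made for this example.

```python
def gen_tac(expr):
    """Translate a nested (op, left, right) tuple into three-address code.

    Leaves are strings. Identical subexpressions share one temporary.
    Returns (list of instructions, name holding the value of expr).
    """
    code = []
    temps = {}   # subexpression -> temporary that already holds its value

    def walk(e):
        if isinstance(e, str):     # a name or constant is its own address
            return e
        if e in temps:             # common subexpression: reuse the temporary
            return temps[e]
        op, left, right = e
        l = walk(left)
        r = walk(right)
        t = f"t{len(temps) + 1}"
        temps[e] = t
        code.append(f"{t} = {l} {op} {r}")   # at most one operator per instruction
        return t

    return code, walk(expr)


# a + a * (b - c) + (b - c) * d, with + taken as left-associative
b_minus_c = ("-", "b", "c")
expr = ("+", ("+", "a", ("*", "a", b_minus_c)), ("*", b_minus_c, "d"))
code, result = gen_tac(expr)
print("\n".join(code))
```

The sketch prints five instructions: `t1 = b - c`, `t2 = a * t1`, `t3 = a + t2`, `t4 = t1 * d`, `t5 = t3 + t4`.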
Three-address Code - Instructions
8 Indexed copy instructions
x = y [i] sets x to value in location i memory units beyond location y
x[i] = y sets contents of location i units beyond x to value of y
9 Address and pointer assignments
x = &y sets the r-value of x to be the location (l-value) of y
x = ∗y sets the r-value of x to the contents of the location pointed to by y
∗x = y sets the r-value of the object pointed to by x to the r-value of y
l-value - a location, as required on the left side of an assignment
r-value - a content, as required on the right side of an assignment
Variable x: l-value is memory location &x, r-value is content at &x
Constant x: r-value is x, no l-value
Pointer x (declared as ∗x): l-value is &x, r-value is the location pointed to by x
x[i]: l-value is &x[i] or x + i, r-value is x[i] or i[x] or ∗(x + i)
Common “l-value required” errors: 5 = x and x + y = z
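The address and pointer assignments can be traced with a toy memory model. The Python sketch below is illustrative only: the dictionaries `memory` and `address`, the helper names, and the concrete addresses 100, 104, and 108 are all hypothetical.

```python
memory = {}                                  # location -> content (r-values)
address = {"x": 100, "y": 104, "p": 108}     # name -> l-value (made-up addresses)


def rvalue(name):
    """Content stored at the location of `name`."""
    return memory[address[name]]


def store(name, value):
    """Write `value` into the location of `name`."""
    memory[address[name]] = value


store("y", 7)                      # y = 7
store("p", address["y"])           # p = &y : r-value of p is the l-value of y
store("x", memory[rvalue("p")])    # x = *p : contents of the location p points to
memory[rvalue("p")] = 9            # *p = 9 : writes through p, so y changes
```

After these four statements x holds 7, y holds 9, and p still holds the location of y.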
An Example
Quadruples
An Example
Triples
Indirect Triples
In triples the result of an operation is referred to by its position
Moving an instruction may change all references to that result
Indirect triples solve this problem
Indirect triples consist of a listing of pointers to triples
Uses an array to list pointers to triples in the desired order
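The difference can be sketched in Python. The example statement a = b * -c + b * -c, the tuple layout `(op, arg1, arg2)`, and the names `triples`, `instruction`, and `listing` are assumptions chosen for illustration. An integer operand refers to the triple at that position.

```python
# Triples for a = b * -c + b * -c; an int operand names another triple.
triples = [
    ("minus", "c", None),   # (0)
    ("*", "b", 0),          # (1)
    ("minus", "c", None),   # (2)
    ("*", "b", 2),          # (3)
    ("+", 1, 3),            # (4)
    ("=", "a", 4),          # (5)
]

# Indirect triples: a separate array lists pointers to triples in execution order.
instruction = [0, 1, 2, 3, 4, 5]


def listing(order):
    """Triples in the order given by the pointer array."""
    return [triples[k] for k in order]


# Reordering touches only the pointer array; no triple is rewritten.
instruction[1], instruction[2] = instruction[2], instruction[1]
```

After the swap, triple (4) still reads `("+", 1, 3)`: its operands refer to triples by their fixed positions in `triples`, not by their place in the execution order.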
φ-function (static single-assignment form):
\[ \phi(x_1, x_2) = \begin{cases} x_1, & \text{if flag is true} \\ x_2, & \text{if flag is false} \end{cases} \]
Types and Declarations
Type checking
Uses rules to reason about the behavior of a program at run time
Ensures that types of operands match type expected by an operator
For example, the && operator in Java expects its two operands to be
booleans; the result is also of type boolean
Translation Applications
From type of a name, a compiler determines the storage needed by it
at run time
Type information is needed to calculate the address denoted by an
array reference
To insert explicit type conversions
To choose the right version of an arithmetic operator
Type Expressions
Definition of type expressions
A basic type is a type expression.
Basic types in a programming language - boolean, char, integer, float,
and void
A type name is a type expression.
Type expression can be formed by applying array type constructor to
a number and a type expression
A record is a data structure with named fields. A type expression can be
formed by applying the record type constructor to the field names
and their types.
A type expression can be formed using type constructor → for
function types. s → t denotes “function from type s to type t”
If s and t are type expressions, then their Cartesian product s × t is a
type expression. Used to represent a list or tuple of types (e.g. for
function parameters).
Type expressions may contain variables whose values are type
expressions.
Type Equivalence
Two types are structurally equivalent iff one of the following is true
They are the same basic type
They are formed by applying the same constructor to structurally
equivalent types
One is a type name that denotes the other
If type names are treated as standing for themselves, the
first two conditions in the above definition lead to name equivalence of
type expressions
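The three conditions for structural equivalence translate into a short recursive check. The Python sketch below is a simplified illustration under stated assumptions: basic types are strings, constructed types are tuples such as `("array", 2, "integer")` or `("->", s, t)`, declared type names live in a hypothetical `names` dictionary, and recursive (cyclic) type names are not handled.

```python
def resolve(t, names):
    """Replace a declared type name by the type expression it denotes."""
    while isinstance(t, str) and t in names:
        t = names[t]
    return t


def struct_equiv(s, t, names):
    """True if type expressions s and t are structurally equivalent."""
    s, t = resolve(s, names), resolve(t, names)    # a name denotes its type
    if isinstance(s, str) or isinstance(t, str):
        return s == t                              # same basic type
    if s[0] != t[0] or len(s) != len(t):           # same constructor, same arity
        return False
    for a, b in zip(s[1:], t[1:]):
        if isinstance(a, int) or isinstance(b, int):
            if a != b:                             # e.g. array sizes must agree
                return False
        elif not struct_equiv(a, b, names):        # components equivalent
            return False
    return True


names = {"row": ("array", 3, "integer")}           # hypothetical declaration
same = struct_equiv(("array", 2, "row"),
                    ("array", 2, ("array", 3, "integer")), names)
```

With the declaration shown, `same` is `True`, because `row` denotes `array(3, integer)`. Under name equivalence, where `row` stands only for itself, the two expressions would differ.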
Declarations
Storage Layout for Local Names
Type of a name determines amount of storage it needs at run time
At compile time, these amounts are used to assign a relative address to each
name. Type and relative address are saved in the symbol-table entry for
the name
Data of varying length, such as strings or dynamic arrays whose size
cannot be determined until run time, is handled by reserving a fixed
amount of storage for a pointer to the data
Run-time storage management deals with data of varying length
Consider storage in blocks of contiguous bytes where a byte is the
smallest unit of addressable memory
A byte is eight bits, and some number of bytes form a machine word
Multibyte objects are stored in consecutive bytes and given the
address of first byte
Width of a type is the number of storage units needed for objects of that
type. A basic type, such as char, int, or float, requires an integral
number of bytes. For easy access, storage for aggregates such as
arrays and classes is allocated in one contiguous block of bytes
Computing Storage for Local Names
SDT for computing type and width of basic and array types
SDT uses synthesized attributes type and width for each nonterminal
Variables t and w pass type and width information from a B node in
a parse tree to the node for C → ε
Action between B and C sets t to B.type and w to B.width
If B → int then B.type is set to integer and B.width is set to 4
If B → float then B.type is set to float and B.width is set to 8
If C → ε then C.type is set to t and C.width is set to w
Otherwise, C specifies an array component
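The attribute flow can be imitated with a loop. The Python sketch below is only an approximation of the scheme: it assumes the grammar T → B C with C → [num] C | ε, takes the dimensions as a ready-made list instead of parsing them, and uses the hypothetical names `BASIC` and `type_and_width`.

```python
BASIC = {"int": ("integer", 4), "float": ("float", 8)}   # B.type, B.width


def type_and_width(basic, dims):
    """Return (type, width) for a declaration such as int[2][3].

    t and w start as B.type and B.width. The empty C at the end of the
    chain yields (t, w); each [num] then wraps the type built for the
    dimensions to its right, so the rightmost dimension is applied first.
    """
    t, w = BASIC[basic]
    for num in reversed(dims):
        t = ("array", num, t)   # C.type  = array(num, C1.type)
        w = num * w             # C.width = num * C1.width
    return t, w


t, w = type_and_width("int", [2, 3])
```

For `int[2][3]` this gives the type `array(2, array(3, integer))` and a width of 24 bytes.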
Syntax-directed translation of array types
The parse tree for the type int[2][3] is shown by dotted lines
The solid lines show how the type and width are passed
The variables t and w are assigned the values of B.type and B.width
Then the subtree with the C nodes is examined
The values of t and w are used at the node for C → ε to start
evaluation of synthesized attributes up the chain of C nodes
Sequences of Declarations
In C and Java, all declarations in a single procedure can be processed as a group
Declarations may be distributed within a Java procedure, but they can still be processed when the procedure is analyzed
Use a variable offset to keep track of next relative address
Following SDT deals with a sequence of declarations of the form T id
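The role of offset can be sketched in Python. This is a minimal sketch under assumptions: the function name `layout`, the list-of-pairs input, and the width table are hypothetical, and alignment padding is ignored.

```python
def layout(declarations, widths):
    """Assign relative addresses to a sequence of declarations `T id`.

    declarations: list of (type_name, identifier) pairs in source order.
    Returns (symbol table, total storage used).
    """
    symtab = {}
    offset = 0                             # next available relative address
    for type_name, ident in declarations:
        symtab[ident] = (type_name, offset)
        offset += widths[type_name]        # advance past this name's storage
    return symtab, offset


symtab, total = layout([("int", "x"), ("float", "y"), ("int", "z")],
                       {"int": 4, "float": 8})
```

With widths of 4 and 8, x is placed at relative address 0, y at 4, and z at 12, and the procedure needs 16 bytes in total.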
Example - Generating 3-address code using gen
Addressing Array Elements
Array elements are stored in consecutive locations for quick access
In C and Java, the n elements of an array are indexed 0, 1, ···, n − 1
Address of A[i] is base + i × w
base is relative address of A[0]
w is width of each array element
For a 2-d array A[n1][n2], address of A[i1][i2] is base + (i1 × n2 + i2) × w
For a k-d array A[n1][n2] ··· [nk], address of A[i1][i2] ··· [ik] is
base + ((··· ((i1 × n2 + i2) × n3 + i3) ···) × nk + ik) × w
In general, the elements of a 1-d array may be indexed low, low + 1, ···, high
Number of elements, n = high − low + 1
base = addr(A[low])
addr(A[i]) = base + (i − low) × w
Let addr(A[i]) = i × w + c
Compile time precalculation c = base − low × w
c = base when low = 0
c is saved in symbol table entry for A
Hence using c from symbol table relative address of A[i] is i × w + c
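Both address formulas can be checked with a few lines of Python. The function names `addr_1d` and `addr_kd` and the sample numbers are assumptions for illustration. The k-d version assumes zero-based indices and row-major layout, as in the formula above.

```python
def addr_1d(base, low, w, i):
    """Address of A[i] when indices run from low upward."""
    c = base - low * w          # can be precomputed at compile time
    return i * w + c


def addr_kd(base, dims, w, idx):
    """Row-major address of A[i1]...[ik] for a zero-based array A[n1]...[nk]."""
    pos = 0
    for n, i in zip(dims, idx):
        pos = pos * n + i       # builds ((i1 * n2 + i2) * n3 + i3) ...
    return base + pos * w


a = addr_1d(base=100, low=1, w=4, i=3)
b = addr_kd(base=0, dims=[2, 3], w=4, idx=[1, 2])
```

Here `a` equals 100 + (3 − 1) × 4 = 108 and `b` equals (1 × 3 + 2) × 4 = 20. In `addr_kd` the first dimension n1 never affects the result, which matches the formula.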
Layouts of 2-d Array
Type Checking
Type Conversion
How should x + i be evaluated when x is float and i is int?
Conversion from int to float, e.g. for 2 * 3.14: t1 = (float) 2; t2 = t1 * 3.14;
Conversion is implicit (a coercion) if done automatically by the compiler
Conversion is explicit (a cast) if written by the programmer
In Java, widening conversions preserve information and narrowing
conversions can lose information
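A compiler can insert such coercions while generating three-address code. The Python sketch below is a rough illustration, not the lecture's translation scheme: the two-level hierarchy `RANK` and the helper names `widen` and `gen_add` are assumptions.

```python
RANK = {"int": 0, "float": 1}   # hypothetical widening order: int < float


def widen(addr, t, w, code):
    """Return an address holding `addr` converted from type t to type w."""
    if t == w:
        return addr                           # no conversion needed
    temp = f"t{len(code) + 1}"
    code.append(f"{temp} = ({w}) {addr}")     # coercion inserted by the compiler
    return temp


def gen_add(a1, t1, a2, t2, code):
    """Emit code for a1 + a2, widening the narrower operand first."""
    result_type = max(t1, t2, key=RANK.get)
    left = widen(a1, t1, result_type, code)
    right = widen(a2, t2, result_type, code)
    temp = f"t{len(code) + 1}"
    code.append(f"{temp} = {left} + {right}")
    return temp, result_type


code = []
result = gen_add("x", "float", "i", "int", code)
```

For x + i with x float and i int, the sketch emits `t1 = (float) i` followed by `t2 = x + t1`, and the result type is float.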
Type Conversions into Expression Evaluation
Overloading of Functions and Operators
Type Inference and Polymorphic Functions
Polymorphic - any code fragment that can be executed with arguments of
different types
Parametric polymorphism - polymorphism characterized by parameters or type variables
Example - ML program for finding length of a list
fun length(x) =
if null(x) then 0 else length(tl(x)) + 1;
length(["sun", "mon", "tue"]) + length([10, 9, 8, 7])
evaluates to 3 + 4 = 7
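For comparison, the same function can be written in Python with an explicit type variable. This is a loose analogue rather than a translation of ML's type inference: the annotation `Sequence[T]` is an assumption used to make the type parameter visible, and Python does not enforce it at run time.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")   # type variable: length accepts a list of any element type


def length(x: Sequence[T]) -> int:
    """Recursive length, mirroring the ML definition with null and tl."""
    return 0 if not x else length(x[1:]) + 1


total = length(["sun", "mon", "tue"]) + length([10, 9, 8, 7])
```

The one definition serves both a list of strings and a list of integers, so `total` is 3 + 4 = 7.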
Abstract syntax tree for the function definition