Compiler Unit 5 Notes
UNIT - V
CODE GENERATOR
A code generator converts the intermediate representation of the source code into a form that can be readily executed by the machine. It is expected to produce correct code, and it should be designed so that it can be easily implemented, tested, and maintained. For example, consider the three-address statements
P := Q + R
S := P + T
A straightforward target-code sequence for them is
MOV Q, R0
ADD R, R0
MOV R0, P
MOV P, R0
ADD T, R0
MOV R0, S
Here the fourth statement is redundant: it reloads the value of P that the previous statement has just stored, so it leads to an inefficient code sequence. A given intermediate representation can be translated into many code sequences, with significant cost differences between the different implementations. Prior knowledge of instruction costs is needed in order to design good sequences, but accurate cost information is difficult to predict.
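This particular redundancy (a reload immediately after a store of the same value) can be removed by a simple peephole check, provided the reloading instruction is not a jump target. The following Java fragment is only an illustrative sketch (not part of the original notes); it assumes target instructions are plain strings of the form MOV src, dst.
import java.util.*;

public class RedundantLoadStore {
    // Drop "MOV a, R" when it immediately follows "MOV R, a":
    // the value of a is already in register R, so reloading it is redundant.
    public static List<String> removeRedundantLoads(List<String> code) {
        List<String> out = new ArrayList<>();
        for (String inst : code) {
            if (!out.isEmpty() && isRedundantLoad(out.get(out.size() - 1), inst)) continue;
            out.add(inst);
        }
        return out;
    }

    private static boolean isRedundantLoad(String prev, String cur) {
        String[] p = operandsOf(prev), c = operandsOf(cur);
        // previous: MOV R, a (store)   current: MOV a, R (reload of the same value)
        return p != null && c != null && p[0].equals(c[1]) && p[1].equals(c[0]);
    }

    private static String[] operandsOf(String inst) {
        if (!inst.trim().startsWith("MOV ")) return null;
        String[] ops = inst.trim().substring(4).split(",");
        return ops.length == 2 ? new String[]{ ops[0].trim(), ops[1].trim() } : null;
    }
}
Applied to the six instructions above, this pass removes the fourth one (MOV P, R0).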
As the number of variables increases, the optimal assignment of registers to variables becomes difficult; mathematically, the problem is NP-complete. Certain machines require register pairs, consisting of an even register and the next odd-numbered register, for some operands and results. For example, in the multiplication instruction
M a, b
a, the multiplicand, is the even register and b, the multiplier, is the odd register of the even/odd register pair.
6. Evaluation order:
The code generator decides the order in which the instructions will be executed. The order of computations affects the efficiency of the target code: among the many possible evaluation orders, some require fewer registers to hold intermediate results. However, picking the best order in the general case is a difficult, NP-complete problem.
7. Approaches to code generation issues: The code generator must always generate correct code. This is essential because of the number of special cases a code generator may face. Some of the design goals of a code generator are:
• Correct
• Easily implementable
• Testable
• Maintainable
BASIC BLOCKS
A basic block is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end without halt or possibility of branching except at the end. The following sequence of three-address statements forms a basic block:
t1 := a*a
t2 := a*b
t3 := 2*t2
t4 := t1+t3
t5 := b*b
t6 := t4+t5
A three-address statement x := y+z is said to define x and to use y and z. A name in a basic block is said to be live at a given point if its value is used after that point in the program, perhaps in another basic block.
The following algorithm can be used to partition a sequence of three-address statements into
basic blocks.
Algorithm 1: Partition into basic blocks.
Input: A sequence of three-address statements.
Output: A list of basic blocks with each three-address statement in exactly one block.
Method:
1. We first determine the set of leaders, the first statements of basic blocks. The rules we use are the following:
(I) The first statement is a leader.
(II) Any statement that is the target of a conditional or unconditional goto is a leader.
(III) Any statement that immediately follows a goto or conditional goto statement is a leader.
2. For each leader, its basic block consists of the leader and all statements up to but not
including the next leader or the end of the program.
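As an illustration (not part of the algorithm as stated), the following Java sketch finds the leaders and cuts the statement list at each one. It assumes each three-address statement is a plain string and that jump targets are written as 1-based statement numbers, e.g. "if i<=20 goto 3".
import java.util.*;

public class BasicBlockPartitioner {
    // Partition a list of three-address statements into basic blocks
    // following Algorithm 1: find the leaders, then cut at each leader.
    public static List<List<String>> partition(List<String> stmts) {
        Set<Integer> leaders = new TreeSet<>();
        if (!stmts.isEmpty()) leaders.add(0);                 // rule (I): the first statement
        for (int i = 0; i < stmts.size(); i++) {
            int g = stmts.get(i).indexOf("goto");
            if (g >= 0) {
                int target = Integer.parseInt(stmts.get(i).substring(g + 4).trim());
                leaders.add(target - 1);                      // rule (II): target of a jump
                if (i + 1 < stmts.size()) leaders.add(i + 1); // rule (III): statement after a jump
            }
        }
        List<Integer> starts = new ArrayList<>(leaders);
        List<List<String>> blocks = new ArrayList<>();
        for (int k = 0; k < starts.size(); k++) {             // each block runs up to the next leader
            int end = (k + 1 < starts.size()) ? starts.get(k + 1) : stmts.size();
            blocks.add(new ArrayList<>(stmts.subList(starts.get(k), end)));
        }
        return blocks;
    }
}
Applied to the twelve statements of the example below (with jump targets written as plain numbers), this yields two blocks: statements (1)-(2) and statements (3)-(12).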
Example 3: Consider the fragment of source code shown in fig. 7; it computes the dot product
of two vectors a and b of length 20. A list of three-address statements performing this
computation on our target machine is shown in fig. 8.
begin
    prod := 0;
    i := 1;
    do begin
        prod := prod + a[i] * b[i];
        i := i + 1;
    end
    while i <= 20
end
Let us apply Algorithm 1 to the three-address code in fig. 8 to determine its basic blocks. Statement (1) is a leader by rule (I), and statement (3) is a leader by rule (II), since the last statement can jump to it. By rule (III), the statement following (12) is a leader. Therefore, statements (1) and (2) form one basic block, and the remainder of the program, beginning with statement (3), forms a second basic block. The three-address code of fig. 8 is:
(1) prod := 0
(2) i := 1
(3) t1 := 4*i
(4) t2 := a[t1]
(5) t3 := 4*i
(6) t4 := b[t3]
(7) t5 := t2*t4
(8) t6 := prod + t5
(9) prod := t6
(10) t7 := i + 1
(11) i := t7
(12) if i <= 20 goto (3)
TRANSFORMATIONS ON BASIC BLOCKS
1. Common subexpression elimination
Consider a block such as
a := b + c
b := a - d
c := b + c
d := a - d
Here the second statement redefines b. Therefore, the value of b in the 3rd statement is different from the value of b in the 1st, and the 1st and 3rd statements do not compute the same expression.
2. Dead-code elimination
Suppose x is dead, that is, never subsequently used, at the point where the statement x:=
y+z appears in a basic block. Then this statement may be safely removed without
changing the value of the basic block.
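As an illustration (not from the original notes), the following Java sketch removes dead assignments from one basic block by scanning it backwards; it assumes every statement has the simple form x = y op z and that the set of names live on exit from the block is known.
import java.util.*;

public class DeadCodeElim {
    // Remove statements whose target is not live, walking the block backwards.
    public static List<String> eliminate(List<String> block, Set<String> liveOnExit) {
        Set<String> live = new HashSet<>(liveOnExit);
        List<String> kept = new ArrayList<>();
        for (int i = block.size() - 1; i >= 0; i--) {
            String stmt = block.get(i);
            String[] parts = stmt.split("=");
            String target = parts[0].trim();
            if (!live.contains(target)) continue;             // dead: its value is never used later
            kept.add(0, stmt);
            live.remove(target);                               // the definition kills the target
            for (String operand : parts[1].trim().split("[+\\-*/ ]+")) {
                if (!operand.isEmpty() && !operand.matches("\\d+"))
                    live.add(operand);                         // operands become live before this point
            }
        }
        return kept;
    }
}
For instance, with liveOnExit = {"y"}, the block x = a + b; y = a + a keeps only the second statement.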
3. Renaming temporary variables
Suppose we have a statement t:= b+c, where t is a temporary. If we change this statement to
u:= b+c, where u is a new temporary variable, and change all uses of this instance of t to u,
then the value of the basic block is not changed.
4. Interchange of statements
Suppose we have a block with the two adjacent statements
t1 := b + c
t2 := x + y
Then we can interchange the two statements without affecting the value of the block if and only if neither x nor y is t1 and neither b nor c is t2.
When a DAG is constructed for a basic block, a node is reused only if an existing node has the same children in the same order and is labeled with the same operation. Consider computing the DAG for the following block of code.
a=b+c
c=a+x
d=b+c
b=a+x
The DAG construction is explained step by step below; an illustrative sketch in code follows the steps.
1. First we construct leaves with the initial values.
2. Next we process a = b + c. This produces a node labeled + with a attached and having b0
and c0 as children.
3. Next we process c = a + x.
4. Next we process d = b + c. Although we have already computed b + c in the first statement,
the c's are not the same, so we produce a new node.
5. Then we process b = a + x. Since we have already computed a + x in statement 2, we do not
produce a new node, but instead attach b to the old node.
6. Finally, we tidy up and erase the unused initial values.
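The node-reuse rule in steps 2-5 can be made concrete with a short sketch. The following Java fragment is only an illustration (the class and method names are made up, and only binary operators are handled): it keys a table by (operator, left child, right child) so that an assignment reuses an existing node whenever one with the same operator and the same children in the same order already exists.
import java.util.*;

class DagNode {
    final String op;                               // operator for interior nodes, null for leaves
    final DagNode left, right;
    final List<String> names = new ArrayList<>();  // identifiers currently attached to this node
    DagNode(String op, DagNode left, DagNode right) {
        this.op = op; this.left = left; this.right = right;
    }
}

public class DagBuilder {
    private final Map<String, DagNode> current = new HashMap<>();    // name -> node holding its current value
    private final Map<List<Object>, DagNode> made = new HashMap<>(); // (op, left, right) -> existing node

    // Node that currently holds the value of 'name'; create a leaf for its initial value if needed.
    private DagNode valueOf(String name) {
        return current.computeIfAbsent(name, n -> {
            DagNode leaf = new DagNode(null, null, null);
            leaf.names.add(n + "0");               // e.g. b0, c0: the initial value of the variable
            return leaf;
        });
    }

    // Process a statement "target = a op b".
    public void assign(String target, String a, String op, String b) {
        DagNode l = valueOf(a), r = valueOf(b);
        DagNode node = made.computeIfAbsent(List.of(op, l, r), k -> new DagNode(op, l, r));
        for (DagNode n : current.values()) n.names.remove(target);  // target no longer names its old node
        node.names.add(target);
        current.put(target, node);
    }
}
Calling assign("a","b","+","c"), assign("c","a","+","x"), assign("d","b","+","c") and assign("b","a","+","x") in turn reproduces the steps above: the third call builds a new node because c's node has changed, and the fourth call finds the existing a + x node and simply attaches b to it.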
You might think that with only three computation nodes in the DAG, the block could be reduced
to three statements (dropping the computation of b). However, this is wrong. Only if b is dead
on exit can we omit the computation of b. We can, however, replace the last statement with the
simpler b = c. Sometimes a combination of techniques finds improvements that no single
technique would find. For example if a-b is computed, then both a and b are incremented by
one, and then a-b is computed again, it will not be recognized as a common subexpression even
though the value has not changed. However, when combined with various algebraic
transformations, the common value can be recognized.
Directed Acyclic Graph
A Directed Acyclic Graph (DAG) is a tool that depicts the structure of basic blocks, helps to see the flow of values among the basic blocks, and offers opportunities for optimization. A DAG provides easy transformation of basic blocks. A DAG can be understood here:
• Leaf nodes represent identifiers, names, or constants.
• Interior nodes represent operators.
• Interior nodes also represent the results of expressions or the identifiers/names where the values are to be stored or assigned.
Example:
t0 = a + b
t1 = t0 + c
d = t0 + t1
(Figure: the DAG for this block has one node labeled t0 = a + b, one labeled t1 = t0 + c with the t0 node as a child, and one labeled d = t0 + t1 with the t0 and t1 nodes as children.)
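Using the illustrative DagBuilder sketch given earlier (an assumption of this write-up, not part of the notes), the block would be entered as follows; the t0 node is shared as a child of both the t1 node and the d node.
DagBuilder dag = new DagBuilder();
dag.assign("t0", "a", "+", "b");
dag.assign("t1", "t0", "+", "c");
dag.assign("d", "t0", "+", "t1");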
To summarize, while generating target code the code generator should take the following things into consideration:
• Target language : The code generator has to be aware of the nature of the target language for which
the code is to be transformed. That language may facilitate some machine-specific instructions to help
the compiler generate the code in a more convenient way. The target machine can have either CISC or
RISC processor architecture.
• IR Type : Intermediate representation has various forms. It can be in Abstract Syntax Tree (AST)
structure, Reverse Polish Notation, or 3-address code.
• Selection of instruction : The code generator takes the intermediate representation as input and converts (maps) it into the target machine’s instruction set. One IR construct can often be translated by several different instruction sequences, so it becomes the responsibility of the code generator to choose the appropriate instructions wisely.
• Register allocation : A program has a number of values to be maintained during the execution.
The target machine’s architecture may not allow all of the values to be kept in the CPU memory
or registers. Code generator decides what values to keep in the registers. Also, it decides the
registers to be used to keep these values.
• Ordering of instructions : Finally, the code generator decides the order in which the instructions will be executed. It creates a schedule for the instructions to execute them.
Descriptors
The code generator has to track both the registers (for availability) and addresses (location of values) while
generating the code. For both of them, the following two descriptors are used:
• Register descriptor : Register descriptor is used to inform the code generator about the availability
of registers. Register descriptor keeps track of values stored in each register. Whenever a new
register is required during code generation, this descriptor is consulted for register availability.
• Address descriptor : Values of the names (identifiers) used in the program might be stored at
different locations while in execution. Address descriptors are used to keep track of memory
locations where the values of identifiers are stored. These locations may include CPU registers,
heaps, stacks, memory or a combination of the mentioned locations.
The code generator keeps both descriptors updated in real time. For a load statement LD R1, x, the code generator updates the register descriptor of R1 to record that it holds the value of x, and updates the address descriptor of x to show that one copy of x is now in R1.
Code Generation
Basic blocks consist of sequences of three-address instructions, and the code generator takes these sequences of instructions as input.
Note : If the value of a name is found at more than one place (register, cache, or memory), the register’s
value will be preferred over the cache and main memory. Likewise, the cache's value will be preferred over main memory; main memory is given the least preference.
getReg : Code generator uses getReg function to determine the status of available registers and the location of
name values. getReg works as follows:
• If variable Y is already in register R, it uses that register.
• Else if some register R is available, it uses that register.
• Else if both the above options are not possible, it chooses a register that requires the minimal number of load and store instructions.
For an instruction x = y OP z, the code generator may perform the following actions (a small sketch follows the list). Let us assume that L is the location (preferably a register) where the output of y OP z is to be saved:
• Call getReg to determine the location L.
• Determine the present location (register or memory) of y by consulting the address descriptor of y. If y is not already in L, generate the following instruction to copy the value of y into L:
MOV y’, L
• Determine the present location of z using the same method used in the previous step for y and generate the following instruction:
OP z’, L
• Update the descriptors so that L is recorded as holding the value of x.
• If y and z have no further use, they can be given back to the system.
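A highly simplified sketch of these actions is shown below. It is an illustration only, not the full algorithm: the two-register machine, the register names and the MOV mnemonic are assumptions made for the example, and stale entries left in the descriptors are ignored.
import java.util.*;

public class SimpleCodeGen {
    private final Map<String, String> registerDesc = new HashMap<>(); // register -> name whose value it holds
    private final Map<String, String> addressDesc  = new HashMap<>(); // name -> current location (register or memory)

    // Generate target code for the three-address statement x = y OP z.
    public List<String> generate(String x, String y, String op, String z) {
        List<String> code = new ArrayList<>();
        String L = getReg(y);                            // step 1: pick a location for the result
        String yLoc = addressDesc.getOrDefault(y, y);    // step 2: bring y into L if it is not already there
        if (!yLoc.equals(L)) code.add("MOV " + yLoc + ", " + L);
        String zLoc = addressDesc.getOrDefault(z, z);    // step 3: apply the operation to z
        code.add(op + " " + zLoc + ", " + L);
        registerDesc.put(L, x);                          // step 4: update both descriptors
        addressDesc.put(x, L);
        return code;
    }

    // getReg: reuse y's register if it already has one, otherwise take a free register,
    // otherwise fall back to R0 (a real allocator would pick the cheapest register to spill).
    private String getReg(String y) {
        String loc = addressDesc.get(y);
        if (loc != null && loc.startsWith("R")) return loc;
        for (String r : List.of("R0", "R1")) {
            if (!registerDesc.containsKey(r)) return r;
        }
        return "R0";
    }
}
For example, new SimpleCodeGen().generate("t1", "a", "ADD", "b") produces MOV a, R0 followed by ADD b, R0, and records that t1 now lives in R0.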
Other code constructs, like loops and conditional statements, are transformed into assembly language in the usual way.
Peephole Optimization
This optimization technique works locally on a small window of the code at hand, transforming it into more efficient code; it can be applied to intermediate code as well as to target code. A group of statements is analyzed and checked for the following possible optimizations. In particular, unnecessary jumps can be eliminated in either the intermediate code or the target code by the following types of peephole optimizations. We can replace the jump sequence
goto L1
....
L1 : goto L2
by the sequence
goto L2
....
L1 : goto L2
If there are now no jumps to L1, then it may be possible to eliminate the statement L1 : goto L2, provided it is preceded by an unconditional jump. Similarly, the sequence
if a < b goto L1
....
L1 : goto L2
can be replaced by
if a < b goto L2
....
L1 : goto L2
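This redirection of jumps whose target is itself an unconditional goto can be sketched as a small peephole pass. The Java fragment below is an illustration only; it assumes each target-code line is a plain string and that labels are written in the form L1 :.
import java.util.*;

public class JumpPeephole {
    // Rewrite "goto L1" (conditional or unconditional) as "goto L2"
    // whenever the statement labelled L1 is itself "L1 : goto L2".
    public static List<String> redirectJumps(List<String> code) {
        Map<String, String> forward = new HashMap<>();       // label -> where its goto jumps
        for (String line : code) {
            String t = line.trim();
            int colon = t.indexOf(':');
            if (colon > 0) {
                String rest = t.substring(colon + 1).trim();
                if (rest.startsWith("goto "))
                    forward.put(t.substring(0, colon).trim(), rest.substring(5).trim());
            }
        }
        List<String> out = new ArrayList<>();
        for (String line : code) {
            String rewritten = line;
            int g = line.lastIndexOf("goto ");
            if (g >= 0) {
                String target = line.substring(g + 5).trim();
                if (forward.containsKey(target))
                    rewritten = line.substring(0, g) + "goto " + forward.get(target);
            }
            out.add(rewritten);
        }
        return out;
    }
}
A statement such as L1 : goto L2 that is no longer the target of any jump can then be removed, as described above.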
Finally, suppose there is only one jump to L1 and L1 is preceded by an unconditional goto. Then the sequence
goto L1
....
L1: if a < b goto L2
L3:                                                  (1)
may be replaced by
if a < b goto L2
goto L3
....
L3:                                                  (2)
While the number of instructions in (1) and (2) is the same, we sometimes skip the unconditional jump in (2), but never in (1). Thus (2) is superior to (1) in execution time.
GARBAGE COLLECTION: REFERENCE COUNTING
An object is considered to be garbage when no references to that object exist. But how can we tell when no
references to an object exist? A simple expedient is to keep track in each object of the total
number of references to that object. That is, we add a special field to each object called a
reference count . The idea is that the reference count field is not accessible to the Java program.
Instead, the reference count field is updated by the Java virtual machine itself.
Consider the statement
Object p = new Integer (57);
which creates a new instance of the Integer class. Only a single variable, p, refers to the
object. Thus, its reference count should be one.
In general, every time one reference variable is assigned to another, it may be necessary to
update several reference counts. Suppose p and q are both reference variables. The
assignment
p = q;
would be implemented by the Java virtual machine as follows:
if (p != q)                      // nothing to do if p and q already refer to the same object
{
    if (p != null)
        --p.refCount;            // p gives up its reference to the object it used to refer to
    p = q;
    if (p != null)
        ++p.refCount;            // the object that q refers to gains a reference
}
For example suppose p and q are initialized as follows:
Object p = new Integer (57);
Object q = new Integer (99);
As shown in Figure (a), two Integer objects are created, each with a reference count of one. Now, suppose we assign q to p using the code sequence given above. Figure (b) shows that after the assignment both p and q refer to the same object; its reference count is two, and the reference count of Integer(57) has gone to zero, which indicates that it is garbage.
Figure: Reference counts before and after the assignment p = q.
The costs of using reference counts are twofold: First, every object requires the special
reference count field. Typically, this means an extra word of storage must be allocated in
each object. Second, every time one reference is assigned to another, the reference counts must be adjusted as above. This significantly increases the time taken by assignment statements.
The advantage of using reference counts is that garbage is easily identified. When it becomes
necessary to reclaim the storage from unused objects, the garbage collector needs only to
examine the reference count fields of all the objects that have been created by the program. If
the reference count is zero, the object is garbage.
It is not necessary to wait until there is insufficient memory before initiating the garbage
collection process. We can reclaim memory used by an object immediately when its
reference count goes to zero. Consider what happens if we implement the Java assignment p = q in the Java virtual machine as follows:
if (p != q)
{
    if (p != null)
        if (--p.refCount == 0)
            heap.release (p);    // reclaim the old object as soon as its count reaches zero
    p = q;
    if (p != null)
        ++p.refCount;
}
Notice that the release method is invoked immediately when the reference count of an object
goes to zero, i.e., when it becomes garbage. In this way, garbage may be collected
incrementally as it is created.
TEXT BOOKS:
1. Compilers: Principles, Techniques and Tools, Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman, 2nd ed., Pearson, 2007.
2. Principles of Compiler Design, V. Raghavan, 2nd ed., TMH, 2011.
3. Principles of Compiler Design, Nandini Prasad, 2nd ed., Elsevier.
REFERENCE BOOKS:
1. http://www.nptel.iitm.ac.in/downloads/106108052/
2. Compiler Construction: Principles and Practice, Kenneth C. Louden, Cengage.
3. Implementations of Compiler: A New Approach to Compilers Including the Algebraic Methods, Yunlin Su, Springer.