
CHAPTER SEVEN

Code Generation

1
Outline
– Position of code generator
– Code Generation
– Issues in the Design of a Code Generator
– Input to the Code Generator
– The Target Program
– Instruction Selection
– Register Allocation
– Choice of Evaluation Order
– A Simple Target Machine Model
– Program and Instruction Costs
2
Outline

– Basic blocks and flow graphs
– A simple code generator
– Peephole optimization

3
Position of a Code Generator
Source Program → Front-End → IC → Code Optimizer → IC → Code Generator → Target Program

The front end detects lexical, syntax, and semantic errors; all phases consult the Symbol Table.

4
Code Generation
• The final phase in our compiler model is the code generator.
• It takes as input the intermediate representation (IR)
produced by the front end of the compiler, along with
relevant symbol table information, and produces as output a
semantically equivalent target program.

• Requirements imposed on a code generator:
– Preserve the semantic meaning of the source program and produce
code of high quality.
– Make effective use of the available resources of the target machine.
– The code generator itself must run efficiently.
• A code generator has three primary tasks:
– Instruction selection, register allocation and assignment, and
instruction ordering
5
Issues in the Design of a Code Generator
• General tasks in almost all code generators:
– instruction selection,
– register allocation and assignment and
– instruction ordering
• The details are also dependent on:
– the specifics of the intermediate representation,
– the target language, and
– the run-time system.

• The most important criterion for a code


generator is that it should produce correct code.
6
Issues in the Design of a Code Generator

• Input to the Code Generator


• The Target Program
• Instruction Selection
• Register Allocation
• Choice of Evaluation Order

7
Input to the Code Generator
• The input to the code generator is
– the intermediate representation of the source program produced
by the frontend along with
– information in the symbol table that is used to determine the run-
time address of the data objects denoted by the names in the IR.
• Choices for the IR
– Three-address representations: quadruples, triples, indirect triples
– Virtual machine representations: such as byte codes and stack-
machine code
– Linear representations: such as postfix notation
– Graphical representation: such as syntax trees and DAG’s
• Assumptions
– The front end has scanned, parsed, and translated the source program into a relatively low-level IR.
– All syntactic and static semantic errors have already been detected.

8
The Target Program
• The most common target-machine architectures
are RISC, CISC, and stack based.

– A RISC machine typically has many registers, three-address


instructions, simple addressing modes, and a relatively
simple instruction-set architecture.
– A CISC machine typically has few registers, two-address
instructions, a variety of addressing modes, several
register classes, and variable-length instructions.

– In a stack-based machine, operations are done by pushing


operands onto a stack and then performing the operations
on the operands at the top of the stack.

9
The Target Program…
• Producing the target program as
– Absolute machine code (executable code)
– Relocatable machine code (Object files for linker and
loader)
– Assembly language (assembler)
– Byte code forms for interpreters (e.g. JVM)

• In this chapter
– Use very simple RISC-like computer as the target
machine.
– Add some CISC-like addressing modes
– Use assembly code as the target language.
10
Instruction Selection
• The code generator must map the IR program
into a code sequence that can be executed by
the target machine.
• The complexity of the mapping is determined by
factors such as:

– The level of the IR


– The nature of the instruction-set architecture
– The desired quality of the generated code

11
Instruction Selection…
• If the IR is high level, the code generator may use code
templates to translate each IR statement into a sequence of
machine instructions.
– This tends to produce poor code that needs further optimization.

• If the IR reflects some of the low-level details of the
underlying machine, then the code generator can use this
information to generate more efficient code sequences.

12
Instruction Selection…
• The nature of the instruction set of the target machine has a strong
effect on the difficulty of instruction selection. For example,
– The uniformity and completeness of the instruction set are important
factors.
– Instruction speeds are another important factor.
• If we do not care about the efficiency of the target program, instruction
selection is straightforward.
x = y + z   →   LD  R0, y
                ADD R0, R0, z
                ST  x, R0

a = b + c   →   LD  R0, b
d = a + e       ADD R0, R0, c
                ST  a, R0
                LD  R0, a       // redundant: a is already in R0
                ADD R0, R0, e
                ST  d, R0
13
Instruction Selection…
• Example: consider the following statement:
x := x + 1
• Use ADD instruction (straightforward)
– costly
• Use INC instruction
– Less costly

• A straightforward translation may not always be the
best one; it can lead to unacceptably inefficient
target code.

14
Instruction Selection…
• Suppose we translate three-address code:
x := y + z      LD  R0, y
                ADD R0, z
                ST  x, R0

x := x + 1      LD  R0, x
                ADD R0, #1      cost = 6
                ST  x, R0

Better:         ADD x, #1       cost = 3

Best:           INC x           cost = 2

15
Register Allocation
• Efficient and careful management of registers results in a faster
program.
• A key problem in code generation is deciding what values to hold in
what registers.
– Use of registers imposes two problems:
• Register allocation: select the variables that will reside in registers.
• Register assignment: pick the register that a variable will reside in.
• Finding an optimal assignment of registers to variables
is mathematically difficult.
• In addition, the hardware/OS may require some register
usage rules to be followed.
16
Register Allocation…
t := a * b                  t := a * b
t := t + a                  t := t + a
t := t / d                  t := t / d

{ R1 = t }                  { R0 = a, R1 = t }

LD  R1, a                   LD  R0, a
MUL R1, b                   LD  R1, R0
ADD R1, a                   MUL R1, b
DIV R1, d                   ADD R1, R0
ST  t, R1                   DIV R1, d
                            ST  t, R1

17
Choice of Evaluation Order
• The order in which computations are performed can
affect the efficiency of the target code.

• Some computation orders require fewer registers to hold


intermediate results than others.

• However, selection of the best evaluation order is also


mathematically difficult.

18
Choice of Evaluation Order…
• When instructions are independent, their evaluation order
can be changed.

a + b – (c + d) * e

Original order:           Generated code:
t1 := a + b               LD  R0, a
t2 := c + d               ADD R0, b
t3 := e * t2              ST  t1, R0
t4 := t1 – t3             LD  R1, c
                          ADD R1, d
                          LD  R0, e
                          MUL R0, R1
                          LD  R1, t1
                          SUB R1, R0
                          ST  t4, R1

Reordered:                Generated code:
t2 := c + d               LD  R0, c
t3 := e * t2              ADD R0, d
t1 := a + b               LD  R1, e
t4 := t1 – t3             MUL R1, R0
                          LD  R0, a
                          ADD R0, b
                          SUB R0, R1
                          ST  t4, R0
19
A Simple Target Machine Model
• Implementing code generation requires complete
understanding of the target machine architecture and its
instruction set.

• Our (hypothetical) machine:


– Byte-addressable (word = 4 bytes)
– Has n general purpose registers R0, R1, …, Rn-1
– All operands are integers
– Three-address instructions of the form op dest, src1, src2
• Assume the following kinds of instructions are available:
– Load operations
– Store operations
– Computation operations
– Unconditional jumps
– Conditional jumps
20
Load operations
• The instruction LD dst, addr loads the value in
location addr into location dst.

• This instruction denotes the assignment dst = addr.


• The most common form of this instruction is LD r, x
which loads the value in location x into register r.

• An instruction of the form LD r1, r2 is a register-to-


register copy in which the contents of register r2 are
copied into register r1.

21
Store operations

• The instruction ST x, r
• stores the value in register r into the location x.

• This instruction denotes the assignment x = r.

22
Computation operations
• Has the form OP dst, src1, src2 ,
• where OP is an operator like ADD or SUB, and
dst, src1, src2 are locations, not necessarily distinct.

• The effect of this machine instruction is to apply the operation


represented by OP to the values in locations src1 and src2, and
place the result of this operation in location dst.
• For example, SUB r1 , r2 , r3 computes r1 = r2 – r3 Any value
formerly stored in r1 is lost,
• but if r1 is r2 or r3 the old value is read first.
• Unary operators that take only one operand do not have a
src2.
23
Unconditional Jumps
• The instruction
BR L

causes control to branch to the machine


instruction with label L.

(BR stands for branch.)

24
Conditional Jumps
• Has the form Bcond r, L,
where: r is a register,
L is a label, and
cond is any of the common tests on values in the
register r.

• For example:
BLTZ r, L
causes a jump to label L if the value in register r is less
than zero, and allows control to pass to the next machine
instruction
if not.
25
The Target Machine: Addressing
Modes
• We assume that our target machine has a variety of
addressing modes:
– In instructions, a location can be a variable name x referring to
the memory location that is reserved for x.
– Indexed address, a(r), where a is a variable and r is a register.

LD R1, a(R2)        R1 = contents(a + contents(R2))


• This addressing mode is useful for accessing arrays.

– A memory location can be an integer indexed by a register, for


example,
LD R1, 100(R2) R1 = contents(100 + contents(R2))
• useful for following pointers
26
The Target Machine: Addressing
Modes
– Two indirect addressing modes: *r and *100(r)
LD R1, *100(R2)     R1 = contents(contents(100 + contents(R2)))
• Loading into R1 the value in the memory location stored in the memory
location obtained by adding 100 to the contents of register R2.

• Immediate constant addressing mode. The constant is


prefixed by #.
The instruction LD R1, #100 loads the integer 100 into register R1,
and ADD R1, R1, #100 adds the integer 100 to the contents of register R1
(R1 = R1 + 100).
• Comments at the end of instructions are preceded by //.

27
The Target Machine: Addressing
Modes
• Op-codes (op), for example
LD and ST (move content of source to destination)
ADD (add content of source to destination)
SUB (subtract content of source from dest.)

Addressing modes and their added cost:

Mode                Form     Address                       Added Cost
Absolute            M        M                             1
Register            R        R                             0
Indexed             a(R)     a + contents(R)               1
Indirect Register   *R       contents(R)                   0
Indirect Indexed    *a(R)    contents(a + contents(R))     1
Literal             #c       c                             1

28
A Simple Target language (assembly
language)
• Example:
x = y - z    →   LD  R1, y          // R1 = y
                 LD  R2, z          // R2 = z
                 SUB R1, R1, R2     // R1 = R1 - R2
                 ST  x, R1          // x = R1

b = a[i]     →   LD  R1, i          // R1 = i
                 MUL R1, R1, 8      // R1 = R1 * 8
                 LD  R2, a(R1)      // R2 = contents(a + contents(R1))
                 ST  b, R2          // b = R2

a[j] = c     →   LD  R1, c          // R1 = c
                 LD  R2, j          // R2 = j
                 MUL R2, R2, 8      // R2 = R2 * 8
                 ST  a(R2), R1      // contents(a + contents(R2)) = R1
29
A Simple Target language (assembly
language)
x = *p       →   LD  R1, p          // R1 = p
                 LD  R2, 0(R1)      // R2 = contents(0 + contents(R1))
                 ST  x, R2          // x = R2

*p = y       →   LD  R1, p          // R1 = p
                 LD  R2, y          // R2 = y
                 ST  0(R1), R2      // contents(0 + contents(R1)) = R2

if x < y goto L  →  LD  R1, x       // R1 = x
                    LD  R2, y       // R2 = y
                    SUB R1, R1, R2  // R1 = R1 - R2
                    BLTZ R1, L      // if R1 < 0 jump to L

30
Program and Instruction Costs
• Cost is associated with compiling and running a program.
• Cost measures are:
– The length of compilation time
– The size, running time, and power consumption of the target program

• For simplicity, we take:


the cost of an instruction = one + the costs associated
with the addressing modes of the operands.

• Addressing modes:
– involving registers have zero additional cost,
– involving a memory location or constant in them have an
additional cost of one.

31
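To make this cost model concrete, here is a minimal Python sketch (not part of the slides) that computes the cost of an instruction from the addressing-mode table given earlier. The operand syntax it recognizes (R0, *R0, #c, a(R2), *4(R0), plain memory names) and the helper names are assumptions for illustration.

import re

# Added cost per addressing mode, following the table given earlier.
def operand_cost(operand):
    op = operand.strip()
    if re.fullmatch(r"\*?R\d+", op):             # register / indirect register
        return 0
    if op.startswith("#"):                       # literal constant
        return 1
    if re.fullmatch(r"\*?\w+\(R\d+\)", op):      # indexed / indirect indexed
        return 1
    return 1                                     # absolute memory address

def instruction_cost(instr):
    # cost = 1 for the instruction itself + added cost of each operand
    _, _, rest = instr.partition(" ")
    operands = [o for o in rest.split(",") if o.strip()]
    return 1 + sum(operand_cost(o) for o in operands)

assert instruction_cost("LD R0, R1") == 1
assert instruction_cost("ST M, R0") == 2
assert instruction_cost("ST M, *4(R0)") == 3
assert instruction_cost("LD R0, #1") == 2

These asserts reproduce the costs in the table of examples on the next slide.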
Examples
Instruction           Operation                                              Cost
LD  R1, R0            Load contents(R0) into register R1                      1
ST  M, R0             Store contents(R0) into memory location M               2
LD  R0, M             Load contents(M) into register R0                       2
ST  M, 4(R0)          Store contents(4 + contents(R0)) into M                 3
ST  M, *4(R0)         Store contents(contents(4 + contents(R0))) into M       3
LD  R0, #1            Load the constant 1 into R0                             2
ADD *12(R1), 4(R0)    Add contents(4 + contents(R0)) to the value at
                      location contents(12 + contents(R1))                    3

32
Exercises
• Exercise 8.2.1
• Exercise 8.2.2
• Exercise 8.2.3
• Exercise 8.2.4
• Exercise 8.2.5
• Exercise 8.2.6

33
Basic Blocks and Flow Graphs
• Introduce a graph representation of intermediate
code that is helpful for discussing code generation;

• Useful for:
– Register allocation and Instruction selection
– Local and global optimization

• The representation is constructed as follows:


– Partition the intermediate code into basic blocks
– The basic blocks become the nodes of a flow graph,
whose edges indicate which blocks can follow which
other blocks.
34
Basic Blocks
• Definition
– A basic block is a maximal sequence of
consecutive statements with a single entry
point, a single exit point, and no internal
branches

• Basic unit in control flow analysis


• Local level of code optimizations
– Redundancy elimination
– Register-allocation
– Dead code elimination ...
35
Basic Block Example
• How many basic blocks are in this code fragment?
• What are they?

1)  i := m – 1
2)  j := n
3)  t1 := 4 * n
4)  v := a[t1]
5)  i := i + 1
6)  t2 := 4 * i
7)  t3 := a[t2]
8)  if t3 < v goto 5)
9)  j := j – 1
10) t4 := 4 * j
11) t5 := a[t4]
12) if t5 > v goto 9)
13) if i >= j goto 23)
14) t6 := 4 * i
15) x := a[t6]

36
Basic Block Example
• The leaders are statements 1), 5), 9), 13), and 14):
– 1) is the first statement,
– 5) and 9) are branch targets (of 8) and 12)), and
– 9), 13), and 14) immediately follow branches.
• The basic blocks in this fragment are therefore
{1–4}, {5–8}, {9–12}, {13}, and {14–15, …}.

37
Identify Basic Blocks
Input: A sequence of intermediate code
statements
1. Determine the leaders, the first statements
of basic blocks
• The first statement in the sequence (entry point)
is a leader
• Any statement that is the target of a branch
(conditional or unconditional) is a leader
• Any statement immediately following a branch
(conditional or unconditional) or a return is a
leader
2. For each leader, its basic block is the leader
and all statements up to, but not including,
the next leader or the end of the program.
38
Basic Block
Partitioning Algorithm
1. Identify leader statements (i.e. the first statements
of basic blocks) by using the following rules:

(i) The first statement in the program is a leader

(ii) Any statement that is the target of a branch


statement is a leader (for most intermediate
languages these are statements with
an associated label)

(iii) Any statement that immediately follows a branch


or return statement is a leader
39
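As an illustration of this partitioning algorithm, the sketch below (Python, not from the slides) finds leaders and groups statements into basic blocks for three-address code given as a list of strings. It only recognizes branch targets written as "goto (n)", matching the examples that follow; the representation and helper names are assumptions.

import re

def find_leaders(instrs):
    n = len(instrs)
    leaders = {1}                                 # rule (i): the first statement
    for idx, ins in enumerate(instrs, start=1):
        m = re.search(r"goto \((\d+)\)", ins)
        if m:                                     # conditional or unconditional branch
            target = int(m.group(1))
            if 1 <= target <= n:
                leaders.add(target)               # rule (ii): branch target
            if idx + 1 <= n:
                leaders.add(idx + 1)              # rule (iii): follows a branch
    return sorted(leaders)

def basic_blocks(instrs):
    leaders = find_leaders(instrs)
    bounds = leaders + [len(instrs) + 1]
    return [list(range(bounds[i], bounds[i + 1])) # statement numbers in each block
            for i in range(len(leaders))]

fragment = [
    "i := m - 1", "j := n", "t1 := 4 * n", "v := a[t1]",
    "i := i + 1", "t2 := 4 * i", "t3 := a[t2]", "if t3 < v goto (5)",
    "j := j - 1", "t4 := 4 * j", "t5 := a[t4]", "if t5 > v goto (9)",
    "if i >= j goto (23)", "t6 := 4 * i", "x := a[t6]",
]
print(find_leaders(fragment))     # [1, 5, 9, 13, 14]
print(basic_blocks(fragment))     # blocks: 1-4, 5-8, 9-12, 13, 14-15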
Example: Finding Leaders
The following code computes the inner product of two vectors.

Source code:                        Three-address code:
begin                               (1)  prod := 0
  prod := 0;                        (2)  i := 1
  i := 1;                           (3)  t1 := 4 * i
  do begin                          (4)  t2 := a[t1]
    prod := prod + a[i] * b[i];     (5)  t3 := 4 * i
    i := i + 1;                     (6)  t4 := b[t3]
  end                               (7)  t5 := t2 * t4
  while i <= 20                     (8)  t6 := prod + t5
end                                 (9)  prod := t6
                                    (10) t7 := i + 1
                                    (11) i := t7
                                    (12) if i <= 20 goto (3)
40
Example: Finding Leaders
The three-address code is as on the previous slide; statement (13) denotes
the first statement following the loop.

Rule (i): statement (1) is a leader, because it is the first statement of
the program.
41
Example: Finding Leaders
Rule (ii): statement (3) is a leader, because it is the target of the branch
in statement (12).
42
Example: Finding Leaders
Rule (iii): statement (13) is a leader, because it immediately follows the
branch in statement (12).
43
Forming the Basic Blocks
Now that we know the leaders, how do we form
the basic blocks associated with each leader?

2. The basic block corresponding to a leader


consists of the leader, plus all statements up to
but not including the next leader or up to the
end of the program.

44
Example: Forming the Basic Blocks

Basic Blocks:

B1:  (1)  prod := 0
     (2)  i := 1

B2:  (3)  t1 := 4 * i
     (4)  t2 := a[t1]
     (5)  t3 := 4 * i
     (6)  t4 := b[t3]
     (7)  t5 := t2 * t4
     (8)  t6 := prod + t5
     (9)  prod := t6
     (10) t7 := i + 1
     (11) i := t7
     (12) if i <= 20 goto (3)

B3:  (13) …
45
Example (More)
(1) i := m – 1 (16) t7 := 4 * i
(2) j := n (17) t8 := 4 * j
(3) t1 := 4 * n (18) t9 := a[t8]
(4) v := a[t1] (19) a[t7] := t9
(5) i := i + 1 (20) t10 := 4 * j
(6) t2 := 4 * i (21) a[t10] := x
(7) t3 := a[t2] (22) goto (5)
(8) if t3 < v goto (5) (23) t11 := 4 * i
(9) j := j - 1 (24) x := a[t11]
(10) t4 := 4 * j (25) t12 := 4 * i
(11) t5 := a[t4] (26) t13 := 4 * n
(12) If t5 > v goto (9) (27) t14 := a[t13]
(13) if i >= j goto (23) (28) a[t12] := t14
(14) t6 := 4*i (29) t15 := 4 * n
(15) x := a[t6] (30) a[t15] := x

46
Example: Leaders
Applying the rules to the code on the previous slide:
– statement (1) is a leader by rule (i);
– statements (5), (9), and (23) are leaders by rule (ii), being the targets
  of the branches in statements (8)/(22), (12), and (13), respectively;
– statements (9), (13), (14), and (23) are leaders by rule (iii), each
  immediately following a branch.

Leaders: (1), (5), (9), (13), (14), (23)
47
Example: Basic Blocks
With leaders (1), (5), (9), (13), (14), and (23), the basic blocks are:

B1: (1)–(4)
B2: (5)–(8)
B3: (9)–(12)
B4: (13)
B5: (14)–(22)
B6: (23)–(30)
48
Control Flow Graph (CFG)

A control flow graph (CFG), or simply a flow graph,


is a directed multigraph in which:
(i) the nodes are basic blocks; and
(ii) the edges represent flow of control
(branches or fall-through execution).

The basic block whose leader is the first intermediate


language statement is called the start node.

In a CFG we have no information about the data.


Therefore an edge in the CFG means that the
program may take that path.

49
Control Flow Graph (CFG)

There is a directed edge from basic block B1 to basic


block B2 in the CFG if:

(1) There is a branch from the last statement of B1 to the


first statement of B2, or
(2) Control flow can fall through from B1 to B2 because:
(i) B2 immediately follows B1, and
(ii) B1 does not end with an unconditional branch

50
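A small Python sketch of these two edge rules, building on the basic_blocks() sketch given earlier; blocks are lists of statement numbers and branch targets are written "goto (n)". This encoding is an assumption for illustration, not part of the slides.

import re

def build_cfg(instrs, blocks):
    leader_of = {blk[0]: i for i, blk in enumerate(blocks)}   # leader -> block index
    edges = set()
    for i, blk in enumerate(blocks):
        last = instrs[blk[-1] - 1]
        m = re.search(r"goto \((\d+)\)", last)
        if m and int(m.group(1)) in leader_of:
            edges.add((i, leader_of[int(m.group(1))]))        # rule (1): branch edge
        if not last.strip().startswith("goto") and i + 1 < len(blocks):
            edges.add((i, i + 1))                             # rule (2): fall-through
    return sorted(edges)

Fed the inner-product three-address code and its blocks B1, B2, B3, this yields the edges B1 → B2 (fall-through), B2 → B2 (branch at (12)), and B2 → B3 (fall-through), with B1 as the start node, as worked out on the next slides.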
Example: Control Flow Graph Formation

Using the basic blocks of the inner-product example
(B1 = (1)–(2), B2 = (3)–(12), B3 = (13) …):

Rule (2): there is a fall-through edge B1 → B2, since B2 immediately
follows B1 and B1 does not end with an unconditional branch.
51
Example : Control Flow Graph
Formation

Rule (1): there is an edge B2 → B2, since the conditional branch in
statement (12) targets statement (3), the leader of B2.
52
Example : Control Flow Graph
Formation

Rule (2): there is a fall-through edge B2 → B3, since B3 immediately
follows B2 and B2 ends with a conditional (not unconditional) branch.

The resulting flow graph: B1 → B2, B2 → B2 (loop), B2 → B3.
53
Example: CFG (More)
For the 30-statement code partitioned earlier into B1 = (1)–(4),
B2 = (5)–(8), B3 = (9)–(12), B4 = (13), B5 = (14)–(22), B6 = (23)–(30),
the flow graph has the edges:

B1 → B2   (fall-through)
B2 → B2   (branch at (8))
B2 → B3   (fall-through)
B3 → B3   (branch at (12))
B3 → B4   (fall-through)
B4 → B5   (fall-through)
B4 → B6   (branch at (13))
B5 → B2   (goto at (22))

B1 is the start node.
54
A Simple Code Generator

• In this section, we shall consider an algorithm


that generates code for a single basic block.

• It considers each three-address instruction in


turn, and keeps track of what values are in what
registers.

• So it can avoid generating unnecessary loads


and stores.

55
A Simple Code Generator
• One of the primary issues during code generation:
deciding how to use registers to best advantage
• Four principal uses of registers:
– In most machine architectures, some or all of the operands
of an operation must be in registers in order to perform the
operation.
– Registers make good temporaries to hold the result of a
subexpression or a variable that is used only within a
single basic block.
– Registers are used to hold (global) values that are
computed in one basic block and used in other blocks.
– Registers are often used to help with run-time storage
management (e.g., stack-pointer).

56
A Simple Code Generator
• Assumption of the code-generation algorithm in this
section:
– Some set of registers is available to hold the values that are
used within the block.
– The basic block has already been transformed into a
preferred sequence of three-address instructions.
– For each operator, there is exactly one machine instruction that
takes the necessary operands in registers and performs that
operation, leaving the result in a register.
– The machine instructions are of the form:
• LD reg, mem
• ST mem, reg
• OP reg, reg, reg
– Computed values are kept in registers as long as possible, until:
a) their register is needed for another computation, or
b) a procedure call, jump, or labeled statement is reached.
57
Register and Address Descriptors
• Descriptors are used by the code-generation algorithm to keep
track of register contents and of the locations of names.
• Descriptors are needed for load and store decisions.
• Register descriptor (one for each available register)
– Keeps track of the variable names whose current value is in that
register; it is consulted whenever a new register is needed.
– Initially, all register descriptors are empty.
– Example: after MOV R0, a the descriptor for R0 says “R0 contains a”.
• Address descriptor (one for each program variable)
– Keeps track of the location(s) where the current value of that
variable can be found (register, memory address, stack location).
– Stored in the symbol-table entry for that variable name.
– Example: after MOV R0, a followed by MOV R1, R0, the descriptor
for a says “a is in R0 and R1”.
58
The Code-Generation Algorithm
• There are basically three parts to (this simple algorithm
for) code generation.
• Choosing registers
• Generating instructions
• Managing descriptors

• We will isolate register allocation in a function getReg(I),
which is presented later.
• First presented is the algorithm to generate instructions.
• This algorithm uses getReg() and the descriptors.
• Then we learn how to manage the descriptors and finally
we study getReg() itself.
59
The Code-Generation Algorithm
• Function getReg(I)
– Selecting registers for each memory location associated with
the three-address instruction I.
• Machine Instructions for Operations
– For a three-address instruction such as x = y + z, do the
following:
1. Use getReg(x = y + z) to select registers for x, y, and z. Call
these Rx, Ry, and Rz .
2 . If y is not in Ry (according to the register descriptor for Ry) ,
then issue an instruction
LD Ry , y' , where y' is one of the memory locations for y
(according to the address descriptor for y) .
3. Similarly, if z is not in Rz , issue an instruction LD Rz, z’ , where
z’ is a location for z.
4. Issue the instruction ADD Rx , Ry , Rz.
60
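A minimal Python sketch of these four steps (the slides do not prescribe any implementation). Here reg_desc maps each register to the set of names it currently holds, addr_desc maps each name to the set of locations holding its current value, and getReg/emit are assumed helpers.

def _point_reg_at(r, name, reg_desc, addr_desc):
    # After loading `name` into r (or computing it there), r holds only `name`,
    # and no other variable may list r as one of its locations.
    reg_desc[r] = {name}
    for other, locs in addr_desc.items():
        if other != name:
            locs.discard(r)

def gen_add(x, y, z, reg_desc, addr_desc, emit, getReg):
    Rx, Ry, Rz = getReg(x, y, z)                  # step 1: choose registers
    for name, r in ((y, Ry), (z, Rz)):            # steps 2-3: load operands if needed
        if name not in reg_desc[r]:
            loc = next(iter(addr_desc[name]))     # any current location of the operand
            emit(f"LD {r}, {loc}")
            _point_reg_at(r, name, reg_desc, addr_desc)
            addr_desc[name].add(r)
    emit(f"ADD {Rx}, {Ry}, {Rz}")                 # step 4: issue the operation
    _point_reg_at(Rx, x, reg_desc, addr_desc)     # Rx now holds only x ...
    addr_desc[x] = {Rx}                           # ... and x's only location is Rx

The descriptor updates after the LD and ADD follow the rules spelled out a few slides later.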
The Code-Generation Algorithm
• Machine Instructions for Copy Statements
– For x=y, getReg will always choose the same register for
both x and y.
– If y is not in that register Ry , generate instruction LD Ry ,
y.
– If y was in Ry , do nothing.
– Need to adjust the register description for Ry so that it
includes x as one of the values.

61
The Code-Generation Algorithm
• Ending the Basic Block
• We need to ensure that all variables needed by (dynamically)
subsequent blocks (i.e., those live-on-exit) have their current
values in their memory locations.

– Temporaries are never live beyond a basic block so can be


ignored.
– Variables dead-on-exit (thank you global flow analysis for
determining such variables) are also ignored.
– All live-on-exit variables (for all non-temporaries) need to be in
their memory location on exit from the block.
– Therefore, for any live-on-exit variable whose own memory
location is not listed in its address descriptor,

• Generate the instruction ST x, R, where R is a register in which x's


value exists at the end of the block if x is live-on-exit from the block.
62
The Code-Generation Algorithm
• Managing Register and Address Descriptors
1 . For the instruction LD R, x
(a) Change the register descriptor for register R so it holds only x.
(b) Change the address descriptor for x by adding register R as an

additional location of x.
2. For the instruction ST x, R
(a) Change the address descriptor for x to include its own location.
3. For an operation such as ADD Rx, Ry, Rz for x = y + z
(a) Change the register descriptor for Rx so that it holds only x.
(b) Change the address descriptor for x so that its only location is Rx.
– Note that the memory location for x is not now in the address
descriptor for x.
(c) Remove Rx from the address descriptor of any variable other than x.
63
The Code-Generation Algorithm…

Managing Register and Address Descriptors…

4. When process a copy statement x = y, after


generating the load for y into register Ry, if needed,
and after managing descriptors as for all load
statements (per rule 1 ) :

(a) Add x to the register descriptor for Ry.


(b) Change the address descriptor for x so that its only
location is Ry

64
The Code-Generation Algorithm
• Managing Register and Address Descriptors
• For R a register, let Desc(R) be its register descriptor. For x a
program variable, let Desc(x) be its address descriptor.
1. Load: LD R, x
– Desc(R) = x (removing everything else from Desc(R))
– Add R to Desc(x) (leaving alone everything else in Desc(x))
– Remove R from Desc(w) for all w ≠ x
2. Store: ST x, R
– Add the memory location of x to Desc(x)
3. Operation: OP Rx, Ry, Rz implementing the quad OP x, y, z
– Desc(Rx) = x
– Desc(x) = Rx (Now Rx does not contain x's memory location!)
– Remove Rx from Desc(w) for all w ≠ x
4. Copy: For x = y after processing the load (if needed)
– Add x to Desc(Ry) (recall that Ry=Rx).
– Desc(x) = Ry.
65
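These four rules translate almost directly into code. Below is a hedged Python sketch using the same reg_desc/addr_desc dictionaries as before; a variable's own memory cell is represented by its name, and the function names are invented for illustration.

def on_load(R, x, reg_desc, addr_desc):          # rule 1: LD R, x
    reg_desc[R] = {x}                            # (a) R holds only x
    addr_desc[x].add(R)                          # (b) R is an additional location of x
    for w, locs in addr_desc.items():            # remove R from every other name
        if w != x:
            locs.discard(R)

def on_store(x, R, addr_desc):                   # rule 2: ST x, R
    addr_desc[x].add(x)                          # x's own memory cell is now current

def on_op(Rx, x, reg_desc, addr_desc):           # rule 3: OP Rx, Ry, Rz for x = y op z
    reg_desc[Rx] = {x}                           # (a) Rx holds only x
    addr_desc[x] = {Rx}                          # (b) x's only location is Rx
    for w, locs in addr_desc.items():            # (c) remove Rx from all other names
        if w != x:
            locs.discard(Rx)

def on_copy(Ry, x, reg_desc, addr_desc):         # rule 4: x = y (y already in Ry)
    reg_desc[Ry].add(x)                          # (a) Ry now also holds x
    addr_desc[x] = {Ry}                          # (b) x's only location is Ry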
Example
• Since we haven't specified getReg() yet, we will assume there
are an unlimited number of registers so we do not need to
generate any spill code (saving the register's value in memory).
• One of getReg()'s jobs is to generate spill code when a register
needs to be used for another purpose and the current value is
not presently in memory.
• Despite having ample registers and thus not generating spill
code, we will not be wasteful of registers.
– When a register holds a temporary value and there are no
subsequent uses of this value, we reuse that register.
– When a register holds the value of a program variable and there
are no subsequent uses of this value, we reuse that register
providing this value is also in the memory location for the variable.
– When a register holds the value of a program variable and all
subsequent uses of this value are preceded by a redefinition, we
could reuse this register. But to know about all subsequent uses
may require live/dead-on-exit knowledge.
66
Example block:

t = a - b
u = a - c
v = t + u
a = d
d = v + u

Assume: t, u, v are temporaries; a, b, c, d are variables live on exit.
67
Design of the Function getReg
• Pick a register Ry for y in x=y+z
1 . If y is currently in a register, pick the register.
2. If y is not in a register, but there is an empty register, pick the
register.
3. If y is not in a register, and there is no empty register.
• Let R be a candidate register, and suppose v is one of the variables in
the register descriptor
• need to make sure that v's value either is not needed, or that there is
somewhere else we can go to get the value of v.
(a) OK if the address descriptor for v says that v is somewhere besides R,
(b) OK if v is x, and x is not one of the other operands of the
instruction(z in this example)
(c) OK if v is not used later
(d) If none of the above holds, generate the store instruction ST v, R to place
a copy of v in its own memory location. This operation is called a spill.

68
Design of the Function getReg

• Pick a register Rx for x in x=y+z


– Almost as for y, differences:

1.Since a new value of x is being computed, a


register that holds only x is a choice for Rx;

2. If y is not used after the instruction, and Ry


holds only y after being loaded, then Ry can be
used as Rx;
A similar option holds regarding z and Rz.
69
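The following Python sketch combines the rules for picking a register for an operand y of x = y + z. live_after(v) is an assumed predicate telling whether v is used later in the block; nothing here is prescribed by the slides, it is only one way the rules could be coded.

def get_reg_for_operand(y, x, other_operands, reg_desc, addr_desc, emit, live_after):
    for r, names in reg_desc.items():             # 1. y is already in a register
        if y in names:
            return r
    for r, names in reg_desc.items():             # 2. otherwise use an empty register
        if not names:
            return r
    def spills_needed(r):                         # 3. otherwise pick a victim register
        count = 0
        for v in reg_desc[r]:
            safe = (len(addr_desc[v]) > 1                     # (a) v lives elsewhere
                    or (v == x and v not in other_operands)   # (b) v is x, about to be overwritten
                    or not live_after(v))                     # (c) v is not used later
            if not safe:
                count += 1                                    # (d) otherwise v must be spilled
        return count
    R = min(reg_desc, key=spills_needed)
    for v in list(reg_desc[R]):
        if len(addr_desc[v]) == 1 and live_after(v) and not (v == x and v not in other_operands):
            emit(f"ST {v}, {R}")                  # spill: save v to its own memory cell
            addr_desc[v].add(v)
        addr_desc[v].discard(R)
    reg_desc[R] = set()
    return R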
Exercises:
• Exercise 8.6.1
• Exercise 8.6.3

70
Peephole “Optimization”
• The peephole is a small, sliding window on a program.

• Peephole optimization is done by examining a sliding


window of target instructions and replacing instruction
sequences within the peephole by a shorter or faster
sequence, whenever possible.

• Peephole optimization can be applied directly after


intermediate code generation to improve the
intermediate representation.

71
Peephole “Optimization”

Goals:
- improve performance
- reduce memory footprint
- reduce code size
Method:
1. Examine short sequences of target instructions
2. Replace each such sequence by a more efficient one, whenever possible.
• redundant-instruction elimination
• flow-of-control optimizations
• algebraic simplifications
• use of machine idioms
72
Eliminating Redundant Loads and
Stores

(1) LOAD  R0, a
(2) STORE a, R0

• The second instruction can be deleted. Why?
• The first instruction ensures that the value of a has already been
loaded into R0.
• But only if the store is not labeled, i.e. is not a branch target.

73
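A minimal Python sketch of this peephole rule over a list of target instructions represented as strings; the index set `labeled` marks instructions that are branch targets. The representation and names are assumptions, not part of the slides.

def _parse(instr):
    op, _, rest = instr.partition(" ")
    return op, [a.strip() for a in rest.split(",") if a.strip()]

def drop_redundant_st_ld(code, labeled=frozenset()):
    # Drop "ST x, R" right after "LD R, x" (and "LD R, x" right after "ST x, R")
    # when the second instruction is not a branch target.
    out, i = [], 0
    while i < len(code):
        out.append(code[i])
        if i + 1 < len(code) and (i + 1) not in labeled:
            op1, a1 = _parse(code[i])
            op2, a2 = _parse(code[i + 1])
            if {op1, op2} == {"LD", "ST"} and a1 == list(reversed(a2)):
                i += 2                            # keep the first, drop the second
                continue
        i += 1
    return out

print(drop_redundant_st_ld(["LD R0, a", "ST a, R0", "ADD R0, R0, b"]))
# -> ['LD R0, a', 'ADD R0, R0, b']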
Eliminating Unreachable Code

• An unlabeled instruction immediately following an unconditional
jump may be removed.
• This operation can be repeated to eliminate a sequence of
instructions.

Source code:
debug = 0
...
if (debug) {
    print debugging information
}

Intermediate code:
debug = 0
...
    if debug = 1 goto L1
    goto L2
L1: print debugging information
L2:
74
Eliminate Jump after Jump

• One obvious peephole optimization is to eliminate jumps over jumps.

Before:
debug = 0
...
    if debug = 1 goto L1
    goto L2
L1: print debugging information
L2:

After:
debug = 0
...
    if debug ≠ 1 goto L2
    print debugging information
L2:
75
Constant Propagation

If debug is set to
0 at the debug = 0
beginning of the Before: ...
program, if debug  1 goto L2
constant print debugging information
propagation L2:
would transform
this sequence debug = 0
into ...
After: if 0  1 goto L2
print debugging information
L2:
76
Deleting Unreachable Code
(dead code elimination)
• Now the condition of the first if statement always evaluates to true,
so the statement can be replaced by goto L2.
• Then all statements that print debugging information are unreachable
and can be eliminated one at a time.

Before:
debug = 0
...
if 0 ≠ 1 goto L2
print debugging information
L2:

After:
goto L2
L2:
77
Flow-of-control optimizations
Unnecessary jumps can be eliminated in either the intermediate code
or the target code by peephole optimization.

goto L1                        goto L2
...                     →      ...
L1: goto L2                    L1: goto L2

If there are now no jumps to L1, it may be possible to eliminate the
statement L1: goto L2.

if a < b goto L1               if a < b goto L2
...                     →      ...
L1: goto L2                    L1: goto L2

goto L1                        if a < b goto L2
...                     →      goto L3
L1: if a < b goto L2           ...
L3:                            L3:
78
Flow-of-control optimizations
Shorten chains of branches by modifying target labels:

if a == 0 goto L2              if a == 0 goto L3
b := x + y              →      b := x + y
L2: goto L3                    L2: goto L3
79
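A Python sketch of this label-retargeting idea, assuming instructions are (label, op, target) tuples where a forwarding statement has op "goto"; the encoding and helper names are invented for illustration.

def shorten_branch_chains(code):
    # Map each label that marks a bare "goto L" statement to L.
    forwards = {lbl: tgt for (lbl, op, tgt) in code
                if lbl is not None and op == "goto"}

    def final_target(lbl, seen=()):
        nxt = forwards.get(lbl)
        if nxt is None or nxt in seen:            # not a forwarding label, or a cycle
            return lbl
        return final_target(nxt, seen + (lbl,))

    # Retarget every jump to the label it ultimately reaches.
    return [(lbl, op, final_target(tgt) if tgt else None) for (lbl, op, tgt) in code]

before = [(None, "if a == 0 goto", "L2"),
          (None, "b := x + y", None),
          ("L2", "goto", "L3")]
print(shorten_branch_chains(before))              # the conditional now targets L3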
Transformations on Basic Blocks

• Structure-Preserving Transformations:

– common subexpression elimination


– dead code elimination
– renaming of temporary variables
– interchange of two independent adjacent
statements

80
Algebraic Simplification and Reduction in
Strength
• Algebraic simplification can be used to eliminate three-address
statements whose effect is trivial; for example,
x = x + 0
x = x * 1
can simply be deleted.
• Reduction-in-strength transformations replace expensive operations
by cheaper ones:
– x**2 (power(x, 2))  →  x * x
– 2 * x  →  x + x
– Fixed-point multiplication or division by a power of two  →  shift
– Floating-point division by a constant can be approximated as
multiplication by a constant
81
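A small Python sketch of these simplifications over three-address tuples (op, arg1, arg2, result); the tuple encoding is an assumption used only for illustration.

def simplify(instr):
    op, a, b, res = instr
    if op == "+" and b == 0:                      # x = y + 0  ->  x = y
        return ("copy", a, None, res)
    if op == "*" and b == 1:                      # x = y * 1  ->  x = y
        return ("copy", a, None, res)
    if op == "**" and b == 2:                     # x = y ** 2 ->  x = y * y
        return ("*", a, a, res)
    if op == "*" and a == 2:                      # x = 2 * y  ->  x = y + y
        return ("+", b, b, res)
    if op == "*" and isinstance(b, int) and b > 0 and b & (b - 1) == 0:
        return ("<<", a, b.bit_length() - 1, res) # multiply by 2**k  ->  left shift
    return instr

print(simplify(("**", "y", 2, "x")))              # ('*', 'y', 'y', 'x')
print(simplify(("*", "t", 8, "t")))               # ('<<', 't', 3, 't')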
Examples of Transformations

Algebraic transformations:

x := x + 0         (can be eliminated)
x := x * 1         (can be eliminated)

x := y**2    →     x := y*y
z := 2*x     →     z := x + x

Changes such as y**2 into y*y and 2*x into x+x are also known as
strength reduction.

82
Examples of Transformations
Common subexpression elimination:
remove redundant computations

Before:              After:
a := b + c           a := b + c
b := a - d           b := a - d
c := b + c           c := b + c
d := a - d           d := b

t1 := b * c          t1 := b * c
t2 := a – t1         t2 := a – t1
t3 := b * c          t4 := t2 + t1
t4 := t2 + t3

Dead code elimination:
if x is never referenced after the statement x = y + z, the statement
can be safely eliminated.
83
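As an illustration of local common-subexpression elimination, the Python sketch below works on one basic block of (op, arg1, arg2, result) tuples and reuses a previously computed expression only while neither of its operands nor its holder has been redefined. The encoding is an assumption; real compilers usually achieve this with value numbering.

def local_cse(block):
    available = {}                                # (op, a, b) -> name holding that value
    out = []
    for op, a, b, res in block:
        key = (op, a, b)
        reused = available.get(key)
        if reused is not None:
            out.append(("copy", reused, None, res))
        else:
            out.append((op, a, b, res))
        # res is being (re)defined: kill expressions that use res as an operand
        # and expressions whose cached value lived in res.
        available = {k: v for k, v in available.items()
                     if res not in (k[1], k[2]) and v != res}
        if reused is None:
            available[key] = res                  # the new value is now available in res
    return out

block = [("+", "b", "c", "a"), ("-", "a", "d", "b"),
         ("+", "b", "c", "c"), ("-", "a", "d", "d")]
print(local_cse(block))
# b + c is NOT reused for the third statement (b was redefined in between);
# the fourth statement becomes a copy of b, matching the first example above.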
Examples of Transformations
Interchange of statements:

t1 := b + c          t2 := x + y
t2 := x + y    →     t1 := b + c

Renaming temporary variables:
• If there is a statement t := b + c, we can change it to u := b + c
and change all uses of t to u.
• Temporary variables that are dead at the end of a block can be
safely renamed.

t1 := b + c          t1 := b + c
t2 := a – t1         t2 := a – t1
t1 := t1 * d   →     t3 := t1 * d
d := t2 + t1         d := t2 + t3
84
Use of Machine Idioms
• The target machine may have hardware instructions to
implement certain specific operations efficiently.
• Using these instructions can reduce execution time
significantly.

• Example:
– some machines have auto-increment and auto-decrement
addressing modes.
– The use of the modes greatly improves the quality of code
when pushing or popping a stack as in parameter passing.
– These modes can also be used in code for statements like
x = x + 1   →   INC x

85
