
Codegeneration Final

The document outlines the code generation phase of a compiler, detailing its main tasks including instruction selection, register allocation, and instruction ordering. It discusses various design issues such as input requirements, target program formats, memory management, and the importance of understanding the target machine architecture. Additionally, it emphasizes the significance of efficient instruction selection and register management to optimize the generated code's performance.

Uploaded by

kumaralex772

CODE GENERATION

Introduction
• The final phase of our compiler model is the code
generator.
• It takes as input the intermediate
representation of the source program, together
with supplementary information in the symbol
table, and produces as output an
equivalent target program.
• Code generator main tasks:
– Instruction selection
– Register allocation and assignment
– Instruction ordering
• Instruction selection
• choose appropriate target-machine instructions to
implement the IR statements
• Register allocation and assignment
• decide what values to keep in which registers
• Instruction ordering
• decide in what order to schedule the execution of
instructions
Issues in the design of code generator
• Input to the code generator
• Target program
• Memory management
• Instruction selection
• Register allocation
• Choice of evaluation order
• Approaches to code generation
Issues in the design of code generator
• Input to the code generator
– IR + Symbol table
– IR has several choices
• Postfix notation
• Syntax tree
• Three address code
– We assume front end produces low-level IR, i.e.
values of names in it can be directly manipulated
by the machine instructions.
– Syntactic and semantic errors have been already
detected
Issues in the design of code generator
• The target program
– The output of code generator is target program.
– Output may take variety of forms
• Absolute machine language(executable code)
• Relocatable machine language(object files for linker)
• Assembly language(facilitates debugging)
– Absolute machine language has the advantage that it can be
placed in a fixed location in memory and immediately
executed.
– A relocatable machine language program allows subprograms
to be compiled separately.
– Producing an assembly language program as output makes the
process of code generation somewhat easier.
Issues in the design of code generator
• Memory management
– Mapping names in the source program to
addresses of data objects in run time memory is
done by front end & code generator.
– If a machine code is being generated, labels in
three address statements have to be converted to
addresses of instructions. This process is
analogous to “backpatching” technique.
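The label-to-address step can be sketched as follows. This is a minimal two-pass illustration, not backpatching proper: it assumes labels are written as `name:` on their own line, that an address is simply an instruction index, and that `JMP`, `JZ`, and `HALT` exist as op-codes (all three are hypothetical and do not appear in these slides).

```python
def resolve_labels(lines):
    """Replace symbolic jump targets with instruction indices."""
    labels = {}
    instructions = []
    # Pass 1: record where each label points and drop the label lines.
    for line in lines:
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)
        else:
            instructions.append(line)
    # Pass 2: rewrite jump operands (JMP/JZ are hypothetical op-codes).
    resolved = []
    for instr in instructions:
        op, _, arg = instr.partition(" ")
        if op in ("JMP", "JZ") and arg in labels:
            resolved.append(f"{op} {labels[arg]}")
        else:
            resolved.append(instr)
    return resolved


program = ["L1:", "MOV a,R0", "SUB #1,R0", "MOV R0,a",
           "JZ L2", "JMP L1", "L2:", "HALT"]
resolved = resolve_labels(program)
```

A real code generator would use byte addresses rather than instruction indices, and a one-pass generator would instead keep a list of unresolved jumps and patch them once the label is seen.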
Issues in the design of code generator
• Instruction selection
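– Uniformity and completeness of the instruction
set are important factors
– If we do not care about the efficiency of the target
program, instruction selection is straightforward.
– The quality of the generated code is determined
by its speed and size.
– For example, x=y+z can be translated into

LD R0, y
ADD R0, R0, z
ST x, R0

– Translating statement by statement often produces
poor code. For a=b+c followed by d=a+e:

LD R0, b
ADD R0, R0, c
ST a, R0
LD R0, a
ADD R0, R0, e
ST d, R0

– Here the fourth instruction (LD R0, a) is redundant,
because R0 already holds the value of a.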
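The statement-by-statement scheme above, and the redundant load it produces, can be sketched as follows. This is an illustration only: it assumes every statement has the form `x = y + z`, always uses R0, and the helper names `translate` and `drop_redundant_loads` are invented for this example.

```python
def translate(statements):
    """Translate each 'x = y + z' independently into LD / ADD / ST."""
    code = []
    for stmt in statements:
        target, expr = [s.strip() for s in stmt.split("=")]
        left, right = [s.strip() for s in expr.split("+")]
        code += [f"LD R0, {left}", f"ADD R0, R0, {right}", f"ST {target}, R0"]
    return code


def drop_redundant_loads(code):
    """Remove 'LD R0, v' when it directly follows 'ST v, R0'."""
    out = []
    for instr in code:
        if out and instr.startswith("LD R0, "):
            var = instr[len("LD R0, "):]
            if out[-1] == f"ST {var}, R0":
                continue  # R0 already holds the value of var
        out.append(instr)
    return out


naive = translate(["a = b + c", "d = a + e"])
better = drop_redundant_loads(naive)
```

Here `naive` has six instructions and `better` has five. The store to `a` could also be dropped if `a` is not used later, but deciding that needs liveness information this sketch does not track.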
Issues in the design of code generator
• Register allocation
– Instructions involving register operands are usually
shorter and faster than those involving operands in
memory.
– Two subproblems
• Register allocation: select the set of variables that will
reside in registers at each point in the program
• Register assignment: select the specific register in which
a variable will reside
– Complications imposed by the hardware
architecture
• Example: register pairs for multiplication and division
• The multiplication instruction is of the form
M x, y
where x, the multiplicand, is the even register
of an even/odd register pair.
• The multiplicand value is taken from the odd
register of the pair. The multiplier y is a single
register. The product occupies the entire
even/odd register pair.
• Two three-address sequences and the code generated for each:

t=a+b
t=t*c
t=t/d

L R1, a
A R1, b
M R0, c
D R0, d
ST R1, t

t=a+b
t=t+c
t=t/d

L R0, a
A R0, b
A R0, c
SRDA R0, 32 (shift right double arithmetic)
D R0, d
ST R1, t
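• If the machine has an increment instruction, a=a+1 can be
implemented by the single instruction

INC a

rather than by the sequence

MOV a, R0
ADD #1,R0
MOV R0, a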
Issues in the design of code generator
• Approaches to code generation
– The most important criterion for a code generator
is that it produce correct code.
– Given the premium on correctness, designing a
code generator so it can be easily implemented,
tested, and maintained is an important design
goal.
Issues in the design of code generator
• Choice of evaluation order
– The order in which computations are performed
can affect the efficiency of the target code.
– When instructions are independent, their evaluation order
can be changed
Expression: a+b-(c+d)*e

Original order:

t1:=a+b
t2:=c+d
t3:=e*t2
t4:=t1-t3

MOV a,R0
ADD b,R0
MOV R0,t1
MOV c,R1
ADD d,R1
MOV e,R0
MUL R1,R0
MOV t1,R1
SUB R0,R1
MOV R1,t4

Reordered:

t2:=c+d
t3:=e*t2
t1:=a+b
t4:=t1-t3

MOV c,R0
ADD d,R0
MOV e,R1
MUL R0,R1
MOV a,R0
ADD b,R0
SUB R1,R0
MOV R0,t4
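One way to check that a reordering like the one for a+b-(c+d)*e preserves the result is to run both sequences on a tiny interpreter. The sketch below assumes two-address semantics (`OP source,destination` stores `destination OP source` into the destination), treats any operand starting with `R` as a register, and picks arbitrary test values; the function name `run` is invented for this example.

```python
def run(program, memory):
    """Execute 'OP source,destination' code and return the final memory."""
    mem = dict(memory)
    regs = {}

    def read(operand):
        return regs[operand] if operand.startswith("R") else mem[operand]

    def write(operand, value):
        if operand.startswith("R"):
            regs[operand] = value
        else:
            mem[operand] = value

    ops = {
        "ADD": lambda dst, src: dst + src,
        "SUB": lambda dst, src: dst - src,
        "MUL": lambda dst, src: dst * src,
    }
    for line in program:
        op, operands = line.split()
        src, dst = operands.split(",")
        if op == "MOV":
            write(dst, read(src))
        else:
            write(dst, ops[op](read(dst), read(src)))
    return mem


original = ["MOV a,R0", "ADD b,R0", "MOV R0,t1", "MOV c,R1", "ADD d,R1",
            "MOV e,R0", "MUL R1,R0", "MOV t1,R1", "SUB R0,R1", "MOV R1,t4"]
reordered = ["MOV c,R0", "ADD d,R0", "MOV e,R1", "MUL R0,R1",
             "MOV a,R0", "ADD b,R0", "SUB R1,R0", "MOV R0,t4"]
values = {"a": 7, "b": 5, "c": 1, "d": 2, "e": 3}
t4_original = run(original, values)["t4"]
t4_reordered = run(reordered, values)["t4"]
```

Both runs leave the same value in `t4`, while the reordered sequence needs eight instructions instead of ten because `t1` never has to be stored and reloaded.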
Target machine
• Implementing code generation requires thorough understanding
of the target machine architecture and its instruction set

• Our (hypothetical) machine:


– Byte-addressable (word = 4 bytes)
– Has n general purpose registers R0, R1, …, Rn-1
– Two-address instructions of the form

op source, destination
– Op – op-code
– Source, destination – data fields
The Target Machine: Op-codes
• Op-codes (op), for example
• MOV (move content of source to destination)
• ADD (add content of source to destination)
• SUB (subtract content of source from
destination)

• There are also other ops

The Target Machine: Address modes
• Addressing mode: Different ways in which location of an
operand can be specified in the instruction.

Mode               Form    Address                    Added cost
Absolute           M       M                          1
Register           R       R                          0
Indexed            c(R)    c+contents(R)              1
Indirect register  *R      contents(R)                0
Indirect indexed   *c(R)   contents(c+contents(R))    1
Literal            #c      N/A                        1
Instruction Costs
• Machine is a simple processor with fixed instruction
costs
• Cost of an instruction = 1 + the added costs of its
source and destination addressing modes
• On most machines, and for most instructions, the time
taken to fetch an instruction from memory exceeds the
time spent executing it, so reducing the length of an
instruction has an extra benefit.

Examples

Instruction         Operation                                          Cost
MOV R0,R1           Store content(R0) into register R1                 1
MOV R0,M            Store content(R0) into memory location M           2
MOV M,R0            Store content(M) into register R0                  2
MOV 4(R0),M         Store contents(4+contents(R0)) into M              3
MOV *4(R0),M        Store contents(contents(4+contents(R0))) into M    3
MOV #1,R0           Store 1 into R0                                    2
ADD 4(R0),*12(R1)   Add contents(4+contents(R0)) to the value at
                    location contents(12+contents(R1))                 3
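The costs in the table follow from charging 1 per instruction plus the added cost of each operand's addressing mode. A minimal sketch of that rule is shown below. It assumes operands are written exactly as in the table, and that any operand of the form `Rn` or `*Rn` is a register mode, so a memory variable that happened to be named `R5` would be misclassified. The function names are invented for this example.

```python
import re


def added_cost(operand):
    """Added cost of one operand under the addressing-mode table."""
    if re.fullmatch(r"\*?R\d+", operand):
        return 0  # register or indirect register
    return 1      # absolute, indexed, indirect indexed, or literal


def instruction_cost(instruction):
    """1 for the instruction itself plus the added cost of each operand."""
    op, operands = instruction.split()
    return 1 + sum(added_cost(o) for o in operands.split(","))


def sequence_cost(instructions):
    return sum(instruction_cost(i) for i in instructions)


cost_indexed = instruction_cost("ADD 4(R0),*12(R1)")
cost_memory = sequence_cost(["MOV a,R0", "ADD b,R0", "MOV R0,c"])
cost_indirect = sequence_cost(["MOV *R1,*R0", "ADD *R2,*R0"])
```

Under this model the three-instruction sequence through memory costs 6, while the indirect-register sequence costs 2, which is why keeping addresses or values in registers pays off.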

Example

Compute the cost of the following sets of instructions
(the first set costs 6):

MOV a, R0
ADD b, R0
MOV R0, c

MOV *R1, *R0
ADD *R2, *R0
Instruction Selection
• Instruction selection is important to obtain efficient
code
• Suppose we translate three-address code
x:= y + z
to:
MOV y,R0
ADD z,R0
MOV R0,x
• Applying the same scheme to a:=a+1 gives
MOV a,R0
ADD #1,R0
MOV R0,a
Cost = 6
• Better:
ADD #1,a
Cost = 3
• Best:
INC a
Cost = 2
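The choice between these alternatives can be sketched as a small pattern match on the statement. This is an illustration only: it handles just `target := left + right`, assumes literals are written with a leading `#`, assumes R0 is free in the general case, and the function name `select` is invented here.

```python
def select(target, left, right):
    """Pick instructions for target := left + right."""
    if target == left and right == "#1":
        return ["INC " + target]            # cheapest: cost 2
    if target == left and right.startswith("#"):
        return [f"ADD {right},{target}"]    # add a literal in place: cost 3
    # General case: load, add, store through R0 (cost 6 for memory operands).
    return [f"MOV {left},R0", f"ADD {right},R0", f"MOV R0,{target}"]


increment = select("a", "a", "#1")
add_literal = select("a", "a", "#4")
general = select("x", "y", "z")
```

Ordering the patterns from most specific to most general means the cheapest applicable instruction is tried first.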
Instruction Selection: Utilizing Addressing
Modes
• Suppose we translate a:=b+c into
MOV b,R0
ADD c,R0
MOV R0,a
• Assuming addresses of a, b, and c are stored in R0,
R1, and R2
MOV *R1,*R0
ADD *R2,*R0
• Assuming R1 and R2 contain values of b and c
ADD R2,R1
MOV R1,a

A Code Generator
• Generates target code for a sequence of three-
address statements using next-use information
• Uses new function getreg to assign registers to
variables
• Computed results are kept in registers as long as
possible, which means until:
– The register is needed for another computation
– A procedure call or the end of the block is reached
• Checks if operands to three-address code are
available in registers

The Code Generation Algorithm
• For each statement x := y op z
1. Set location L = getreg(y, z)
2. If y is not already in L then generate
MOV y’,L
where y’ denotes one of the locations where the value
of y is available (choose register if possible)
3. Generate
OP z’,L
where z’ is one of the locations of z;
Update register/address descriptor of x to include L
4. If y and/or z has no next use and is stored in register,
update register descriptors to remove y and/or z

Register and Address Descriptors
• A register descriptor keeps track of what is currently
stored in a register at a particular point in the code,
e.g. a local variable, argument, global variable, etc.
MOV a,R0 “R0 contains a”
• An address descriptor keeps track of the location
where the current value of the name can be found at
run time, e.g. a register, stack location, memory
address, etc.
MOV a,R0
MOV R0,R1 “a in R0 and R1”
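The two descriptors can be sketched as a pair of dictionaries that are updated after every `MOV`. This is an assumed representation, not one given in the slides: registers are recognised by the form `Rn`, a variable's own name stands for its memory location, and the helper names are invented for this example.

```python
from collections import defaultdict

register_descriptor = defaultdict(set)  # register -> names whose value it holds
address_descriptor = defaultdict(set)   # name -> locations holding its value


def is_register(operand):
    return operand[0] == "R" and operand[1:].isdigit()


def record_mov(src, dst):
    """Update both descriptors after 'MOV src,dst', where dst is a register."""
    # Whatever dst held before is overwritten.
    for name in register_descriptor[dst]:
        address_descriptor[name].discard(dst)
    if is_register(src):
        names = set(register_descriptor[src])
    else:
        names = {src}
        address_descriptor[src].add(src)  # the value also lives in memory
    register_descriptor[dst] = names
    for name in names:
        address_descriptor[name].add(dst)


record_mov("a", "R0")   # MOV a,R0
record_mov("R0", "R1")  # MOV R0,R1
```

After the two moves, both registers are recorded as holding `a`, and the address descriptor of `a` lists R0 and R1 alongside its memory location.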

The getreg Algorithm
• To compute getreg(y,z)
1. If y is stored in a register R and R only holds the value y,
and y has no next use, then return R;
Update address descriptor: value y no longer in R
2. Else, return a new empty register if available
3. Else, find an occupied register R;
Store contents (register spill) by generating
MOV R,M
for every variable held in R whose current value is
not already in its memory location M;
Return register R
4. If no suitable register can be found, return a memory location
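The code generation algorithm and getreg can be put together in a short sketch. It makes several simplifying assumptions that are not in the slides: statements are `(x, op, y, z)` tuples, next-use is found by scanning the rest of the block, `live_on_exit` names the variables needed after the block, the spill victim is simply the first register, and case 4 of getreg (returning a memory location) is left out. All function names except `getreg` are invented here.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def generate(statements, live_on_exit, num_regs=2):
    """Emit two-address code for a basic block of x := y op z statements."""
    reg_desc = {f"R{i}": set() for i in range(num_regs)}  # register -> names
    addr_desc = {}                                        # name -> locations
    code = []

    def locations(name):
        # Assumption: every name starts out in its own memory location.
        return addr_desc.setdefault(name, {name})

    def has_next_use(name, index):
        for x, _, y, z in statements[index + 1:]:
            if name in (y, z):
                return True
            if name == x:
                return False  # redefined before any use
        return name in live_on_exit

    def best_location(name):
        regs = sorted(loc for loc in locations(name) if loc in reg_desc)
        return regs[0] if regs else name  # prefer a register

    def evict(reg):
        for name in reg_desc[reg]:
            locations(name).discard(reg)
        reg_desc[reg] = set()

    def getreg(y, index):
        for reg in sorted(reg_desc):
            if reg_desc[reg] == {y} and not has_next_use(y, index):
                return reg                 # case 1: reuse y's register
        for reg in sorted(reg_desc):
            if not reg_desc[reg]:
                return reg                 # case 2: empty register
        reg = sorted(reg_desc)[0]          # case 3: spill an occupied register
        for name in sorted(reg_desc[reg]):
            if name not in locations(name):
                code.append(f"MOV {reg},{name}")
                locations(name).add(name)
        return reg

    for i, (x, op, y, z) in enumerate(statements):
        reg = getreg(y, i)
        if reg not in locations(y):
            code.append(f"MOV {best_location(y)},{reg}")
            evict(reg)                     # old contents are overwritten
        code.append(f"{OPCODES[op]} {best_location(z)},{reg}")
        evict(reg)
        for names in reg_desc.values():
            names.discard(x)               # older copies of x are stale
        reg_desc[reg] = {x}
        addr_desc[x] = {reg}
        for name in (y, z):
            if name != x and not has_next_use(name, i):
                for r, names in reg_desc.items():
                    if name in names:
                        names.discard(name)
                        locations(name).discard(r)
    for name in sorted(live_on_exit):      # store live values at block end
        if name not in locations(name):
            code.append(f"MOV {best_location(name)},{name}")
            locations(name).add(name)
    return code


block = [("t", "-", "a", "b"), ("u", "-", "a", "c"),
         ("v", "+", "t", "u"), ("d", "+", "v", "u")]
result = generate(block, live_on_exit={"d"})
```

With two registers and only `d` live on exit, `result` is a seven-instruction sequence that ends by storing R0 into `d`. Lowering `num_regs` to 1 forces the spill path in case 3.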

Code Generation Example
Statements   Code Generated   Register Descriptor   Address Descriptor
                              Registers empty
t := a - b   MOV a,R0         R0 contains t         t in R0
             SUB b,R0
u := a - c   MOV a,R1         R0 contains t         t in R0
             SUB c,R1         R1 contains u         u in R1
v := t + u   ADD R1,R0        R0 contains v         u in R1
                              R1 contains u         v in R0
d := v + u   ADD R1,R0        R0 contains d         d in R0
             MOV R0,d                               d in R0 and memory

Register allocation and assignment
• Values in registers are easier and faster to access
than memory
– Reserve a few registers for stack pointers, base
addresses etc
– Efficiently utilize the rest of general-purpose registers
• Register allocation
– At each program point, select a set of values to reside
in registers
• Register assignment
– Pick a specific register for each value, subject to
hardware constraints
– Register classes: not all registers are equal
• One approach to register allocation and
assignment is to assign specific values in an
object program to certain registers.
• The advantage of this method is that it simplifies
the design of the compiler.
• The disadvantage is that registers are used
inefficiently: certain registers go unused
and unnecessary loads and stores are
generated.
