13-Issues in The Design of A Code Generator - 22!10!2024
Dr. M.Bhuvaneswari
Assistant Professor Senior Grade 2
School of Computer Science and
Engineering,
Vellore Institute of Technology,
Vellore – 632014
Code Generation
• The final phase in our compiler model is the code generator.
• It takes as input the intermediate representation (IR)
produced by the front end of the compiler, along with
relevant symbol table information, and produces as output a
semantically equivalent target program.
• The target program must preserve the semantic meaning of
the source program and must make effective use of the
available resources of the target machine (e.g., through
register allocation).
[Figure: position of the code generator in the compiler]
Code Generation
A code generator has three primary tasks:
1. Instruction Selection
• Instruction selection involves choosing appropriate target-machine
instructions to implement the IR statements.
2. Register Allocation and Assignment
• Register allocation and assignment involves deciding what values
to keep in which registers.
3. Instruction Ordering
• Instruction ordering involves deciding in what order to schedule
the execution of instructions.
Issues in Code Generation
• Issues in Code Generation are:
1. Input to code generator
2. Target program
3. Memory management
4. Instruction selection
5. Register allocation
6. Order of evaluation
1. Input to code generator
• Input to the code generator consists of the intermediate
representation of the source program.
• Types of intermediate language are:
1. Linear representation - Postfix notation
2. Three-address code (3AC) representation – Quadruples, triples & indirect triples
3. Graphical representation - Syntax trees or DAGs
4. VM representation – bytecode and stack-machine code
• Semantic errors should be detected before the input is
submitted to the code generator.
• The code generation phase requires completely error-free
intermediate code as input; that is, the necessary type checking
has taken place and type conversion operators have been
inserted wherever necessary.
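As a rough illustration of the three-address forms listed above, the sketch below encodes the statement a = b + c * d as quadruples and derives the equivalent triples. The tuple layouts, the temporaries t1 and t2, and the helper name quads_to_triples are assumptions made for this example only, not a fixed format.

```python
# Quadruples: (op, arg1, arg2, result). The temporaries t1 and t2 are illustrative.
quads = [
    ("*", "c", "d", "t1"),    # t1 = c * d
    ("+", "b", "t1", "t2"),   # t2 = b + t1
    ("=", "t2", None, "a"),   # a = t2
]


def quads_to_triples(quads):
    """Convert quadruples to triples, where a result is named by its position."""
    position = {}   # temporary name -> index of the triple that computes it
    triples = []
    for op, arg1, arg2, result in quads:
        # Replace any temporary by the index of the triple that produced it.
        a1 = position.get(arg1, arg1)
        a2 = position.get(arg2, arg2)
        if op == "=":
            # Assignment triple: (=, target, value)
            triples.append(("=", result, a1))
        else:
            position[result] = len(triples)
            triples.append((op, a1, a2))
    return triples


for index, triple in enumerate(quads_to_triples(quads)):
    print(index, triple)
```

An integer operand in a triple refers to the result of the triple at that index, which is why triples need no explicit temporary names.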
2. Target program
• The instruction-set architecture of the target machine has a significant
impact on the difficulty of constructing a good code generator that produces
high-quality machine code.
• The most common target-machine architectures are RISC (reduced
instruction set computer), CISC (complex instruction set computer), and
stack based.
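• We assume the following kinds of instructions are available:
1. Load operations:
• The instruction LD r, x loads the value in location x into
register r. This instruction denotes the assignment r = x.
2. Store operations:
• The instruction ST x, r stores the value in register r into the
location x. This instruction denotes the assignment x = r.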
3. Computation operations
• Form: OP dst, src1, src2, where OP is an operator like ADD or SUB, and dst, src1,
and src2 are locations, not necessarily distinct.
• E.g., SUB r1, r2, r3 computes r1 = r2 - r3. Any value formerly stored in r1 is lost,
but if r1 is r2 or r3, the old value is read first.
• Unary operators that take only one operand do not have a src2.
4. Unconditional jumps:
• Form: BR L causes control to branch to the machine instruction with label L. (BR
stands for branch.)
5. Conditional jumps:
• Form: Bcond r, L, where r is a register, L is a label, and cond stands for any of the
common tests on values in the register r.
• E.g., BLTZ r, L causes a jump to label L if the value in register r is less than zero,
and allows control to pass to the next machine instruction if not.
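To make the instruction forms above concrete, here is a minimal interpreter sketch for this simple target machine. The tuple encoding (label, opcode, operands), the function name run, and the sample program are assumptions for illustration. LD is assumed here as the load counterpart of ST, and computation operands are restricted to registers to keep the sketch short.

```python
def run(program, memory, max_steps=1000):
    """Interpret a list of (label, opcode, operands) tuples; return final memory."""
    memory = dict(memory)          # work on a copy of the initial memory
    regs = {}
    labels = {lab: i for i, (lab, _op, _args) in enumerate(program) if lab is not None}
    binops = {"ADD": lambda a, b: a + b, "SUB": lambda a, b: a - b}
    pc = steps = 0
    while pc < len(program) and steps < max_steps:
        _label, op, args = program[pc]
        pc += 1
        steps += 1
        if op == "LD":             # LD r, x : r = x (assumed load form)
            regs[args[0]] = memory[args[1]]
        elif op == "ST":           # ST x, r : x = r
            memory[args[0]] = regs[args[1]]
        elif op in binops:         # OP dst, src1, src2 : sources are read before dst is written
            regs[args[0]] = binops[op](regs[args[1]], regs[args[2]])
        elif op == "BR":           # BR L : unconditional jump
            pc = labels[args[0]]
        elif op == "BLTZ":         # BLTZ r, L : jump if r < 0, else fall through
            if regs[args[0]] < 0:
                pc = labels[args[1]]
        else:
            raise ValueError(f"unknown opcode {op}")
    return memory


# Hypothetical program: x = (a < b) ? b : a, and d = a - b.
program = [
    (None, "LD", ("r1", "a")),
    (None, "LD", ("r2", "b")),
    (None, "SUB", ("r3", "r1", "r2")),
    (None, "BLTZ", ("r3", "LESS")),
    (None, "ST", ("x", "r1")),
    (None, "BR", ("END",)),
    ("LESS", "ST", ("x", "r2")),
    ("END", "ST", ("d", "r3")),
]

print(run(program, {"a": 3, "b": 7}))
```

With a = 3 and b = 7 the difference is negative, so BLTZ transfers control to LESS and b is stored into x.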
Instruction Cost
• The address modes (to find the effective address) together with
the assembly language forms and associated cost are as
follows:
Mode                  Form    Address                      Extra cost
Absolute              M       M                            1
Register              R       R                            0
Indexed               k(R)    k + contents(R)              1
Indirect register     *R      contents(R)                  0
Indirect indexed      *k(R)   contents(k + contents(R))    1
Immediate or literal  #C      NA                           1
Example
Total cost = 6
Instruction Cost – Cont…
• Calculate the cost of the following:
MOV *R1, *R0    cost = 1 + 0 + 0 = 1
ADD *R2, *R0    cost = 1 + 0 + 0 = 1
Total cost = 2
Instruction Cost – Cont…
• Calculate the cost of the following:
ADD R2, R1    cost = 1 + 0 + 0 = 1
MOV R1, a     cost = 1 + 0 + 1 = 2
Total cost = 3
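The cost calculations above follow one pattern: 1 for the instruction itself plus the extra cost of each operand's addressing mode. A small sketch of that rule is given below. The function names and the operand-recognition regular expressions are assumptions for illustration. In particular, registers are assumed to be written R0, R1, ..., and any other plain name is treated as an absolute memory address.

```python
import re


def mode_cost(operand):
    """Extra cost of one operand, following the addressing-mode table."""
    operand = operand.strip()
    if operand.startswith("#"):                        # immediate or literal: #C
        return 1
    if re.fullmatch(r"\*?R\d+", operand):              # register R or indirect register *R
        return 0
    if re.fullmatch(r"\*?-?\w+\(R\d+\)", operand):     # indexed k(R) or indirect indexed *k(R)
        return 1
    return 1                                           # absolute: memory name M


def instruction_cost(instr):
    """Cost = 1 for the instruction plus the extra cost of each operand."""
    _opcode, _, rest = instr.strip().partition(" ")
    operands = [o for o in rest.split(",") if o.strip()]
    return 1 + sum(mode_cost(o) for o in operands)


def total_cost(instrs):
    return sum(instruction_cost(i) for i in instrs)


print(total_cost(["MOV *R1 ,*R0", "ADD *R2 ,*R0"]))   # 2
print(total_cost(["ADD R2 ,R1", "MOV R1, a"]))        # 3
```

Both printed totals agree with the worked examples: indirect-register and register operands add nothing, while the absolute operand a adds 1.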
Next Use Information
Computing Next Uses
• Next-use information tells, for each statement in a basic block,
which names are used again by subsequent statements in the block.
• Knowing when the value of a variable will be used next is essential for
generating good code.
• If the value of a variable that is currently in a register will never be
referenced subsequently, then that register can be assigned to another
variable.
Statement   p   q   r   s   u   v
2           D   L   L   D   L   D
3           D   D   L   L   L   L
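Next-use information is usually computed by a single backward scan over the basic block. The sketch below shows one way to do that. The statement encoding (x, y, z) for x = y op z, the function name next_use, the sample block, and the set of names assumed live on exit are all hypothetical and are not the block behind the table above.

```python
def next_use(block, live_on_exit):
    """Backward scan over a basic block.

    block: list of (x, y, z) tuples meaning x = y op z (z may be None).
    Returns, for each statement, a dict name -> (is_live, next_use_index)
    describing each name as seen just after that statement.
    """
    names = {n for stmt in block for n in stmt if n is not None}
    # Initially only names live on exit are live; nothing has a next use yet.
    table = {n: (n in live_on_exit, None) for n in names}
    info = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):
        x, y, z = block[i]
        # 1. Attach the current table entries to statement i.
        info[i] = {n: table[n] for n in (x, y, z) if n is not None}
        # 2. x is redefined here, so above this point it is dead with no next use.
        table[x] = (False, None)
        # 3. y and z are used here, so above this point they are live, next use i.
        for operand in (y, z):
            if operand is not None:
                table[operand] = (True, i)
    return info


# Hypothetical block (0-based statement indices):
# 0: t = a - b   1: u = a - c   2: v = t + u   3: d = v + u
block = [("t", "a", "b"), ("u", "a", "c"), ("v", "t", "u"), ("d", "v", "u")]
for i, entry in enumerate(next_use(block, live_on_exit={"a", "b", "c", "d"})):
    print(i, entry)
```

Step 2 must come before step 3 so that a statement such as x = x + y leaves x live with its next use at that statement.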
Example
Initial condition
Solution
Register Allocation & Assignment
• Efficient utilization of registers is important in generating good
code.
• Specific values in the target program are assigned to certain
registers, such as those for base addresses, arithmetic
computations, and the top of the stack.
• There are four strategies for deciding which values in a program
should reside in registers and in which register each value
should reside.
• Strategies are:
1. Global Register Allocation
2. Usage Count
3. Register assignment for outer loop
4. Register allocation using graph coloring
1. Global Register Allocation
• Global register allocation strategies are:
• One strategy is to keep the most frequently used variables in
fixed registers throughout a loop.
• Another strategy is to assign some fixed number of global
registers to hold the most active values in each inner loop
(register assignment).
• Registers that are not already allocated may be used to hold
values local to one block.
• In certain languages such as C or Bliss, the programmer can do
the register allocation by using register declarations to keep
certain values in registers for the duration of a procedure.
• Example:
{
2. Usage count
• The usage count of a variable x counts the uses of x, across the
basic blocks of a loop, that could be served from a register.
• The usage count gives an idea of how many units of cost
can be saved by selecting a specific variable for global register
allocation.
      B1      B2      B3      B4      Usage count / units of cost
a     (0+2)   (1+0)   (1+0)   (0+0)   4
b     (2+0)   (0+0)   (0+2)   (0+2)   6
c     (1+0)   (0+0)   (1+0)   (1+0)   3
d     (1+2)   (1+0)   (1+0)   (1+0)   6
e     (0+2)   (0+0)   (0+2)   (0+0)   4
f     (1+0)   (0+2)   (1+0)   (0+0)   4
R0             R1   R2
a or e or f    b    d
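The table entries can be reproduced with a few lines of code. The sketch below assumes the usual reading of each cell as use + 2 * live, where use is the number of uses of the variable in the block before any definition and live is 1 if the variable is assigned in the block and live on exit. The (use, live) pairs are read off the table; the function names and the tie-breaking rule are illustrative assumptions.

```python
# (use, live) per block B1..B4 for each variable, read off the usage-count table.
cells = {
    "a": [(0, 1), (1, 0), (1, 0), (0, 0)],
    "b": [(2, 0), (0, 0), (0, 1), (0, 1)],
    "c": [(1, 0), (0, 0), (1, 0), (1, 0)],
    "d": [(1, 1), (1, 0), (1, 0), (1, 0)],
    "e": [(0, 1), (0, 0), (0, 1), (0, 0)],
    "f": [(1, 0), (0, 1), (1, 0), (0, 0)],
}


def usage_count(per_block):
    """Sum of use + 2 * live over all blocks of the loop."""
    return sum(use + 2 * live for use, live in per_block)


def pick_registers(counts, n_registers):
    """Choose the variables with the highest counts; ties broken by name."""
    ranked = sorted(counts, key=lambda v: (-counts[v], v))
    return ranked[:n_registers]


counts = {var: usage_count(per_block) for var, per_block in cells.items()}
print(counts)
print(pick_registers(counts, 3))
```

With three registers, b and d are clear choices at 6 units each, while a, e, and f tie at 4, so any one of them can take the remaining register. The alphabetical tie-break here simply picks a.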
3. Register assignment for outer loop
• Consider two loops, where L1 is the outer loop and L2 is the
inner loop, and suppose variable ‘a’ is to be allocated to
some register.
[Figure: outer loop L1 containing inner loop L2, with the region L1-L2 marked]
Grows in runtime