
TSR - Class Cd-Unit 4

The document outlines the course content for COMPILER DESIGN, detailing the various units covering topics such as lexical analysis, syntax analysis, intermediate code generation, and code generation. Each unit includes specific subtopics and methodologies relevant to compiler construction and design. The course is taught by Dr. T. Saju Raj and is a core program with a total of 3 credits.

Uploaded by

amjayasoorya

School of Computing

Department of Computer Science & Engineering

1151CS151 – COMPILER DESIGN


Category : Program Core
Credit : 3

Course Handling Faculty :


Dr. T. SAJU RAJ
Associate Professor

1/28/2025 1
COURSE CONTENT
UNIT I Introduction to Compilers 9

Compilers, Analysis of the Source Program, The Phases of a Compiler, Cousins of
the Compiler, The Grouping of Phases, Compiler-Construction Tools. LEXICAL
ANALYSIS: Need and role of lexical analyzer-Lexical errors, Input Buffering -
Specification of Tokens, Recognition of Tokens, Design of a Lexical Analyzer
Generator.

1/28/2025 T SAJU RAJ 2


COURSE CONTENT
UNIT II Syntax Analysis 9

Need and role of the parser- Context Free Grammars-Top Down parsing –
Recursive Descent Parser - Predictive Parser - LL (1) Parser -Shift Reduce Parser -
LR Parser - LR (0) item - Construction of SLR Parsing table -Introduction to LALR
Parser, YACC- Design of a syntax analyser for a sample language



COURSE CONTENT
UNIT III Intermediate Code Generation L–9
Intermediate languages – Declarations – Assignment
Statements – Boolean Expressions – Case Statements –
Backpatching – Procedure calls.


COURSE CONTENT
UNIT IV Code Generation L–9

Issues in the design of code generator – The target machine – Runtime Storage
management – Basic Blocks and Flow Graphs – Next-use Information – A simple Code
generator – DAG representation of Basic Blocks



The Phases of a Compiler:



Lexical Analysis

Lexical analysis is the first phase of the compiler; the component that performs it is also known as a scanner.
It converts the high-level input program into a sequence of tokens.
Lexical analysis can be implemented with a deterministic finite automaton (DFA).
The output is a sequence of tokens that is sent to the parser for syntax analysis.
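
As an illustration only, the scanner described above can be sketched as a small hand-coded automaton. The token classes (ID, NUM, OP) and the operator set are assumptions made for this sketch; they are not taken from the slides.

```python
def tokenize(text):
    """Scan `text` with a small hand-coded DFA and return (kind, lexeme) pairs."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():                 # whitespace only separates tokens
            i += 1
            continue
        start = i
        if ch.isalpha() or ch == "_":    # state ID: letter (letter | digit)*
            while i < len(text) and (text[i].isalnum() or text[i] == "_"):
                i += 1
            tokens.append(("ID", text[start:i]))
        elif ch.isdigit():               # state NUM: digit+
            while i < len(text) and text[i].isdigit():
                i += 1
            tokens.append(("NUM", text[start:i]))
        elif ch in "+-*/=()":            # single-character operators (assumed set)
            i += 1
            tokens.append(("OP", ch))
        else:                            # no transition defined: lexical error
            raise ValueError(f"lexical error at position {i}: {ch!r}")
    return tokens


print(tokenize("x = y + 42"))
```

The resulting list of tokens is what would be handed to the parser.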





Syntax Analyzer

There are three general types of parsers for grammars:

• universal,
• top-down, and
• bottom-up.

Universal parsing methods such as the Cocke-Younger-Kasami algorithm and Earley's algorithm can parse any grammar, but they are too inefficient to use in production compilers.

The methods commonly used in compilers can be classified as being either top-down or bottom-up.

As implied by their names, top-down methods build parse trees from the top (root) to the bottom (leaves), while bottom-up methods start from the leaves and work their way up to the root.
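
A minimal sketch of the top-down idea is a recursive-descent parser, in which each nonterminal becomes a function that starts at the root and expands downwards. The grammar below (E → T ('+' T)*, T → F ('*' F)*, F → id | '(' E ')') and the tuple-based tree format are assumptions chosen for the sketch.

```python
def parse(tokens):
    """Return a nested-tuple parse tree for a list of token strings."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {peek()!r}")
        pos += 1

    def expr():                      # E -> T ('+' T)*
        node = term()
        while peek() == "+":
            expect("+")
            node = ("+", node, term())
        return node

    def term():                      # T -> F ('*' F)*
        node = factor()
        while peek() == "*":
            expect("*")
            node = ("*", node, factor())
        return node

    def factor():                    # F -> id | '(' E ')'
        nonlocal pos
        tok = peek()
        if tok == "(":
            expect("(")
            node = expr()
            expect(")")
            return node
        if tok is not None and tok.isidentifier():
            pos += 1
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"trailing input at token {peek()!r}")
    return tree


print(parse(["a", "+", "b", "*", "c"]))
```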



INTERMEDIATE CODE

• Intermediate code is the form into which the source code is translated before machine code is generated.
• Intermediate code lies between the high-level language and the machine language.


CODE GENERATION



➢ A code generator converts the intermediate representation of the source code into a form that can be readily executed by the machine.

➢ A code generator is expected to generate correct code.

➢ The code generator should be designed so that it can be easily implemented, tested, and maintained.
A Code Generator has 3 primary tasks

– Instruction Selection

– Register Allocation and Assignment

– Instruction Ordering



Issues in the design of code generator

1. Input to the code generator
2. Target program
3. Instruction selection
4. Memory management
5. Register allocation
6. Evaluation order


Issues in the Design of a Code Generator
1)Input to the Code Generator
• We assume that the front end has
– scanned, parsed, and translated the source program into a reasonably detailed intermediate representation (IR);
– performed type checking and type conversion, and detected obvious semantic errors;
– built a symbol table that can provide the run-time addresses of the data objects.
– The intermediate representation may be
• Three-address representation: quadruples, triples, indirect triples
• Linear representation: postfix notation
• Virtual machine representation: bytecode, stack-machine code
• Graphical representation: syntax tree, DAG


Issues in the Design of a Code Generator
2)Target Programs
➢ The output of the code generator is the target program.
➢ The target architecture must be well understood.
➢ It significantly influences the difficulty of code generation.
➢ The most common target-machine architectures are RISC, CISC, and stack based.

A RISC machine typically has
➢ Many registers
➢ Three-address instructions
➢ Simple addressing modes
➢ A relatively simple instruction-set architecture


A CISC machine typically has
➢ Few registers
➢ Two-address instructions
➢ A variety of addressing modes
➢ Several register classes
➢ Variable-length instructions
➢ Instructions with side effects


Issues in the Design of a Code Generator
Stack-based machine

• Operations are done by pushing operands onto a stack and then performing the operations on the operands at the top of the stack.
• To achieve high performance, the top of the stack is typically kept in registers.
• Stack-based machines almost disappeared because it was felt that the stack organization was too limiting and required too many swap and copy operations.
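
The push-then-operate style can be sketched with a tiny interpreter. The instruction names (push, pop, add, sub, mul) and the tuple encoding are assumptions made for illustration; they do not correspond to any particular real stack machine.

```python
def run(program, memory):
    """Execute a list of stack-machine instructions against a dict of variables."""
    stack = []
    for instruction in program:
        opcode = instruction[0]
        if opcode == "push":                  # push the value of a variable
            stack.append(memory[instruction[1]])
        elif opcode == "pop":                 # store the top of stack into a variable
            memory[instruction[1]] = stack.pop()
        elif opcode in ("add", "sub", "mul"):
            right = stack.pop()               # operands come off the top of the stack
            left = stack.pop()
            if opcode == "add":
                stack.append(left + right)
            elif opcode == "sub":
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return memory


# t = (a + b) * c
program = [("push", "a"), ("push", "b"), ("add",),
           ("push", "c"), ("mul",), ("pop", "t")]
print(run(program, {"a": 2, "b": 3, "c": 4})["t"])
```

Note that no instruction names a register or a result location; every operand position is implied by the stack.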



• Stack-based architectures were revived with the
introduction of the Java Virtual Machine (JVM)
• The JVM is a software interpreter for Java
bytecodes, an intermediate language produced
by Java compilers.
• The interpreter provides software compatibility
across multiple platforms.
• Just-in-time (JIT) compilers translate bytecodes at run time into the native hardware instruction set of the target machine.
• The output of the code generator is the target
program.
• The target program may be
– Absolute machine language
• It can be placed in a fixed location in memory and immediately executed.
– Relocatable machine language
• Allows subprograms to be compiled separately.
• A set of relocatable object modules can be linked together and loaded for execution by a linker and loader.
– Assembly language
• Makes code generation somewhat easier, at the cost of an additional assembly step.


Issues in the Design of a Code Generator
3) Instruction Selection(1)
• The code generator must map the IR program
into a code sequence that can be executed by the
target machine.
• The complexity of performing this mapping is determined by factors such as:
• the level of the IR
• the nature of the instruction-set architecture
• the desired quality of the generated code.


Issues in the Design of a Code Generator
3) Instruction Selection(2)
1) Level of the IR

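• If the IR is high level:
– each IR statement may be translated into a sequence of machine instructions using code templates; such statement-by-statement code generation often produces poor code that needs further optimization.

• If the IR is low level:
– it reflects details of the underlying machine, so the code generator can use it to generate more efficient code sequences.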


Issues in the Design of a Code Generator
3) Instruction Selection(3)
2) Nature of the instruction-set
architecture
• For example, the uniformity and completeness
of the instruction set are important factors.
• If the target machine does not support each
data type in a uniform manner, then each
exception to the general rule requires special
handling.
• On some machines, for example, floating-
point operations are done using separate
registers.
Design of a Simple Code
Generator
• Instructions that operate only on registers have the lowest cost; every memory operand adds to the cost.

• For example,

MOV R1, R2 - COST 1 (register to register)

MOV a, R1 - COST 2 (memory variable to register, and vice versa)

MOV a, b - COST 3 (memory variable to memory variable)

Presidency University, Bengaluru


MODE               FORM    ADDRESS                      COST
ABSOLUTE           M       M                            1
REGISTER           R       R                            0
INDEXED            C(R)    C + CONTENTS(R)              1
INDIRECT REGISTER  *R      CONTENTS(R)                  0
INDIRECT INDEXED   *C(R)   CONTENTS(C + CONTENTS(R))    1
LITERAL            #C      C                            1


HOW TO FIND THE INSTRUCTION COST?

Instruction cost = 1 + cost of source address mode + cost of destination address mode
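
As a sketch of how this formula can be applied mechanically, the function below classifies each operand by its addressing mode and sums the costs. The register naming pattern (R followed by digits) and the textual instruction format "OP source,destination" are assumptions of this sketch, and only two-operand instructions are handled.

```python
import re

REGISTER = re.compile(r"R\d+")   # assumed register names: R0, R1, ...


def operand_cost(operand):
    """Added cost of one operand according to its addressing mode."""
    operand = operand.strip()
    if operand.startswith("#"):          # literal #C
        return 1
    if operand.startswith("*"):          # indirection itself adds no cost
        operand = operand[1:]
    if REGISTER.fullmatch(operand):      # register R or indirect register *R
        return 0
    return 1                             # absolute M, indexed C(R), indirect indexed *C(R)


def instruction_cost(instruction):
    """Cost = 1 + cost of source mode + cost of destination mode."""
    _, operands = instruction.split(None, 1)
    source, destination = operands.split(",")
    return 1 + operand_cost(source) + operand_cost(destination)


for text in ["MOV R0,R1", "MOV R0,M", "MOV 4(R0),M", "MOV #1,R0"]:
    print(text, instruction_cost(text))
```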

Issues in the Design of a Code Generator
3) Instruction Selection(4)
3)The quality of the generated code:

• The quality of the generated code is determined by its speed and size.
– For example, the three-address statement a = a + 1 may be translated into the sequence
LD R0, a
ADD R0, R0, #1
ST a, R0
– If the target machine has an increment instruction (INC), the single instruction INC a would be more efficient.
– We need to know instruction costs in order to design good code sequences.
– However, accurate cost information is often difficult to obtain.


• Instruction selection is important to obtain efficient code.
• Suppose we translate the three-address code
      x := y + z
  to:
      MOV y, R0
      ADD z, R0
      MOV R0, x
• With the same template, a := a + 1 becomes
      MOV a, R0
      ADD #1, R0
      MOV R0, a        Cost = 6
  Better:
      ADD #1, a        Cost = 3
  Better still, if the machine has an increment instruction:
      INC a            Cost = 2
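
The choice between these sequences can be sketched as a small template selector. The function name translate, the has_inc flag, and the rule that numeric operands are written as #literals are assumptions of this sketch; only + and - are covered.

```python
OPCODES = {"+": "ADD", "-": "SUB"}   # assumed mapping from IR operators to mnemonics


def fmt(operand):
    """Write numeric constants as #literals, leave names unchanged."""
    return f"#{operand}" if operand.isdigit() else operand


def translate(x, y, op, z, has_inc=False):
    """Select instructions for x := y op z (two-address form: OP source,destination)."""
    if x == y and op == "+" and z == "1" and has_inc:
        return [f"INC {x}"]                          # special-purpose instruction
    if x == y:
        return [f"{OPCODES[op]} {fmt(z)},{x}"]       # operate directly on x
    return [f"MOV {fmt(y)},R0",                      # general three-instruction template
            f"{OPCODES[op]} {fmt(z)},R0",
            f"MOV R0,{x}"]


print(translate("x", "y", "+", "z"))
print(translate("a", "a", "+", "1"))
print(translate("a", "a", "+", "1", has_inc=True))
```

The more special cases the selector recognises, the cheaper the emitted code, which is why the instruction set of the target strongly shapes this phase.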
Issues in the Design of a Code Generator
4. Memory Management

➢ Mapping the names in the source program to the addresses of data objects is done cooperatively by the front end and the code generator.

➢ A name in a three-address statement refers to the symbol-table entry for that name.

➢ From the symbol-table entry, a relative address can then be determined for the name.


Issues in the Design of a Code Generator
5) Register allocation (1)
• A key problem in code generation is deciding what values to hold in what registers.
• Instructions involving
– register operands are usually shorter and faster;
– memory operands are larger and comparatively slow.
• Efficient utilization of registers is particularly important in code generation.
• The use of registers is subdivided into two subproblems:
– register allocation, during which we select the set of variables that will reside in registers at a point in the program;
– register assignment, during which we pick the specific register that a variable will reside in.


• For example certain machines require register-
pairs for some operands and results.
– M x, y multiplication instruction
– where x, the multiplicand, is the even register of
an even/odd register pair and
– y, the multiplier, is the odd register.
– The product occupies the entire even/odd register
pair.



– D x, y the division instruction
– where the dividend occupies an even/odd register
pair whose even register is x;
– the divisor is y.
– After division, the even register holds the
remainder and the odd register the quotient.



• Now consider the two three-address code sequences below, in which the only difference is the operator in the second statement:

(a)            (b)
t = a + b      t = a + b
t = t * c      t = t + c
t = t / d      t = t / d


The shortest assembly-code sequences for (a) and (b) are

(a)              (b)
L  R1, a         L    R0, a
A  R1, b         A    R0, b
M  R0, c         A    R0, c
D  R0, d         SRDA R0, 32
ST R1, t         D    R0, d
                 ST   R1, t

Here SRDA stands for Shift-Right-Double-Arithmetic, and SRDA R0, 32 shifts the dividend into R1 and clears R0 so that all its bits equal its sign bit.
Issues in the Design of a Code Generator
6) Evaluation order
• It affects the efficiency of the target code.
• Some computation orders require fewer registers
to hold intermediate results than others.
• Picking a best order in the general case is a
difficult NP-complete problem.
• Initially, we shall avoid the problem by generating
code for the three-address statements in the
order in which they have been produced by the
intermediate code generator.
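CONSIDER THE EXAMPLE

Expression: a + b - (c + d) * e

Original order                Code
t1 := a + b                   MOV a, R0
t2 := c + d                   ADD b, R0
t3 := e * t2                  MOV R0, t1
t4 := t1 - t3                 MOV c, R1
                              ADD d, R1
                              MOV e, R0
                              MUL R1, R0
                              MOV t1, R1
                              SUB R0, R1
                              MOV R1, t4

Reordered                     Code
t2 := c + d                   MOV c, R0
t3 := e * t2                  ADD d, R0
t1 := a + b                   MOV e, R1
t4 := t1 - t3                 MUL R0, R1
                              MOV a, R0
                              ADD b, R0
                              SUB R1, R0
                              MOV R0, t4

The reordered version needs 8 instructions instead of 10, because t1 no longer has to be stored to memory and reloaded.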


Role of Code Generator

➢ Translates the IR into the target program.

➢ Must preserve the semantics of the source program.
– The meaning intended by the programmer in the original source program should carry forward through each compilation stage up to code generation.

➢ The target code should be of high quality,
– measured by execution time, space, or energy.
➢ The code generator itself should run efficiently.
➢ Its main tasks are instruction selection, register allocation, and instruction ordering.
Familiarity with the target machine and its instruction set is a prerequisite for designing a good code generator.
➢ The target computer is a byte-addressable machine with 4 bytes to a word.
➢ It has n general-purpose registers, R0, R1, . . . , Rn-1.
➢ It has two-address instructions of the form:
op source, destination
where op is an op-code, and source and destination are data fields.
Typical Target Machine



• Load operations: LD r, x and LD r1, r2
• Store operations: ST x, r
• Computation operations: OP dst, src1, src2
• Two-address form: OP source, destination (e.g. ADD a, R0)
• Unconditional jumps: BR L
• Conditional jumps: Bcond r, L (e.g. BLTZ r, L)


Example Code Generation



Instruction costs

Instruction cost = 1 + cost of the source and destination address modes. This cost corresponds to the length of the instruction.

• Address modes involving registers have cost zero.
• Address modes involving a memory location or a literal have cost one.
• Instruction length should be minimized if space is important. Doing so also minimizes the time taken to fetch and perform the instruction.
Instruction          Operation                                           Cost
MOV R0,R1            Store contents(R0) into register R1                 1
MOV R0,M             Store contents(R0) into memory location M           2
MOV M,R0             Store contents(M) into register R0                  2
MOV 4(R0),M          Store contents(4+contents(R0)) into M               3
MOV *4(R0),M         Store contents(contents(4+contents(R0))) into M     3
MOV #1,R0            Store 1 into R0                                     2
ADD 4(R0),*12(R1)    Add contents(4+contents(R0)) to the value at        3
                     location contents(12+contents(R1))
Utilizing Addressing Modes

• Suppose we translate a := b + c into
MOV b,R0
ADD c,R0
MOV R0,a
(cost = 2 + 2 + 2 = 6)
• Assuming the addresses of a, b, and c are stored in R0, R1, and R2:
MOV *R1,*R0
ADD *R2,*R0
(cost = 1 + 1 = 2)
• Assuming R1 and R2 contain the values of b and c:
ADD R2,R1
MOV R1,a
(cost = 1 + 2 = 3)


COST GENERATION EXAMPLE



Runtime Storage management

❖ Information needed during an execution of a procedure is kept in a block of storage called an activation record, which includes storage for names local to the procedure.

❖ The two standard storage allocation strategies are:
STATIC ALLOCATION
STACK ALLOCATION


Runtime Storage management

➢ In static allocation, the position of an activation record in memory is fixed at compile time.
➢ In stack allocation, a new activation record is pushed onto the stack for each execution of a procedure. The record is popped when the activation ends.
ACTIVATION RECORD

Runtime Storage management

The run-time allocation and deallocation of activation records is associated with the following three-address statements:
1. Call
2. Return
3. Halt
4. Action, a placeholder for other statements


Runtime Storage management

The run-time memory is divided into areas for:

1. Code (code for function 1, code for function 2, ..., code for function n)
2. Static data
3. Stack
4. Heap

[Figure: memory layout, from top to bottom: code, static data, stack, free space, heap]


➢ PASCAL and C use extensions of the control
stack to manage activations of procedures
➢ Stack contains information about register values,
value of program counter and data objects whose
lifetimes are contained in that of an activation
➢ Heap holds all other information. For example,
activations that cannot be represented as a
tree.
➢ By convention, stack grows down and the top of
the stack is drawn towards the bottom of this slide
(value of top is usually kept in a register)
Activation Record
➢ Information needed by a single execution of a procedure is managed using an activation record or frame.
▪ Not all compilers use all of the fields.
▪ Pascal and C push the activation record onto the run-time stack when a procedure is called and pop the activation record off the stack when control returns to the caller.

Fig: A typical activation record (fields from top to bottom)
returned value
actual parameters
optional control link
optional access link
saved machine status
local data
temporaries
Storage Allocation Strategies -
Static

Static allocation lays out storage for all data objects at compile time.
Restrictions:
✓ The size and alignment requirements of every object must be known at compile time.
✓ No recursion.
✓ No dynamic data structures.


Storage Allocation Strategies - Stack

Stack allocation manages the run-time storage as a stack.
The activation record is pushed on as a function is entered.
The activation record is popped off as a function exits.
Restrictions:
✓ Values of locals cannot be retained when an activation ends.
✓ A called activation cannot outlive its caller.


Storage Allocation Strategies - Heap

Heap allocation –
- allocates and deallocates storage as needed at run time from a data area known as the heap.

Most flexible: no longer requires the activations of procedures to be LIFO.

Least efficient: needs true dynamic memory management.


Example Stack Allocation

Program skeleton:

program sort
var
procedure readarray;
    ….
function partition(…)
    ….
procedure quicksort(…)
    ……
    partition
    quicksort
    quicksort
    ….
begin
    ……
    readarray
    quicksort
end

Activation tree:

Main
    readarray
    quicksort(1, 9)
        partition(1, 9)
        quicksort(1, 3)
            partition(1, 3)
            quicksort(1, 0)
Explanation - Stack Allocation

What makes this happen is known as the calling sequence (how a procedure call is implemented).
A calling sequence allocates an activation record and enters information into its fields (it pushes the activation record).

The counterpart of the calling sequence is the return sequence.
The return sequence restores the state of the machine so that the calling procedure can continue execution.


Explanation - Sequences

A possible calling sequence:
❖ The caller evaluates the actuals and pushes them onto the stack.
❖ The caller saves the return address (pc) and the old value of sp on the stack.
❖ The caller increments sp.
❖ The callee saves registers and other status information.
❖ The callee initializes its local variables and begins execution.

A possible return sequence:
❖ The callee places a return value next to the activation record of the caller.
❖ The callee restores the other registers and sp, and returns (jumps to pc).
❖ The caller copies the return value into its activation record.
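
The push-on-call and pop-on-return behaviour can be sketched with a small model of the control stack. The class names, the record fields, and the string form of return addresses are assumptions for illustration; a real calling sequence manipulates sp and machine registers rather than Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class ActivationRecord:
    procedure: str
    actuals: tuple                      # actual parameters pushed by the caller
    return_address: str                 # where the caller resumes (assumed label)
    local_data: dict = field(default_factory=dict)
    returned_value: object = None


class ControlStack:
    def __init__(self):
        self.records = []               # top of stack is the end of the list
        self.trace = []                 # human-readable log of pushes and pops

    def call(self, procedure, actuals, return_address):
        """Calling sequence: allocate a record, fill its fields, push it."""
        record = ActivationRecord(procedure, tuple(actuals), return_address)
        self.records.append(record)
        self.trace.append(f"push {procedure}{record.actuals}")
        return record

    def ret(self, value=None):
        """Return sequence: pop the record and hand back the resume point."""
        record = self.records.pop()
        record.returned_value = value
        self.trace.append(f"pop {record.procedure}{record.actuals}")
        return record.return_address, value


stack = ControlStack()
stack.call("main", (), "exit")
stack.call("quicksort", (1, 9), "main+1")
stack.call("partition", (1, 9), "quicksort+1")
print(stack.ret(4))      # partition returns its pivot position to quicksort
print(stack.trace)
```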
STACK ALLOCATION



A SIMPLE CODE GENERATOR

❖ A code generator generates target code for a sequence of three-address statements and effectively uses registers to store operands of the statements.
❖ The code generator has to track both the registers (for availability) and the addresses (locations of values) while generating the code. It does so with two data structures: the register descriptor and the address descriptor.


A SIMPLE CODE GENERATOR

Register descriptor
➢ Used to inform the code generator about the availability of registers.
➢ Keeps track of the values stored in each register.

Address descriptor
➢ Used to keep track of the memory locations where the values of identifiers are stored.


A SIMPLE CODE GENERATOR-ALGORITHM

For each three-address statement of the form x := y op z:

➢ Invoke a function getreg to determine the location L where the result of the computation y op z should be stored.

➢ Consult the address descriptor for y to determine y’, the current location of y. If the value of y is not already in L, generate the instruction MOV y’, L to place a copy of y in L.

➢ Generate the instruction OP z’, L, where z’ is a current location of z. Update the address descriptor of x to indicate that x is in location L. If L is a register, update its descriptor to indicate that it contains the value of x, and remove x from all other register descriptors.

➢ If the current values of y or z have no next uses, are not live on exit from the block, and are in registers, alter the register descriptor to indicate that, after execution of x := y op z, those registers will no longer contain y or z.
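
A simplified sketch of these steps follows. It is not the full algorithm: getreg here only reuses y's register or takes a free one and does not spill, each register holds at most one name, next-use information is computed by a forward scan, and the tuple format (x, y, OP, z) with OP already a mnemonic is an assumption of the sketch.

```python
def generate(block, registers, live_out):
    """Generate two-address code for a basic block of (x, y, OP, z) statements."""
    reg_desc = {r: None for r in registers}   # register descriptor: register -> name
    addr_desc = {}                            # address descriptor: name -> set of locations
    code = []

    def locations(name):
        # Names not yet computed in this block are assumed to live in memory.
        return addr_desc.setdefault(name, {name})

    def best_location(name):
        for loc in sorted(locations(name)):
            if loc in reg_desc:
                return loc                    # prefer a register copy
        return name                           # otherwise use the memory location

    def used_later(name, index):
        for x, y, _, z in block[index + 1:]:
            if name in (y, z):
                return True
            if name == x:
                return False                  # redefined before any use
        return name in live_out

    def getreg(index, y):
        loc = best_location(y)
        if loc in reg_desc and not used_later(y, index):
            return loc                        # y is dead afterwards: reuse its register
        for r in registers:
            if reg_desc[r] is None:
                return r
        raise RuntimeError("no free register; spilling is not modelled in this sketch")

    for index, (x, y, op, z) in enumerate(block):
        L = getreg(index, y)
        y_loc, z_loc = best_location(y), best_location(z)
        if y_loc != L:
            code.append(f"MOV {y_loc},{L}")
        code.append(f"{op} {z_loc},{L}")
        old = reg_desc[L]
        if old is not None and old != x:
            addr_desc[old].discard(L)         # L no longer holds its old value
        for r in registers:
            if r != L and reg_desc[r] == x:
                reg_desc[r] = None            # remove x from other register descriptors
        reg_desc[L] = x
        addr_desc[x] = {L}
        for name in (y, z):                   # free registers of dead operands
            if name != x and not used_later(name, index):
                for r in registers:
                    if reg_desc[r] == name:
                        reg_desc[r] = None
                        addr_desc[name].discard(r)

    for name in sorted(live_out):             # store live values that exist only in registers
        locs = addr_desc.get(name, {name})
        if name not in locs:
            code.append(f"MOV {sorted(locs)[0]},{name}")
            locs.add(name)
    return code


block = [("t", "a", "SUB", "b"), ("u", "a", "SUB", "c"),
         ("v", "t", "ADD", "u"), ("d", "v", "ADD", "u")]
print("\n".join(generate(block, ["R0", "R1"], {"d"})))
```

The example block corresponds to d := (a-b) + (a-c) + (a-c) with d live at the end of the block.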



Generating Code for Assignment
Statements:

The assignment d := (a-b) + (a-c) + (a-c) might be translated into the following three-address code sequence:
t := a - b
u := a - c
v := t + u
d := v + u
with d live at the end.


Code Generator Algorithm

A SIMPLE CODE GENERATOR-EXAMPLE

INTERMEDIATE CODE   MACHINE CODE   RD (register descriptor)   AD (address descriptor)
1. X = Y + Z        MOV Y,R0       R0 = X                     X = R0
                    ADD Z,R0
2. Z = X * 5        MUL 5,R0       R0 = Z                     Z = R0
3. Y = Z – 7        MOV R0,R1      R0 = Z                     Z = R0
                    SUB 7,R1       R1 = Y                     Y = R1
4. X = Z + Y        MOV R0,Z       R0 = Z                     Z = R0
                    ADD R1,R0      R1 = Y                     Y = R1
                                   R0 = X                     X = R0
                                   R1 = Y                     Y = R1 (Z = MEMORY)
                    MOV R1,Y                                  Y = (MEMORY)
                    MOV R0,X                                  Z = (MEMORY)
A SIMPLE CODE GENERATOR-EXAMPLE

INTERMEDIATE CODE   MACHINE CODE
1. X = Y + Z        MOV Y,R0
                    ADD Z,R0
2. Z = X * 5        MUL 5,R0
3. Y = Z – 7        MOV R0,R1
                    SUB 7,R1
4. X = Z + Y        MOV R0,Z
                    ADD R1,R0
                    MOV R1,Y
                    MOV R0,X


Simple Code Generator - Example

Simple Code Generator

Simple Code Generator – Indexed Assignment

Simple Code Generator - Pointer Assignment

Simple Code Generator – Conditional Statements

DAG representation for basic blocks

A DAG for a basic block is a directed acyclic graph with the following labels on nodes:

1. The leaves of the graph are labeled by unique identifiers, which can be variable names or constants.

2. Interior nodes of the graph are labeled by an operator symbol.

3. Nodes are also given a sequence of identifiers as labels; these identifiers hold the computed value.


• A DAG is a data structure.

• It is used to implement transformations on basic blocks.

• A DAG provides a good way to determine common sub-expressions.

• It gives a pictorial representation of how the value computed by a statement is used in subsequent statements.


Algorithm for construction of DAG
Input: A basic block.
Output: A DAG with the following information:
• Each node contains a label. For leaves, the label is an identifier; for interior nodes, it is an operator symbol.
• Each node contains a list of attached identifiers that hold the computed value.
The statements of the block have one of three forms:
Case (i) x := y OP z
Case (ii) x := OP y
Case (iii) x := y


Method:
Step 1:
If node(y) is undefined, create a leaf node(y).
For case (i), if node(z) is undefined, create a leaf node(z).
Step 2:
• For case (i), check whether there is a node(OP) whose left child is node(y) and whose right child is node(z); if not, create it. Let n be this node.
• For case (ii), check whether there is a node(OP) with the single child node(y); if not, create it. Let n be this node.
• For case (iii), let n be node(y).
Step 3:
Delete x from the list of attached identifiers for node(x). Append x to the list of attached identifiers for the node n found in step 2. Finally, set node(x) to n.
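
The three steps can be sketched as follows. The statement encoding (x, op, y, z), with z set to None for unary operators and op set to None for copies, and the dictionary-based node representation are assumptions of this sketch.

```python
def build_dag(block):
    """Build a DAG for a basic block given as (x, op, y, z) tuples."""
    nodes = []       # node = {"label": ..., "children": tuple of node indices, "ids": [...]}
    node_of = {}     # node(identifier): index of the node holding its current value
    existing = {}    # (op, children) -> index, used to find an already-built interior node

    def leaf(name):
        # Step 1: create a leaf if the identifier has no node yet.
        if name not in node_of:
            nodes.append({"label": name, "children": (), "ids": []})
            node_of[name] = len(nodes) - 1
        return node_of[name]

    for x, op, y, z in block:
        ny = leaf(y)
        nz = leaf(z) if z is not None else None
        # Step 2: find or create the node n for the right-hand side.
        if op is None:                         # case (iii): x := y
            n = ny
        else:                                  # cases (i) and (ii)
            children = (ny,) if nz is None else (ny, nz)
            key = (op, children)
            if key not in existing:
                nodes.append({"label": op, "children": children, "ids": []})
                existing[key] = len(nodes) - 1
            n = existing[key]
        # Step 3: move x to the attached-identifier list of n.
        if x in node_of and x in nodes[node_of[x]]["ids"]:
            nodes[node_of[x]]["ids"].remove(x)
        nodes[n]["ids"].append(x)
        node_of[x] = n
    return nodes


block = [("a", "+", "b", "c"), ("b", "-", "a", "d"),
         ("c", "+", "b", "c"), ("d", "-", "a", "d")]
for index, node in enumerate(build_dag(block)):
    print(index, node)
```

In this block the second and fourth statements compute the same value, so b and d end up attached to the same node, which is how the DAG exposes the common sub-expression.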
Application of DAG

The DAG is used in
1. Determining the common sub-expressions.
2. Determining which names are used inside the block but computed outside the block.
3. Determining which statements of the block could have their computed value used outside the block.
4. Simplifying the list of quadruples by eliminating the common sub-expressions and by not performing assignments of the form x := y unless they are necessary.


Example: DAG Construction
Consider the following expression and construct a DAG for it-
(a+b)x(a+b+c)

Three Address Code for the given expression is-

T1 = a + b
T2 = T1 + c
T3 = T1 x T2

Now, Directed Acyclic Graph is-



Example: DAG Construction (contd)
Consider the following expression and construct a DAG for it-
(((a+a)+(a+a))+((a+a)+(a+a)))



Example: DAG Construction (contd)
Consider the following block and construct a DAG for it-

(1) a = b x c
(2) d = b
(3) e = d x c
(4) b = e
(5) f = b + c
(6) g = f + d





Example: DAG Construction (contd)

Consider a block

a = b + c
b = a - d
c = b + c
d = a - d

[Figure: DAG for the basic block]


Example: DAG Construction (contd)

• a = b + c
• b = b - d
• c = c + d
• e = b + c




DAG Representation of Basic Block
→ DAGs are useful data structures for implementing transformations on basic blocks.

→ A DAG gives a picture of how the value computed by a statement is used in subsequent statements.

→ Constructing a DAG from three-address statements is a good way of determining common sub-expressions.

A DAG for a basic block has the following labels on the nodes:
▪ Leaves are labeled by unique identifiers, either variable names or constants.

▪ Interior nodes are labeled by an operator symbol.

▪ Nodes are also optionally given a sequence of identifiers for labels.
DAG Representation of Basic Block
Construct a DAG for a basic block.

Process each statement of the block.

Statement of the form x := y + z

o Look for the nodes that represent the “current” values of y and z.

o These could be leaves or interior nodes of the DAG.

o Interior nodes arise if y or z has been evaluated by previous statements of the block. We then create a node labeled + and give it two children (left: node for y, right: node for z).


DAG Representation of Basic Block

Given a function node(identifier), it returns the node currently associated with the identifier.

Statement types: (i) x := y op z (ii) x := op y (iii) x := y

1. If node(y) is undefined, create a leaf labeled y and let node(y) be this leaf. In case (i), do the same for z.

2. In case (i), determine whether there is a node labeled op with node(y) and node(z) as its left and right children; if not, create such a node. Let n be the node found or created.

3. In case (ii), determine whether there is a node labeled op whose lone child is node(y); if not, create such a node. In case (iii), let n be node(y).


DAG Representation of Basic Block
Input - a basic block.

Output - a DAG in which
o each node has a label.
o the label is an identifier for leaves.
o the label is an operator symbol for interior nodes.
o each node has a list of attached identifiers.
Method -
o create nodes with one or two children (left and right).
o create a linked list of attached identifiers for each node.
o maintain the set of all identifiers that have a node associated with them.
o node(identifier) represents the value that the identifier has at the current point in the DAG construction process.
o the symbol-table record for an identifier indicates the value of node(identifier).




DAG Representation of Basic Block
DAGs are useful for:
o Removing common local sub-expressions.
o Renaming temporaries.
o Finding names used inside the block but evaluated outside.


Basic Blocks and Flow Graphs
Basic Blocks
A basic block is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end without halt or possibility of branching except at the end.

The following sequence of three-address statements forms a basic block:
t1 := a * a
t2 := a * b
t3 := 2 * t2
Basic Blocks and Flow Graphs
Basic Block Construction:

Algorithm: Partition into basic blocks

Input: A sequence of three-address statements

Output: A list of basic blocks with each three-address statement in exactly one block

Method:

1. We first determine the set of leaders, the first statements of basic blocks. The rules we use are the following:
• The first statement is a leader.
• Any statement that is the target of a conditional or unconditional goto is a leader.
• Any statement that immediately follows a goto or conditional goto statement is a leader.

2. For each leader, its basic block consists of the leader and all statements up to but not including the next leader or the end of the program.
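
The leader rules can be sketched directly. The statement encoding, a list of (label or None, text) pairs in which a jump is any text containing "goto LABEL", is an assumption made for this sketch.

```python
def partition(statements):
    """Split (label, text) statements into basic blocks, returned as lists of indices."""
    label_index = {label: i for i, (label, _) in enumerate(statements) if label}
    leaders = {0} if statements else set()           # rule 1: first statement
    for i, (_, text) in enumerate(statements):
        if "goto" in text:
            target = text.split("goto", 1)[1].strip()
            leaders.add(label_index[target])         # rule 2: target of a goto
            if i + 1 < len(statements):
                leaders.add(i + 1)                   # rule 3: statement after a goto
    ordered = sorted(leaders)
    blocks = []
    for k, start in enumerate(ordered):
        end = ordered[k + 1] if k + 1 < len(ordered) else len(statements)
        blocks.append(list(range(start, end)))       # leader up to the next leader
    return blocks


code = [
    (None, "I = 1"),
    ("WSTART", "if (I < 10) then goto TRUE"),
    (None, "goto EXIT"),
    ("TRUE", "T1 = b[i]"),
    (None, "T2 = T1 + c"),
    (None, "a = T2"),
    (None, "T3 = I + 1"),
    (None, "I = T3"),
    (None, "goto WSTART"),
    ("EXIT", "T4 = a + c"),
    (None, "Result = T4"),
]
print(partition(code))
```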
Basic Blocks and Flow Graphs
Construct the basic blocks for the following statements:

I = 1
While (I < 10)
{
    a = b[i] + c;
    i++;
}
Result = a + c

Three-address code:

        I = 1
WSTART: if (I < 10) then goto TRUE
        goto EXIT
TRUE:   T1 = b[i]
        T2 = T1 + c
        a = T2
        T3 = I + 1
        I = T3
        goto WSTART
EXIT:   T4 = a + c
        Result = T4

Basic blocks:

L1: I = 1
L2: if (I < 10) then goto TRUE
L3: goto EXIT
L4: T1 = b[i]
    T2 = T1 + c
    a = T2
    T3 = I + 1
    I = T3
    goto WSTART
L5: T4 = a + c
    Result = T4
Basic Blocks and Flow Graphs
Flow Graphs
A flow graph is a directed graph containing the flow-of-control information for the set of basic blocks making up a program. The nodes of the flow graph are the basic blocks.

L1: I = 1
L2: if (I < 10) then goto TRUE
L3: goto EXIT
L4: T1 = b[i]
    T2 = T1 + c
    a = T2
    T3 = I + 1
    I = T3
    goto WSTART
L5: T4 = a + c
    Result = T4

[Figure: flow graph with one node per basic block listed above]

• The block containing the first statement is the initial node. There is an edge from a block B1 to a block B2 if B2 immediately follows B1 in the program and B1 does not end in an unconditional jump, or if the last statement of B1 jumps to the first statement of B2.
• In either case, B1 is a predecessor of B2, and B2 is a successor of B1.
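
The two edge rules can be sketched as below. Blocks are given as lists of (label or None, text) statements, labels are assumed to appear only on the first statement of a block, and block numbers are 0-based; all of these are conventions of the sketch, not of the slides.

```python
def flow_edges(blocks):
    """Return the sorted list of (from_block, to_block) edges of the flow graph."""
    block_of_label = {}
    for b, block in enumerate(blocks):
        label = block[0][0]                  # jump targets are always leaders
        if label:
            block_of_label[label] = b
    edges = set()
    for b, block in enumerate(blocks):
        last = block[-1][1]
        if "goto" in last:                   # edge to the target of the jump
            target = last.split("goto", 1)[1].strip()
            edges.add((b, block_of_label[target]))
        unconditional = last.startswith("goto")
        if not unconditional and b + 1 < len(blocks):
            edges.add((b, b + 1))            # fall through to the next block
    return sorted(edges)


blocks = [
    [(None, "I = 1")],
    [("WSTART", "if (I < 10) then goto TRUE")],
    [(None, "goto EXIT")],
    [("TRUE", "T1 = b[i]"), (None, "T2 = T1 + c"), (None, "a = T2"),
     (None, "T3 = I + 1"), (None, "I = T3"), (None, "goto WSTART")],
    [("EXIT", "T4 = a + c"), (None, "Result = T4")],
]
print(flow_edges(blocks))
```

The back edge from the loop body to the block holding the loop test is what makes the while loop visible in the graph.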
