
Unit 4

Chapter 1

Code Generation: Issues in the design of a code generator – The target machine – Runtime storage management – Basic blocks and flow graphs – Next-use information – A simple code generator – DAG representation of basic blocks.
Code Generation

 The final phase in the compiler model is the code generator. It takes as input an intermediate representation of the source program and produces as output an equivalent target program. The code generation techniques presented below can be used whether or not an optimizing phase occurs before code generation.

 The code generated by the compiler is an object code of some lower-level programming
language, for example, assembly language.

 The source code written in a higher-level language is transformed into lower-level object code, which should have the following minimum properties:
 It should carry the exact meaning of the source code.


 It should be efficient in terms of CPU usage and memory management.
 The code generator should be designed in such a way that it can be easily implemented, tested and maintained.
The following issues arise during the code generation phase:

1. Input to code generator:


The input to the code generator is the intermediate code generated by the front end, along with information in the symbol table that determines the run-time addresses of the data objects denoted by the names in the intermediate representation. Intermediate code may be represented as quadruples, triples, indirect triples, postfix notation, syntax trees, DAGs, etc. The code generation phase proceeds on the assumption that its input is free of syntactic and static semantic errors, that the necessary type checking has taken place, and that type-conversion operators have been inserted wherever necessary.

2. Target program:
The target program is the output of the code generator. The output may be absolute machine language, relocatable machine language, or assembly language.
 Absolute machine language as output has the advantage that it can be placed in a fixed memory location and executed immediately.

 Relocatable machine language as output allows subprograms and subroutines to be compiled separately. Relocatable object modules can be linked together and loaded by a linking loader, but there is the added expense of linking and loading.

 Assembly language as output makes code generation easier. We can generate symbolic instructions and use the macro facilities of the assembler in generating code. However, an additional assembly step is needed after code generation.

3. Memory Management:
Mapping the names in the source program to the addresses of data objects is done cooperatively by the front end and the code generator. A name in a three-address statement refers to the symbol-table entry for that name, and from the symbol-table entry a relative address can be determined for the name.
4. Instruction selection:
Selecting the best instructions improves the efficiency of the program. The instruction set should be complete and uniform. Instruction speeds and machine idioms also play a major role when efficiency is considered, but if we do not care about the efficiency of the target program, instruction selection is straightforward.
For example, the two three-address statements below would be translated into the code sequence that follows:
P:=Q+R
S:=P+T
MOV Q, R0
ADD R, R0
MOV R0, P
MOV P, R0
ADD T, R0
MOV R0, S

Here the fourth instruction is redundant: it reloads the value of P that was stored by the previous instruction, which leads to an inefficient code sequence. A given intermediate representation can be translated into many code sequences, with significant cost differences between the different implementations. Prior knowledge of instruction costs is needed in order to design good sequences, but accurate cost information is often difficult to predict.
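With the redundant load removed (R0 already holds the value of P after the third instruction), one equivalent and cheaper sequence is:
MOV Q, R0
ADD R, R0
MOV R0, P
ADD T, R0
MOV R0, S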

5. Register allocation issues:

Use of registers makes computations faster than use of memory, so efficient utilization of registers is important. Register use is subdivided into two sub-problems:
 During register allocation, we select the set of variables that will reside in registers at each point in the program.

 During a subsequent register assignment phase, the specific register in which each variable will reside is picked.
As the number of variables increases, the optimal assignment of registers to variables becomes difficult; mathematically, the problem is NP-complete. Certain machines require register pairs consisting of an even-numbered register and the next odd-numbered register.
For example:
M a, b
This multiplication instruction involves a register pair: the multiplicand a is the even register of an even/odd register pair, and the multiplier b is the odd register of the pair.

6. Evaluation order:

The code generator decides the order in which instructions will be executed. The order of computations affects the efficiency of the target code. Among the many possible computation orders, some require fewer registers to hold intermediate results than others. However, picking the best order in the general case is a difficult, NP-complete problem.

7. Approaches to code generation issues:


The code generator must always generate correct code. This is essential because of the number of special cases that a code generator might face. Some of the design goals of a code generator are:
 Correct
 Easily maintainable
 Testable
 Efficient
The Target Machine
o The target computer is a byte-addressable machine with four bytes to a word.
o The target machine has n general-purpose registers, R0, R1, ...., Rn-1.
o It has two-address instructions of the form:
op source, destination
o where op is an op-code, and source and destination are data fields.
o It has the following op-codes:

ADD (add source to destination)

SUB (subtract source from destination)

MOV (move source to destination)


o The source and destination of an instruction are specified by combining registers and memory locations with address modes.

o For example : contents(a) denotes the contents of the register or memory address
represented by a. A memory location M or a register R represents itself when used as a
source or destination.
o e.g. MOV R0,M →stores the content of register R0 into memory location M.
Instruction costs:
o Instruction cost = 1 + cost for source and destination address modes. This cost
corresponds to the length of the instruction.
o Here, cost 1 means that it occupies only one word of memory.
o Address modes involving registers have cost zero. Address modes involving memory
location or literal have cost one. Instruction length should be minimized if space is
important. Doing so also minimizes the time taken to fetch and perform the instruction.

For example :

1. The instruction MOV R0, R1 copies the contents of register R0 into R1. It has cost
one, since it occupies only one word of memory.
2. The (store) instruction MOV R5, M copies the contents of register R5 into
memory location M. This instruction has cost two, since the address of memory
location M is in the word following the instruction.
3. The instruction ADD #1,R3 adds the constant 1 to the contents of register 3,and has
cost two, since the constant 1 must appear in the next word following the
instruction.
4. The instruction SUB 4(R0), *12(R1) stores the value
contents(contents(12 + contents(R1))) - contents(4 + contents(R0)) into the destination
*12(R1). The cost of this instruction is three, since the constants 4 and 12 are stored in
the next two words following the instruction.


The three-address statement a := b + c can be implemented by many different instruction sequences:

MOV b, R0
ADD c, R0
MOV R0, a        cost = 6

MOV b, a
ADD c, a         cost = 6
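If the values of b and c are assumed to already reside in registers, say R1 and R2, an even cheaper sequence is possible (assuming the value in R1 may be overwritten):

ADD R2, R1
MOV R1, a        cost = 3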
Run Time Environments

 A translation needs to relate the static source text of a program to the dynamic actions
that must occur at runtime to implement the program. The program consists of names
for procedures, identifiers etc., that require mapping with the actual memory location at
runtime.
 Runtime environment is a state of the target machine, which may include software
libraries, environment variables, etc., to provide services to the processes running in
the system.
Source Language issues
1. Activation Trees:
 A program consists of procedures. A procedure definition is a declaration that, in its simplest form, associates an identifier (the procedure name) with a statement (the body of the procedure). Each execution of a procedure is referred to as an activation of the procedure. The lifetime of an activation is the sequence of steps in the execution of the procedure. If a and b are two procedures, their activations are either non-overlapping (one is called after the other finishes) or nested. A procedure is recursive if a new activation can begin before an earlier activation of the same procedure has ended.
 An activation tree shows the way control enters and leaves activations.
 Properties of activation trees are: -
o Each node represents an activation of a procedure.
o The root shows the activation of the main function.
o The node for procedure ‘x’ is the parent of the node for procedure ‘y’ if and only if control flows from procedure x to procedure y.

 Whenever a procedure is executed, its activation record is stored on the stack, also
known as control stack. When a procedure calls another procedure, the execution of the
caller is suspended until the called procedure finishes execution. At this time, the
activation record of the called procedure is stored on the stack.
 We assume that the program control flows in a sequential manner and when a
procedure is called, its control is transferred to the called procedure. When a called
procedure is executed, it returns the control back to the caller. This type of control flow
makes it easier to represent a series of activations in the form of a tree, known as
the activation tree.
 To understand this concept, we take a piece of code as an example:

...
printf("Enter Your Name: ");
scanf("%s", username);
show_data(username);
printf("Press any key to continue...");
...
int show_data(char *user)
{
printf("Your name is %s", user);
return 0;
}
...

 Below is the activation tree of the code given.

 Now we understand that procedures are executed in a depth-first manner; thus stack allocation is the most suitable form of storage for procedure activations.
2. Activation Record:
 The execution of a procedure is called its activation. An activation record contains all
the necessary information required to call a procedure. An activation record may
contain the following units (depending upon the source language used).

Temporaries: stores temporary and intermediate values of an expression.

Local Data: stores local data of the called procedure.

Machine Status: stores the machine status, such as registers and program counter, before the procedure is called.

Control Link: stores the address of the activation record of the caller procedure.

Access Link: stores information about data that is outside the local scope.

Actual Parameters: stores the actual parameters, i.e., the parameters used to send input to the called procedure.

Return Value: stores return values.
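As a rough sketch only (the field names and sizes below are illustrative, not part of the notes), an activation record can be pictured as a C struct:

struct activation_record {
    int   return_value;          /* value handed back to the caller          */
    int   actual_params[4];      /* parameters supplied by the caller        */
    void *access_link;           /* record that holds non-local data         */
    void *control_link;          /* activation record of the caller          */
    void *saved_machine_state;   /* saved registers, program counter, ...    */
    int   local_data[8];         /* locals of the called procedure           */
    int   temporaries[8];        /* intermediate values of expressions       */
};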


Storage Organization (or) Storage Allocation Strategies (or) Runtime Storage Management (or) Storage Allocation Techniques

 Runtime storage can be subdivided to hold:


o Target code: the program code; it is static, as its size can be determined at compile time.
o Static data objects.
o Dynamic data objects: the heap.
o Automatic data objects: the stack.
 Figure (a):

 Figure (b):
 As shown in the image above, the text part of the code is allocated a fixed amount of
memory. Stack and heap memory are arranged at the extremes of total memory
allocated to the program. Both shrink and grow against each other.

 I. Static Storage Allocation: In this allocation scheme, the compilation data is bound to a fixed location in memory and does not change while the program executes. As the memory requirements and storage locations are known in advance, no runtime support package for memory allocation and de-allocation is required.
o For any program, if we create memory at compile time, the memory is created in the static area.
o If memory is created at compile time only, it is created exactly once.
o It does not support dynamic data structures, i.e., memory is created at compile time and deallocated after program completion.
o A drawback of static storage allocation is that recursion is not supported.
o Another drawback is that the size of the data must be known at compile time.
o E.g., FORTRAN was designed to permit static storage allocation.
 II. Stack Storage Allocation: Procedure calls and their activations are managed by means of stack memory allocation. It works in a last-in-first-out (LIFO) manner, and this allocation strategy is very useful for recursive procedure calls.
o Storage is organised as a stack, and activation records are pushed and popped as activations begin and end respectively. Locals are contained in activation records, so they are bound to fresh storage in each activation.
o Recursion is supported in stack allocation.
 III. Heap Storage Allocation: Variables local to a procedure are allocated and de-allocated only at runtime. Heap allocation is used to dynamically allocate memory to variables and to claim it back when the variables are no longer required. Unlike the statically allocated memory area, both the stack and the heap can grow and shrink dynamically and unpredictably, so they cannot be provided with a fixed amount of memory in the system.

o Memory allocation and deallocation can be done at any time and at any place, depending on the requirements of the user.
o Heap allocation is used to dynamically allocate memory to variables and claim it back when the variables are no longer required.
o Recursion is supported.
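A small C sketch contrasting the three strategies (the identifiers below are illustrative):

#include <stdlib.h>

static int counter = 0;      /* static allocation: one fixed location for the
                                whole run of the program                       */

void demo(int n)
{
    int local = n * 2;       /* stack allocation: a fresh 'local' in every
                                activation, released when demo() returns       */

    int *buf = malloc(n * sizeof *buf);   /* heap allocation: lives until the
                                             program explicitly frees it       */
    counter = counter + local;
    /* ... use buf ... */
    free(buf);
}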
Access to non-local names

 When a procedure refers to variables that are not local to it, such variables are called non-local variables.
 There are two types of scope rules for non-local names: static scope and dynamic scope.

 Scoping is generally divided into two classes:

1. Static Scoping

2. Dynamic Scoping

 Static Scoping:

Static scoping is also called lexical scoping. In this scoping, a variable always refers to its top-level (textually enclosing) environment. This is a property of the program text and is unrelated to the run-time call stack. Static scoping also makes it much easier to write modular code, as the programmer can figure out the scope just by looking at the code. In contrast, dynamic scope requires the programmer to anticipate all possible dynamic contexts.
 In most of the programming languages including C, C++ and Java, variables are always
statically (or lexically) scoped i.e., binding of a variable can be determined by program
text and is independent of the run-time function call stack.
 For example, output for the below program is 10, i.e., the value returned by f() is not
dependent on who is calling it (Like g() calls it and has a x with value 20). f() always
returns the value of global variable x.
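The program referred to above is not reproduced in these notes; an illustrative C program matching that description is:

#include <stdio.h>

int x = 10;                 /* global x */

int f(void)
{
    return x;               /* static scoping: this is always the global x */
}

int g(void)
{
    int x = 20;             /* g's local x; it does not affect f()         */
    return f();
}

int main(void)
{
    printf("%d\n", g());    /* prints 10 */
    return 0;
}

Under dynamic scoping, the call to f() inside g() would instead find g's most recent binding of x, and the program would print 20.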

 Dynamic Scoping:

With dynamic scope, a global identifier refers to the identifier associated with the most recent environment; dynamic scoping is uncommon in modern languages. In technical terms, each identifier has a global stack of bindings, and an occurrence of an identifier is resolved to the most recent binding.
 In simpler terms, with dynamic scoping the search first considers the current block and then, successively, all the calling functions.
Parameter passing methods
 The communication medium among procedures is known as parameter passing. The
values of the variables from a calling procedure are transferred to the called procedure
by some mechanism. Before moving ahead, first go through some basic terminologies
pertaining to the values in a program.
 r-value :
The value of an expression is called its r-value. The value contained in a single variable
also becomes an r-value if it appears on the right-hand side of the assignment operator.
r-values can always be assigned to some other variable.
 l-value:
The location of memory (address) where an expression is stored is known as the l-value
of that expression. It always appears at the left hand side of an assignment operator.
 For example:
day = 1;
week = day * 7;
month = 1;
year = month * 12;

 From this example, we understand that constant values like 1, 7, 12, and variables like
day, week, month and year, all have r-values. Only variables have l-values as they also
represent the memory location assigned to them.
 For example:
7 = x + y;
 is an l-value error, as the constant 7 does not represent any memory location.

 Formal Parameters:
Variables that take the information passed by the caller procedure are called formal
parameters. These variables are declared in the definition of the called function.
 Actual Parameters:
Variables whose values or addresses are being passed to the called procedure are called
actual parameters. These variables are specified in the function call as arguments.
Example:
fun_one()
{
int actual_parameter = 10;
call fun_two(actual_parameter);
}
fun_two(int formal_parameter)
{
print formal_parameter;
}

 Formal parameters hold the information of the actual parameter, depending upon the
parameter passing technique used. It may be a value or an address.
 Pass by Value:
In pass by value mechanism, the calling procedure passes the r-value of actual
parameters and the compiler puts that into the called procedure’s activation record.
Formal parameters then hold the values passed by the calling procedure. If the values
held by the formal parameters are changed, it should have no impact on the actual
parameters.
 Pass by Reference:
In pass by reference mechanism, the l-value of the actual parameter is copied to the
activation record of the called procedure. This way, the called procedure now has the
address (memory location) of the actual parameter and the formal parameter refers to
the same memory location. Therefore, if the value pointed to by the formal parameter is changed, the effect is seen on the actual parameter, since both refer to the same value.
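A small C sketch of the difference (C simulates pass by reference with a pointer; the function names are illustrative):

#include <stdio.h>

void by_value(int v)      { v = 99; }    /* changes only the local copy        */
void by_reference(int *p) { *p = 99; }   /* changes the caller's variable      */

int main(void)
{
    int a = 10, b = 10;
    by_value(a);            /* a is still 10 */
    by_reference(&b);       /* b is now 99   */
    printf("%d %d\n", a, b);              /* prints: 10 99 */
    return 0;
}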
 Pass by Copy-restore:
This parameter-passing mechanism works similarly to pass-by-reference, except that the changes to the actual parameters are made only when the called procedure ends. Upon the function call, the values of the actual parameters are copied into the activation record of the called procedure. Manipulating the formal parameters therefore has no immediate effect on the actual parameters, but when the called procedure ends, the values of the formal parameters are copied back to the (l-values of the) actual parameters.
Example:
int y;
calling_procedure()
{
y = 10;
copy_restore(y);    // l-value of y is passed
print y;            // prints 99
}
copy_restore(int x)
{
x = 99;             // y still has value 10 (unaffected)
y = 0;              // y is now 0
}

 When this procedure ends, the value of the formal parameter x is copied back to the actual parameter y. Even though the value of y is changed before the procedure ends, the value of x overwrites it, making the mechanism behave like call by reference.
 Pass by Name:
Languages like ALGOL provide a parameter-passing mechanism that works like the preprocessor in the C language. Pass-by-name textually substitutes the argument expressions of a procedure call for the corresponding formal parameters in the body of the procedure, so that the procedure body now works directly on the actual parameters, much like pass by reference.
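Since the notes compare pass by name to the C preprocessor, a macro gives the flavour of the textual substitution involved (this is only an analogy, not real pass-by-name):

#include <stdio.h>

#define SQUARE(e) ((e) * (e))    /* the argument expression itself is substituted */

int main(void)
{
    int i = 3;
    int s = SQUARE(i + 1);       /* expands to ((i + 1) * (i + 1)) = 16 */
    printf("%d\n", s);
    return 0;
}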
Basic Blocks

Basic Block:

 A basic block contains a sequence of statements that are executed one after another. The flow of control enters at the beginning of the block and leaves at the end without halting or branching, except possibly at the last instruction of the block.

Basic block construction:


 Algorithm: Partition three-address statements into basic blocks.
 Input: A sequence of three-address statements.
 Output: A list of basic blocks, with each three-address statement in exactly one block.
 Method:
First identify the leaders in the code. The rules for finding leaders are as follows:
o The first statement is a leader.
o Statement L is a leader if it is the target of a conditional or unconditional goto statement such as:
if ... goto L or goto L.
o An instruction is a leader if it immediately follows a goto or conditional goto statement such as:
if ... goto B or goto B.
(OR)
o The first instruction in the intermediate code will always be a leader.
o The instructions that target a conditional or unconditional jump statement are
termed as a leader.
o Any instructions that are just after a conditional or unconditional jump
statement will be a leader.
 For each leader, its basic block consists of the leader and all statements up to, but not including, the next leader or the end of the program.
Example 1:
 The following sequence of three address statements forms a basic block:
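One such sequence (illustrative; it evaluates a*a + 2*a*b + b*b) is:
(1) t1 := a * a
(2) t2 := a * b
(3) t3 := 2 * t2
(4) t4 := t1 + t3
(5) t5 := b * b
(6) t6 := t4 + t5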

 The basic block B1 contains statements (1) to (6).

Example 2:
 Consider the following source code for computing the dot product of two vectors a and b of length 10:
 The three address code for the above source program is given below:
1. begin
2.     prod := 0;
3.     i := 1;
4.     do begin
5.         prod := prod + a[i] * b[i];
6.         i := i + 1;
7.     end
8.     while i <= 10
9. end
 (1) prod := 0
(2) i := 1
(3) t1 := 4*i
(4) t2 := a[t1]
(5) t3 := 4*i
(6) t4 := b[t3]
(7) t5 := t2*t4
(8) t6 := prod+t5
(9) prod := t6
(10) t7 := i+1
(11) i := t7
(12) if i <= 10 goto (3)
 Statement (1) is a leader (the first statement) and statement (3) is a leader (the target of the jump in statement (12)), so there are 2 basic blocks for the above code: B1 = statements (1)-(2) and B2 = statements (3)-(12).

 Example 3:
 Consider the following source code for converting a 10 x 10 matrix into an identity matrix.
 The three-address code (optimized code) for the above source program is given below:

 According to the given algorithm:


 Instruction 1 is a leader because it is the first instruction.
 Instruction 2 is also a leader because this instruction is the target for instruction 11.
 Instruction 3 is also a leader because this instruction is the target for instruction 9.
 Instruction 10 is also a leader because it immediately follows the conditional goto
statement.
 Similar to step 4, instruction 12 is also a leader.
 Instruction 13 is also a leader because this instruction is the target for instruction 17.
 So there are six basic blocks for the above code, which are given below:
 B1 for statement 1
 B2 for statement 2
 B3 for statement 3-9
 B4 for statement 10-11
 B5 for statement 12
 B6 for statement 13-17.
 Example 4:
 Three Address Code for the expression a = b + c + d is-
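(using the illustrative temporaries T1 and T2)
(1) T1 := b + c
(2) T2 := T1 + d
(3) a := T2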

 The basic block B1 contains statements (1) to (3).

 So there is one basic block for the above code.

 Example 5:

 The basic block B1 contains statements (1) to (4).

 The basic block B2 contains statements (5) to (6).
 The basic block B3 contains statements (7) to (8).
 The basic block B4 contains statement (9).
 So there are 4 basic blocks for the above code.
Flow Graph
 A flow graph is a directed graph. It contains the flow-of-control information for a set of basic blocks.
 A control flow graph is used to depict how program control is passed among the blocks. It is useful in loop optimization.
 Example 1:
 Flow graph for the vector dot product is given as follows:

 Block B1 is the initial node. Block B2 immediately follows B1, so there is an edge from B1 to B2.
 The target of the jump in the last statement of B2 is the first statement of B2, so there is also an edge from B2 to B2 itself.
 B2 is a successor of B1, and B1 is the predecessor of B2.
 Example 2:

Fig: Flow graph for converting the 10 x 10 matrix to an identity matrix.


Block B1 is the entry point of the flow graph because B1 contains the starting instruction.
B2 is the only successor of B1, because B1 does not end with an unconditional jump and the leader of B2 immediately follows the end of B1.
Block B3 has two successors. One is B3 itself, because the first instruction of B3 is the target of the conditional jump in the last instruction of B3. The other successor is block B4, which is reached when that conditional jump is not taken.
Block B6 is the exit point of the flow graph.

Transformations on Basic Blocks

 A basic block computes a set of expressions; these expressions are the values of the names live on exit from the block.
 A number of transformations can be applied to a basic block without changing the set
of expressions computed by the block.
 Many of these transformations are useful for improving the quality of code that will be
ultimately generated from a basic block.
 The primary structure-preserving transformations on basic blocks are:
1. Common sub-expression elimination
2. Dead-code elimination
3. Renaming of temporary variables
4. Interchange of two independent adjacent statements
5. Algebraic transformations

 Example: common sub expression elimination
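For instance (an illustrative sequence; a, b, c, d are program variables):
a := b + c
b := a - d
c := b + c
d := a - d
The fourth statement recomputes a - d, which is still available in b, so it can be replaced by d := b. (The b + c in the third statement is not a common sub-expression with the first, because b is redefined in between.)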


Example: Dead-code elimination
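For instance, in the illustrative pair
a := b + c
x := a + 1
if x is never used again in the block or after it, the second statement is dead code and can be removed without changing the values live on exit.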

Example: Renaming of temporary variables
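For instance, if t is a temporary, the statement t := b + c can be rewritten as u := b + c, where u is a new temporary, provided every later use of this instance of t is also changed to u; the block then computes the same set of expressions.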

Example: Interchange of two independent adjacent statements
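For instance, the adjacent statements
t1 := b + c
t2 := x + y
can be interchanged without affecting the value of the block, provided neither x nor y is t1 and neither b nor c is t2.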


Example: Algebraic Transformations

 Countless algebraic transformations can be used to change the set of expressions computed by a basic block into an algebraically equivalent set.
 The useful ones are those that simplify expressions or replace expensive operations by cheaper ones.
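For instance, statements such as x := x + 0 or x := x * 1 can be eliminated, and an expensive operation such as x := y ** 2 can be replaced by the cheaper x := y * y.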

Example 1

Example 2
Next-use Information
 Next-use information indicates the statement at which a variable defined at the current position will be used (reused) next.

 Knowing when the value of a variable will be used next is important for generating good
code.

 Suppose a statement I defines x.

 If a statement J uses x as an operand, and control can flow from I to J along a path on which x is not redefined, then J uses the value of x defined at I.

 In that case, x is live at statement I.
 Description: Next-use information is computed by scanning each basic block backwards, from its last statement to its first.
 Step 1: Initially, mark all non-temporary names as live on exit from the block and all temporaries as not live.
 Step 2: For each statement i: x := y op z reached during the backward scan, attach to statement i the liveness and next-use information currently recorded in the symbol table for x, y and z.
 Step 3: In the symbol table, set x to "not live" and "no next use".
 Step 4: In the symbol table, set y to "live" and record its next use as statement i.
 Step 5: In the symbol table, set z to "live" and record its next use as statement i. (Step 3 must be done before steps 4 and 5 so that a statement such as x := x op y is handled correctly.)
A simple Code generator
 A code generator generates target code for a sequence of three-address statements and effectively uses registers to store the operands of the statements.
 The code generation algorithm uses the following data structures: register descriptors and address descriptors.
 Register descriptor: Register descriptor is used to inform the code generator about the
availability of registers. Register descriptor keeps track of values stored in each register.
Whenever a new register is required during code generation, this descriptor is consulted
for register availability.
 Address descriptor: Values of the names (identifiers) used in the program might be
stored at different locations while in execution. Address descriptors are used to keep
track of memory locations where the values of identifiers are stored. These locations
may include CPU registers, heaps, stacks, memory or a combination of the mentioned
locations.
 A Code Generation Algorithm:
 In general, a code-generation algorithm takes a sequence of three-address statements as input.
 The code generator keeps both descriptors updated as code is generated. For a load statement LD R1, x the code generator:
 updates the register descriptor of R1 to record that R1 holds the value of x, and
 updates the address descriptor of x to show that one copy of x is in R1.
 Note: If the value of a name is found in more than one place (register, cache, or memory), the register's value is preferred over the cache and main memory; likewise, the cache's value is preferred over main memory. Main memory is given the least preference.
 getReg(): Code generator uses getReg function to determine the status of
available registers and the location of name values.
 getReg() works as follows:
o If variable Y is already in register R, it uses that register.
o Else, if some register R is available, it uses that register.
o Else, if neither of the above is possible, it chooses a register that requires the minimal number of load and store instructions.

Example :
 For an instruction x = y OP z, the code generator may perform the following actions. Let us assume that L is the location (preferably a register) where the output of y OP z is to be saved:
1) Call function getReg(), to decide the location of L.

2) Determine the present location (register or memory) of y by consulting the Address Descriptor of
y. If y is not presently in register L, then generate the following instruction to copy the value of y
to L: 
MOV y’, L
where y’ represents the copied value of y.
3) Determine the present location of z using the same method used in step 2 for y and
generate the following instruction: 
OP z’, L
where z’ represents the copied value of z.
4) Now L contains the value of y OP z, that is intended to be assigned to x. So, if L is a register,
update its descriptor to indicate that it contains the value of x. Update the descriptor of x to
indicate that it is stored at location L.
5) If y and z have no further use, then update the descriptors to remove y and z, i.e., their locations can be given back to the system.
Other code constructs, such as loops and conditional statements, are translated into assembly language in the usual way.
 Example : Generating Code for Assignment Statements:
 The assignment statement d:= (a-b) + (a-c) + (a-c) can be translated into the following
sequence of three address code:
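(using the illustrative temporaries t, u and v)
t := a - b
u := a - c
v := t + u
d := v + u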

 Code sequence for the example is as follows:
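One possible code sequence, using two registers R0 and R1 (and assuming a, b, c and d start in memory), is:
MOV a, R0
SUB b, R0        ; R0 now holds t = a - b
MOV a, R1
SUB c, R1        ; R1 now holds u = a - c
ADD R1, R0       ; R0 now holds v = t + u
ADD R1, R0       ; R0 now holds v + u
MOV R0, d        ; total cost = 12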


DAG representation of Basic Blocks
 DAGs are a type of data structure used to implement transformations on basic blocks.
 A DAG provides a good way to determine common sub-expressions.
 DAG (Directed Acyclic Graph) representation is used to represent the structure of basic blocks. A basic block is a set of statements that execute one after another in sequence. The DAG shows the flow of values within a basic block and supports basic-block optimization.
 It gives a picture representation of how the value computed by the statement is used in
subsequent statements

Rule-01:
In a DAG,

 Interior nodes always represent the operators.


 Exterior nodes (leaf nodes) always represent the names, identifiers or constants.
 Interior nodes also represent the results of expressions.
Rule-02:

While constructing a DAG,

 A check is made to find if there exists any node with the same value.
 A new node is created only when there does not exist any node with the same value.
 This action helps in detecting the common sub-expressions and avoiding the re-
computation of the same.
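A minimal C sketch (illustrative only, not from the notes) of this check, which returns an existing node for (op, left, right) when one is already present and creates a new node otherwise:

#include <stdlib.h>

typedef struct Node {
    char op;                      /* operator, or '\0' for a leaf              */
    const char *name;             /* identifier or constant label for leaves   */
    struct Node *left, *right;    /* operand nodes (NULL for leaves)           */
} Node;

static Node *nodes[256];          /* every node created so far                 */
static int   nnodes = 0;

Node *get_node(char op, Node *left, Node *right)
{
    for (int i = 0; i < nnodes; i++)
        if (nodes[i]->op == op && nodes[i]->left == left && nodes[i]->right == right)
            return nodes[i];      /* same value already exists: reuse the node */

    Node *n = malloc(sizeof *n);  /* no matching node: create a new one        */
    n->op = op;
    n->name = NULL;
    n->left = left;
    n->right = right;
    nodes[nnodes++] = n;
    return n;
}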

Rule-03:

 The assignment instructions of the form x:=y are not performed unless they are
necessary.
 Example:
 Consider the following three address statement:

 Stages in DAG Construction:


Example :
T0 = a + b    ---Expression 1
T1 = T0 + c   ---Expression 2
d = T0 + T1   ---Expression 3

Expression 1: T0 = a + b

Expression 2: T1 = T0 + c
Expression 3: d = T0 + T1

Final Directed acyclic graph

Example :
T1 = a + b
T2 = T1 + c
T3 = T1 x T2
Example :
T1 = a + b
T2 = a - b
T3 = T1 * T2
T4 = T1 - T3
T5 = T4 + T3

Final Directed acyclic graph


Example :
a = b x c
d = b
e = d x c
b = e
f = b + c
g = f + d

Final Directed acyclic graph


Example :
T1 := 4*I0
T2 := a[T1]
T3 := 4*I0
T4 := b[T3]
T5 := T2 * T4
T6 := prod + T5
prod := T6
T7 := I0 + 1
I0 := T7
if I0 <= 20 goto 1

Final Directed acyclic graph

Application of Directed Acyclic Graph:


 The directed acyclic graph determines the sub-expressions that are commonly used (common sub-expressions).
 It determines the names used within the block whose values are computed outside the block.
 It determines which statements in the block may have their computed value used outside the block.
 Code can be represented by a Directed acyclic graph that describes the inputs and
outputs of each of the arithmetic operations performed within the code; this
representation allows the compiler to perform common subexpression elimination
efficiently.
 Several programming languages describe value systems that are linked together by
a directed acyclic graph. When one value changes, its successors are recalculated;
each value in the DAG is evaluated as a function of its predecessors.
