Unit-V Control/Data Flow Analysis
We can add flow-of-control information to the set of basic blocks making up a program by
constructing a directed graph called a flow graph. The nodes of a flow graph are the basic blocks.
One node is distinguished as initial; it is the block whose leader is the first statement. There is a
directed edge from block B1 to block B2 if B2 can immediately follow B1 in some execution
sequence; that is, if
- There is a conditional or unconditional jump from the last statement of B1 to the first
statement of B2, or
- B2 immediately follows B1 in the order of the program, and B1 does not end in an
unconditional jump.
We say that B1 is a predecessor of B2, and B2 is a successor of B1.
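The two edge rules above can be sketched in Python. This is only a minimal illustration: the block representation (a name plus a tag describing the block's last statement) and the function name build_flow_graph are assumptions made for this sketch, not part of the notes.

```python
def build_flow_graph(blocks):
    """Return (succ, pred) maps for basic blocks given in program order.

    Each block is a dict with a 'name' and a 'last' entry describing its
    final statement: ('goto', target), ('if', target) or ('other',).
    This representation is hypothetical, chosen for the sketch.
    """
    succ = {b["name"]: [] for b in blocks}
    for i, b in enumerate(blocks):
        kind = b["last"][0]
        # Rule 1: a conditional or unconditional jump to the target block.
        if kind in ("goto", "if"):
            succ[b["name"]].append(b["last"][1])
        # Rule 2: fall through, unless the block ends in an unconditional jump.
        if kind != "goto" and i + 1 < len(blocks):
            nxt = blocks[i + 1]["name"]
            if nxt not in succ[b["name"]]:
                succ[b["name"]].append(nxt)
    pred = {name: [] for name in succ}
    for name, targets in succ.items():
        for t in targets:
            pred[t].append(name)
    return succ, pred


example_blocks = [
    {"name": "B1", "last": ("other",)},
    {"name": "B2", "last": ("if", "B4")},
    {"name": "B3", "last": ("goto", "B2")},
    {"name": "B4", "last": ("other",)},
]
succ, pred = build_flow_graph(example_blocks)
print(succ)
print(pred)
```

Here B2 gets two successors (the jump target B4 and the fall-through block B3), while B3 ends in an unconditional jump and therefore has no fall-through edge to B4.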
We wish to determine, for each three-address statement x := y op z, what the next uses of
x, y and z are. We collect next-use information about names in basic blocks. If the name in a
register is no longer needed, then the register can be assigned to some other name. This idea of
keeping a name in storage only if it will be used subsequently can be applied in a number of
contexts. It is used, for example, to assign space for attribute values.
The simple code generator applies it to register assignment. Our algorithm to determine
next uses makes a backward pass over each basic block, recording (in the symbol table) for each
name x whether x has a next use in the block and, if not, whether it is live on exit from that block.
We can assume that all non-temporary variables are live on exit and all temporary variables are
dead on exit. Suppose the backward scan reaches three-address statement i: x := y op z. We then
do the following:
1. Attach to statement i the information currently found in the symbol table regarding the
next use and liveness of x, y and z.
2. In the symbol table, set x to "not live" and "no next use".
3. In the symbol table, set y and z to "live" and the next uses of y and z to i. Note that the
order of steps (2) and (3) may not be interchanged because x may be y or z.
If three-address statement i is of the form x := y or x := op y, the steps are the same as above,
ignoring z. Consider the example below:
1: t1 = a * a
2: t2 = a * b
3: t3 = 2 * t2
4: t4 = t1 + t3
5: t5 = b * b
6: t6 = t4 + t5
7: X = t6
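The backward pass can be sketched in Python and applied to the seven-statement block above. This is an illustrative sketch, not the notes' own code: the tuple encoding of statements, the function name next_use_info, and the use of a dictionary in place of the symbol table are assumptions.

```python
def next_use_info(block, live_on_exit):
    """Backward scan over one basic block.

    block: list of (x, y, z) for x := y op z; z is None for x := y.
    Constants are given as ints and are ignored.
    Returns {statement number: {name: (live, next_use)}} with 1-based numbers.
    """
    def is_name(operand):
        return isinstance(operand, str)

    names = {n for stmt in block for n in stmt if is_name(n)}
    # The dictionary plays the role of the symbol table: name -> (live, next use).
    table = {n: (n in live_on_exit, None) for n in names}
    info = {}
    for i in range(len(block), 0, -1):
        x, y, z = block[i - 1]
        operands = [n for n in (y, z) if is_name(n)]
        # Step 1: attach the current table information to statement i.
        info[i] = {n: table[n] for n in [x] + operands}
        # Step 2: x is not live and has no next use above this point.
        table[x] = (False, None)
        # Step 3: y and z are live and next used at i (after step 2, as x may be y or z).
        for n in operands:
            table[n] = (True, i)
    return info


block = [
    ("t1", "a", "a"), ("t2", "a", "b"), ("t3", 2, "t2"), ("t4", "t1", "t3"),
    ("t5", "b", "b"), ("t6", "t4", "t5"), ("X", "t6", None),
]
# Non-temporaries are assumed live on exit, temporaries dead on exit.
info = next_use_info(block, live_on_exit={"a", "b", "X"})
for i in sorted(info):
    print(i, info[i])
```

For instance, the information attached to statement 1 records that t1 is live with next use at statement 4, and the information attached to statement 4 records that t1 and t3 have no further use.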
We can allocate storage locations for temporaries by examining each in turn and
assigning a temporary to the first location in the field for temporaries that does not contain a live
temporary. If a temporary cannot be assigned to any previously created location, add a new
location to the data area for the current procedure. In many cases, temporaries can be packed into
registers rather than memory locations, as in the next section.
Example.
The six temporaries in the basic block can be packed into two locations. These locations
correspond to t1 and t2 in:
6: t1 = t1 + t2
7: X = t1
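The first-fit packing described above can be sketched as follows. The sketch assumes that each temporary is defined exactly once, that temporaries are dead on exit from the block, and that a location may be reused by the result of the statement in which its occupant is last read. The function name pack_temporaries and the tuple encoding are hypothetical.

```python
def pack_temporaries(block, is_temp):
    """Assign each temporary to the first location not holding a live temporary.

    block: list of (dest, left, right); right may be None, constants are ints.
    Returns (location map, number of locations used).
    """
    # Index of the last statement that reads each temporary.
    last_use = {}
    for i, (dest, left, right) in enumerate(block):
        for operand in (left, right):
            if isinstance(operand, str) and is_temp(operand):
                last_use[operand] = i

    location = {}   # temporary -> index of its storage location
    slots = []      # slots[k] = temporary currently held in location k, or None
    for i, (dest, left, right) in enumerate(block):
        # A temporary whose last use is this statement (or earlier) is dead
        # once the operands are read, so its location can be reused for dest.
        for k, occupant in enumerate(slots):
            if occupant is not None and last_use.get(occupant, -1) <= i:
                slots[k] = None
        if is_temp(dest):
            if None in slots:
                k = slots.index(None)
            else:
                slots.append(None)
                k = len(slots) - 1
            slots[k] = dest
            location[dest] = k
    return location, len(slots)


block = [
    ("t1", "a", "a"), ("t2", "a", "b"), ("t3", 2, "t2"), ("t4", "t1", "t3"),
    ("t5", "b", "b"), ("t6", "t4", "t5"), ("X", "t6", None),
]
location, count = pack_temporaries(block, is_temp=lambda n: n.startswith("t"))
print(location, count)
```

On the seven-statement block from the next-use example, t1, t4 and t6 share location 0 while t2, t3 and t5 share location 1, so two locations suffice.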
Data-flow analysis is needed for global code optimization, e.g.: Is a variable live on exit from a block?
Does a definition reach a certain point in the code? Data-flow equations are used to collect
data-flow information. A typical data-flow equation has the form
$\mathrm{out}[S] = \mathrm{gen}[S] \cup (\mathrm{in}[S] - \mathrm{kill}[S])$
The notions of generation and killing depend on the data-flow analysis problem to be
solved. Let's first consider reaching-definitions analysis for structured programs. A definition of
a variable x is a statement that assigns or may assign a value to x. An assignment to x is an
unambiguous definition of x. An ambiguous definition of x can be an assignment through a pointer or
a function call where x is passed by reference. When x is defined, we say the definition is
generated. An unambiguous definition of x kills all other definitions of x. When all definitions of
x that reach a certain point are the same, we can use this information to do some optimizations. Example:
all definitions of x define x to be 1. Then constant propagation lets us replace x by 1, so a use
such as z = x*y simplifies to z = y.
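To make the equation concrete, here is a small sketch that solves it for reaching definitions. Note that it uses the iterative formulation over a flow graph (with in[B] taken as the union of out[P] over the predecessors P of B) rather than the syntax-directed treatment of structured programs. The three-block example, the definition labels d1 to d4, and the function name are hypothetical.

```python
def reaching_definitions(blocks, pred, gen, kill):
    """Iterate out[B] = gen[B] | (in[B] - kill[B]) to a fixed point."""
    in_ = {b: set() for b in blocks}
    out = {b: set(gen[b]) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # A definition reaches the start of B if it reaches the end
            # of any predecessor of B.
            in_[b] = set().union(*(out[p] for p in pred[b]))
            new_out = gen[b] | (in_[b] - kill[b])
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return in_, out


# Hypothetical program: B1: d1: x = 1; d2: y = 2   B2: d3: x = 3   B3: d4: y = x
# Edges: B1 -> B2, B1 -> B3, B2 -> B3.
blocks = ["B1", "B2", "B3"]
pred = {"B1": [], "B2": ["B1"], "B3": ["B1", "B2"]}
gen = {"B1": {"d1", "d2"}, "B2": {"d3"}, "B3": {"d4"}}
kill = {"B1": {"d3", "d4"}, "B2": {"d1"}, "B3": {"d2"}}
rd_in, rd_out = reaching_definitions(blocks, pred, gen, kill)
print(rd_in["B3"], rd_out["B3"])
```

Both d1 and d3 reach the start of B3, so x is not known to be a single constant there and the use of x in d4 cannot be replaced.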
First, divide the code above into basic blocks. Now calculate the available expressions
for each block. Then find an expression available in a block and perform step 2c above.
What common subexpression can you share between the two blocks? What if the above
code were:
main:
BeginFunc 28;
b=a+2;
c=4*b;
tmp1 = b < c ;
IfNZ tmp1 Goto L1 ;
b=1;
z = a + 2 ; <========= an additional line here
L1:
d=a+2;
EndFunc ;
main()
{
int x, y, z;
x = (1+20)* -x;
y = x*x+(x/y);
y = z = (x/y)/(x*x);
}
Straight translation:
tmp1 = 1 + 20 ;
tmp2 = -x ;
x = tmp1 * tmp2 ;
tmp3 = x * x ;
tmp4 = x / y ;
y = tmp3 + tmp4 ;
tmp5 = x / y ;
tmp6 = x * x ;
z = tmp5 / tmp6 ;
y=z;
What sub-expressions can be eliminated? How can valid common sub-expressions (live ones) be
determined? Here is an optimized version, after constant folding and propagation and elimination
of common sub-expressions:
tmp2 = -x ;
x = 21 * tmp2 ;
tmp3 = x * x ;
tmp4 = x / y ;
y = tmp3 + tmp4 ;
tmp5 = x / y ;
z = tmp5 / tmp3 ;
y=z;
def[B] is the set of variables assigned values in B prior to any use of that variable in B. use[B]
is the set of variables whose values may be used in B prior to any definition of the variable.
A variable comes live into a block (is in in[B]) if it is either used before redefinition in the
block, or it is live coming out of the block and is not redefined in the block. A variable comes
live out of a block (is in out[B]) if and only if it is live coming into one of its successors:
$\mathrm{in}[B] = \mathrm{use}[B] \cup (\mathrm{out}[B] - \mathrm{def}[B])$
$\mathrm{out}[B] = \bigcup_{S \in \mathrm{succ}[B]} \mathrm{in}[S]$
Note the relation to the reaching-definitions equations: the roles of in and out are interchanged,
and use and def take the place of gen and kill.
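The liveness equations can be solved by iterating until nothing changes. The sketch below is a minimal illustration under assumed inputs: the three-block flow graph (with a loop on B2), the use/def sets, and the function name live_variables are hypothetical.

```python
def live_variables(blocks, succ, use, defs):
    """Solve in[B] = use[B] | (out[B] - def[B]), out[B] = union of in[S]."""
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        # Information flows backward, so visiting blocks in reverse order
        # usually converges in fewer passes.
        for b in reversed(blocks):
            new_out = set().union(*(live_in[s] for s in succ[b]))
            new_in = use[b] | (new_out - defs[b])
            if new_in != live_in[b] or new_out != live_out[b]:
                live_in[b], live_out[b] = new_in, new_out
                changed = True
    return live_in, live_out


# Hypothetical blocks: B1: a = 1; b = 2   B2: c = a + b   B3: d = c + a
# Edges: B1 -> B2, B2 -> B2 (loop), B2 -> B3.
blocks = ["B1", "B2", "B3"]
succ = {"B1": ["B2"], "B2": ["B2", "B3"], "B3": []}
use = {"B1": set(), "B2": {"a", "b"}, "B3": {"a", "c"}}
defs = {"B1": {"a", "b"}, "B2": {"c"}, "B3": {"d"}}
live_in, live_out = live_variables(blocks, succ, use, defs)
print(live_in)
print(live_out)
```

Because of the loop edge, b stays live on exit from B2 even though B3 never reads it.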
Copy Propagation
This optimization is similar to constant propagation, but generalized to non-constant
values. If we have an assignment a = b in our instruction stream, we can replace later
occurrences of a with b (assuming there are no changes to either variable in-between). Given the
way we generate TAC code, this is a particularly valuable optimization since it is able to
eliminate a large number of instructions that only serve to copy values from one variable to
another. The first listing below makes a copy of tmp1 in tmp2 and a copy of tmp3 in tmp4. In the
optimized version that follows it, we eliminated those unnecessary copies and propagated the
original variable into the later uses:
Before:
tmp2 = tmp1 ;
tmp3 = tmp2 * tmp1 ;
tmp4 = tmp3 ;
tmp5 = tmp3 * tmp2 ;
c = tmp5 + tmp4 ;
After:
tmp3 = tmp1 * tmp1 ;
tmp5 = tmp3 * tmp1 ;
c = tmp5 + tmp3 ;
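A local (single basic block) version of this optimization can be sketched in two passes: a forward pass that substitutes the source of each copy into later uses, and a backward pass that drops copies whose destination is never read again. The tuple encoding, the "copy" tag, the function name, and the assumption that only c is live on exit are all choices made for this sketch.

```python
def propagate_copies(block, live_on_exit):
    """block: list of (dest, op, args); op == 'copy' marks dest = args[0]."""
    copies = {}       # dest -> source, for copies that are still valid
    rewritten = []
    for dest, op, args in block:
        # Replace each operand by the variable it is a copy of, if any.
        args = tuple(copies.get(a, a) for a in args)
        # dest is redefined here, so forget every copy that mentions it.
        copies = {d: s for d, s in copies.items() if d != dest and s != dest}
        if op == "copy" and args[0] != dest:
            copies[dest] = args[0]
        rewritten.append((dest, op, args))

    # Backward pass: remove copies whose destination is not needed later.
    needed = set(live_on_exit)
    result = []
    for dest, op, args in reversed(rewritten):
        if op == "copy" and dest not in needed:
            continue
        needed.discard(dest)
        needed.update(args)
        result.append((dest, op, args))
    result.reverse()
    return result


block = [
    ("tmp2", "copy", ("tmp1",)),
    ("tmp3", "*", ("tmp2", "tmp1")),
    ("tmp4", "copy", ("tmp3",)),
    ("tmp5", "*", ("tmp3", "tmp2")),
    ("c", "+", ("tmp5", "tmp4")),
]
optimized = propagate_copies(block, live_on_exit={"c"})
for stmt in optimized:
    print(stmt)
```

Applied to the five-statement listing above, the sketch yields the three-statement optimized version.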
We can also drive this optimization "backwards", where we can recognize that the original
assignment made to a temporary can be eliminated in favor of direct assignment to the final goal:
Before:
tmp1 = LCall _Binky ;
a = tmp1 ;
tmp2 = LCall _Winky ;
b = tmp2 ;
tmp3 = a * b ;
c = tmp3 ;
After:
a = LCall _Binky ;
b = LCall _Winky ;
c = a * b ;
In final code generation, there is a lot of opportunity for cleverness in generating efficient
target code. In this pass, specific machine features (specialized instructions, hardware pipeline
abilities, register details) are taken into account to produce code optimized for the particular
target architecture.
Register Allocation
i = 10;
j = 20;
x = i + j;
y = j + k;
We say that i interferes with j because at least one pair of i's definitions and uses is
separated by a definition or use of j; thus, i and j are "alive" at the same time. A variable is alive
between the time it has been defined and that definition's last use, after which the variable is
dead. If two variables interfere, then we cannot use the same register for both. But two variables
that don't interfere can occupy the same register, since there is no overlap in their liveness.
Once we have the interference graph constructed, we r-color it so that no two adjacent nodes
share the same color (r is the number of registers we have; each color represents a different
register). You may recall that graph coloring is NP-complete, so we employ a heuristic rather
than an optimal algorithm. Here is a simplified version of something that might be used:
1. Find the node with the fewest neighbors. (Break ties arbitrarily.)
2. Remove it from the interference graph and push it onto a stack.
3. Repeat steps 1 and 2 until the graph is empty.
4. Now, rebuild the graph as follows:
a. Take the top node off the stack and reinsert it into the graph.
b. Choose a color for it that differs from the colors of all its neighbors presently in the
graph, rotating colors in case there is more than one choice.
c. Repeat a and b until the graph is either completely rebuilt, or there is no color
available to color the node.
If we get stuck, then the graph may not be r-colorable; we could try again with a different
heuristic, say reusing colors as often as possible. If there is no other choice, we have to spill a
variable to memory.
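The heuristic can be sketched as follows. Two simplifications are assumed: ties are broken by node name to keep the result deterministic, and the lowest available color is always chosen instead of rotating colors (this is the "reuse colors as often as possible" variant). The interference graphs and the function name color_graph are hypothetical.

```python
def color_graph(graph, r):
    """graph: {node: set of neighbors}; r: number of registers (colors).

    Returns (coloring, stuck_node). stuck_node is None when every node was
    colored; otherwise it is the node for which no color was available.
    """
    work = {n: set(adj) for n, adj in graph.items()}
    stack = []
    # Steps 1 to 3: repeatedly remove the node with the fewest neighbors.
    while work:
        node = min(work, key=lambda n: (len(work[n]), n))
        for m in work[node]:
            work[m].discard(node)
        del work[node]
        stack.append(node)
    # Step 4: reinsert nodes in reverse order, coloring each as it returns.
    color = {}
    while stack:
        node = stack.pop()
        used = {color[m] for m in graph[node] if m in color}
        free = [c for c in range(r) if c not in used]
        if not free:
            return color, node      # stuck: a candidate for spilling
        color[node] = free[0]
    return color, None


# A chain i - j - k plus an isolated node x needs only two registers.
chain = {"i": {"j"}, "j": {"i", "k"}, "k": {"j"}, "x": set()}
coloring, stuck = color_graph(chain, r=2)
print(coloring, stuck)

# A triangle cannot be colored with two registers.
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
partial, stuck_tri = color_graph(triangle, r=2)
print(partial, stuck_tri)
```

When the function reports a stuck node, a real allocator would spill some variable to memory and retry; the sketch simply returns the node.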
Instruction Scheduling:
Another extremely important optimization of the final code generator is instruction
scheduling. Because many machines, including most RISC architectures, have some sort of
pipelining capability, effectively harnessing that capability requires judicious ordering of
instructions. In MIPS, each instruction is issued in one cycle, but some take multiple cycles to
complete. It takes an additional cycle before the value of a load is available and two cycles for a
branch to reach its destination, but an instruction can be placed in the "delay slot" after a branch
and executed in that slack time. The first listing below is one arrangement of a set of instructions
that requires 7 cycles. It assumes no hardware interlock and thus explicitly stalls between the
second and third slots while the load completes, and it has a dead cycle after the branch because
the delay slot holds a noop. The second listing is a more favorable rearrangement of the same
instructions that executes in 5 cycles with no dead cycles.
Original order (7 cycles):
lw $t2, 4($fp)
lw $t3, 8($fp)
noop
add $t4, $t2, $t3
subi $t5, $t5, 1
goto L1
noop
Rescheduled order (5 cycles):
lw $t2, 4($fp)
lw $t3, 8($fp)
subi $t5, $t5, 1
goto L1
add $t4, $t2, $t3
CODE GENERATION:
The code generator generates target code for a sequence of three-address statements. It
considers each statement in turn, remembering if any of the operands of the statement are
currently in registers, and taking advantage of that fact, if possible. The code-generation algorithm
uses descriptors to keep track of register contents and addresses for names.
1. A register descriptor keeps track of what is currently in each register. It is consulted whenever
a new register is needed. We assume that initially the register descriptor shows that all registers
are empty. (If registers are assigned across blocks, this would not be the case). As the code
generation for the block progresses, each register will hold the value of zero or more names at
any given time.
2. An address descriptor keeps track of the location (or locations) where the current value of the
name can be found at run time. The location might be a register, a stack location, a memory
address, or some set of these, since when copied, a value also stays where it was. This
information can be stored in the symbol table and is used to determine the accessing method for
a name.
For each X = Y op Z do:
- Generate
MOV Y', L
OP Z', L
where Y' and Z' are current locations of Y and Z. Again prefer a register for Z'. Update the
address descriptor of X to indicate that X is in L. If L is a register, update its descriptor to
indicate that it contains X and remove X from all other register descriptors.
- If the current values of Y and/or Z have no next use, are dead on exit from the block, and are in
registers, change the register descriptors to indicate that they no longer contain Y and/or Z.
The code generation algorithm takes as input a sequence of three-address statements constituting
a basic block. For each three-address statement of the form x := y op z we perform the following
actions:
1. Invoke a function getreg to determine the location L where the result of the
computation y op z should be stored. L will usually be a register, but it could also be a
memory location. We shall describe getreg shortly.
2. Consult the address descriptor for y to determine y', (one of) the current location(s) of
y. Prefer the register for y' if the value of y is currently both in memory and a register. If
the value of y is not already in L, generate the instruction MOV y', L to place a copy of y
in L.
3. Generate the instruction OP z', L where z' is a current location of z. Again, prefer a
register to a memory location if z is in both. Update the address descriptor to indicate that
x is in location L. If L is a register, update its descriptor to indicate that it contains the
value of x, and remove x from all other register descriptors.
4. If the current values of y and/or z have no next uses, are not live on exit from the
block, and are in registers, alter the register descriptor to indicate that, after execution of
x := y op z, those registers no longer will contain y and/or z, respectively.
FUNCTION getreg:
1. If Y is in a register (that holds no other values) and Y is not live and has no next use after
X = Y op Z, then return the register of Y for L.
2. Failing (1), return an empty register.
3. Failing (2), if X has a next use in the block or op requires a register, then get a register R,
store its content into M (by MOV R, M) and use it.
4. Else select memory location X as L.
The function getreg returns the location L to hold the value of x for the assignment x := y op z.
1. If the name y is in a register that holds the value of no other names (recall that copy
instructions such as x := y could cause a register to hold the value of two or more variables
simultaneously), and y is not live and has no next use after execution of x := y op z, then return
the register of y for L. Update the address descriptor of y to indicate that y is no longer in L.
2. Failing (1), return an empty register for L if there is one.
3. Failing (2), if x has a next use in the block, or op is an operator, such as indexing, that requires
a register, find an occupied register R. Store the value of R into a memory location (by MOV R,
M) if it is not already in the proper memory location M, update the address descriptor for M, and
return R. If R holds the value of several variables, a MOV instruction must be generated for each
variable that needs to be stored. A suitable occupied register might be one whose datum is
referenced furthest in the future, or one whose value is also in memory.
4. If x is not used in the block, or no suitable occupied register can be found, select the memory
location of x as L.
Example:
Statement        Code generated    Register descriptor    Address descriptor
t2 = a - c
t3 = t1 + t2
d = t3 + t2
The code generation algorithm that we discussed would produce the code sequence as shown.
Shown alongside are the values of the register and address descriptors as code generation
progresses.
A DAG for a basic block is a directed acyclic graph with the following labels on nodes:
1. Leaves are labeled by unique identifiers, either variable names or constants. From the
operator applied to a name we determine whether the l-value or r-value of a name is needed;
most leaves represent r-values. The leaves represent initial values of names, and we subscript
them with 0 to avoid confusion with labels denoting "current" values of names as in (3) below.
2. Interior nodes are labeled by an operator symbol.
3. Nodes are also optionally given a sequence of identifiers for labels. The intention is
that interior nodes represent computed values, and the identifiers labeling a node are deemed to
have that value.
For example, the slide shows a block of three-address code. The corresponding DAG is shown. We
observe that each node of the DAG represents a formula in terms of the leaves, that is, the values
possessed by variables and constants upon entering the block. For example, the node labeled t4
represents the formula
b[4 * i]
that is, the value of the word whose address is 4*i bytes offset from address b, which is the
intended value of t4.
Original code                    Optimized code
S1 = 4 * i                       S1 = 4 * i
S2 = addr(A) - 4                 S2 = addr(A) - 4
S3 = S2[S1]                      S3 = S2[S1]
S4 = 4 * i
S5 = addr(B) - 4                 S5 = addr(B) - 4
S6 = S5[S4]                      S6 = S5[S1]
S7 = S3 * S6                     S7 = S3 * S6
S8 = prod + S7
prod = S8                        prod = prod + S7
S9 = I + 1
I = S9                           I = I + 1
If I <= 20 goto (1)              If I <= 20 goto (1)
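The way a DAG exposes the repeated computation of 4 * i can be sketched with a small node table keyed on (operator, left child, right child). This is a simplified illustration applied to the first seven statements of the original code above: it ignores the complications of array assignments and pointers, and the tuple encoding, the "[]" operator tag, and the function name build_dag are assumptions.

```python
def build_dag(block):
    """block: list of (dest, op, a, b); b is None for unary operators.

    Returns (nodes, labels, current): the node list, the identifiers
    attached to each node, and the node holding each name's current value.
    """
    nodes = []      # nodes[i] = (op, left id, right id) or ("leaf", value, None)
    index = {}      # node key -> node id, so identical nodes are shared
    labels = {}     # node id -> identifiers currently labeling that node
    current = {}    # name -> id of the node holding its current value

    def node_for(operand):
        if operand in current:
            return current[operand]
        key = ("leaf", operand, None)   # initial value of a name, or a constant
        if key not in index:
            index[key] = len(nodes)
            nodes.append(key)
        return index[key]

    for dest, op, a, b in block:
        key = (op, node_for(a), node_for(b) if b is not None else None)
        if key not in index:            # create the node only if it is new
            index[key] = len(nodes)
            nodes.append(key)
        n = index[key]
        for ids in labels.values():     # dest no longer names its old node
            if dest in ids:
                ids.remove(dest)
        labels.setdefault(n, []).append(dest)
        current[dest] = n
    return nodes, labels, current


block = [
    ("S1", "*", 4, "i"), ("S2", "-", "addr(A)", 4), ("S3", "[]", "S2", "S1"),
    ("S4", "*", 4, "i"), ("S5", "-", "addr(B)", 4), ("S6", "[]", "S5", "S4"),
    ("S7", "*", "S3", "S6"),
]
nodes, labels, current = build_dag(block)
print(len(nodes), labels[current["S1"]])
```

S1 and S4 end up labeling the same node, which is why the second computation of 4 * i can be dropped when code is regenerated from the DAG.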
We see how to generate code for a basic block from its DAG representation. The
advantage of doing so is that from a DAG we can more easily see how to rearrange the order of
the final computation sequence than we can starting from a linear sequence of three-address
statements or quadruples. If the DAG is a tree, we can generate code that we can prove is optimal
under such criteria as program length or the fewest number of temporaries used. The algorithm
for optimal code generation from a tree is also useful when the intermediate code is a parse tree.
t1 = a + b
t2 = c + d
t3 = e - t2
X = t1 - t3
Rearranging order
If we generate code for the three-address statements using the code generation algorithm
described before, we get the code sequence as shown (assuming two registers R0 and R1 are
available, and only X is live on exit). On the other hand, suppose we rearranged the order of the
statements so that the computation of t1 occurs immediately before that of X as:
t2 = c + d
t3 = e - t2
t1 = a + b
X = t1 - t3
Then, using the code generation algorithm, we get the new code sequence as shown (again only
R0 and R1 are available). By performing the computation in this order, we have been able to
save two instructions: MOV R0, t1 (which stores the value of R0 in memory location t1) and
MOV t1, R1 (which reloads the value of t1 in the register R1).
IMPORTANT & EXPECTED QUESTIONS:
1. What is object code? Explain the following object code forms:
(a) Absolute machine language
(b) Relocatable machine language
(c) Assembly language
2. Explain the generic code generation algorithm.
3. Write about and explain the object code forms.
4. Explain peephole optimization.
ASSIGNMENT QUESTIONS: