Dataflow Handout
Compiler Design
CSE 504
Preliminaries
Program Analysis
The compiler needs to understand properties of a program (e.g. the set of variables live at a program point). This information should be computed at compile time, with incomplete information on the values the program computes, and without executing the program itself! This information is likely to be approximate: in general, at compile time, we will not know which sequence of instructions will be executed. Data-Flow Analysis is a standard way to formulate intra-procedural program analysis.
Compiler Design
Data-Flow Analysis
CSE 504
Example of CFGs
B1:    1. i = 1
B2:    2. j = 1
B3:    3. t1 = 10 * i
       4. t2 = t1 + j
       5. t3 = 4 * t2
       6. a[t3] = 0
       7. j = j + 1
       8. if j < 10 goto (3)
B4:    9. i = i + 1
      10. if i < 10 goto (2)
B5:   11. i = 1
B6:   12. t4 = 10 * i
      13. t5 = t4 + i
      14. a[t5] = 1
      15. i = i + 1
      16. if i < 10 goto (12)
Exit:

Branches occur only at the end of a block. Branch destinations occur only at the beginning of a block.
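The two rules on branches are what the usual leader-based partitioning enforces. The following is a minimal Python sketch, not part of the handout: it assumes a hypothetical encoding in which each statement is a string and every branch ends in `goto (n)`, and the name `basic_blocks` is illustrative.

```python
import re

# The 16 statements of the example; statement k is CODE[k - 1].
CODE = [
    "i = 1", "j = 1", "t1 = 10 * i", "t2 = t1 + j", "t3 = 4 * t2",
    "a[t3] = 0", "j = j + 1", "if j < 10 goto (3)", "i = i + 1",
    "if i < 10 goto (2)", "i = 1", "t4 = 10 * i", "t5 = t4 + i",
    "a[t5] = 1", "i = i + 1", "if i < 10 goto (12)",
]

def basic_blocks(code):
    """Return (first, last) statement numbers, one pair per basic block."""
    leaders = {1}                            # the first statement starts a block
    for n, stmt in enumerate(code, start=1):
        match = re.search(r"goto \((\d+)\)", stmt)
        if match:
            leaders.add(int(match.group(1)))  # a branch target starts a block
            if n < len(code):
                leaders.add(n + 1)            # so does the statement after a branch
    starts = sorted(leaders)
    ends = [s - 1 for s in starts[1:]] + [len(code)]
    return list(zip(starts, ends))

blocks = basic_blocks(CODE)
```

On the example, the six pairs in `blocks` correspond to B1 through B6.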
Live Variables
Consider the problem of finding the set of live variables at some program point. A variable is live after a statement s if it is used in some statement s', and there is a control flow path from s to s' along which the variable is not redefined before that use. Example:

    1. i = 1
    2. j = 1
    3. t1 = 10 * i
    4. t2 = t1 + j
    5. t3 = 4 * t2
    6. a[t3] = 0
    7. j = j + 1
    ...

Variable t3 is live after statement 5 since it is used in statement 6. Variable j is also live after statement 5 since it is used in statement 7.
Let def(s) be the set of all variables defined by statement s (e.g. the lhs variable in an assignment statement).
Let use(s) be the set of all variables used by statement s (e.g. the variables on the rhs of an assignment statement).
Let succ(s) be the set of statements that immediately follow statement s.

The above definitions of def, use, and succ can be extended to whole blocks as well:

def(B): the set of variables defined in block B.
use(B): the set of variables used in block B before any definition of them in B.
succ(B): the set of blocks that immediately succeed block B.
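The block-level sets can be derived from the statement-level ones by a single forward scan. Here is a minimal Python sketch under an assumed encoding (each statement is a pair of sets, definitions first); the name `block_def_use` is hypothetical.

```python
def block_def_use(stmts):
    """stmts: list of (defs, uses) pairs of sets, in execution order."""
    defined, used = set(), set()
    for defs, uses in stmts:
        used |= uses - defined   # a use counts only if not defined earlier in the block
        defined |= defs
    return defined, used

# Block B3 of the example (statements 3 to 8). The store a[t3] = 0 is
# treated as a use of a, which is how the handout's table lists it.
B3 = [
    ({"t1"}, {"i"}),         # t1 = 10 * i
    ({"t2"}, {"t1", "j"}),   # t2 = t1 + j
    ({"t3"}, {"t2"}),        # t3 = 4 * t2
    (set(), {"a", "t3"}),    # a[t3] = 0
    ({"j"}, {"j"}),          # j = j + 1
    (set(), {"j"}),          # if j < 10 goto (3)
]
b3_def, b3_use = block_def_use(B3)
```

The temporaries t1, t2 and t3 do not appear in `b3_use` because each is defined in the block before it is used.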
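Let In(s) be the set of variables live immediately before statement s, and let Out(s) be the set of variables live immediately after s. Then

$$\mathit{In}(s) = \mathit{use}(s) \cup (\mathit{Out}(s) - \mathit{def}(s))$$

$$\mathit{Out}(s) = \bigcup_{t \in \mathit{succ}(s)} \mathit{In}(t)$$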
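Let a be a variable that is needed after the procedure exits (e.g. it is a global variable). Then $\mathit{In}(\mathit{Exit}) = \{a\}$.

Block  Succ      Def            Use      In                                                     Out
1      {2}       {i}            {}       $\mathit{Out}(1) - \{i\}$                              $\mathit{In}(2)$
2      {3}       {j}            {}       $\mathit{Out}(2) - \{j\}$                              $\mathit{In}(3)$
3      {3,4}     {t1,t2,t3,j}   {a,i,j}  $\{a,i,j\} \cup (\mathit{Out}(3) - \{t1,t2,t3,j\})$    $\mathit{In}(3) \cup \mathit{In}(4)$
4      {2,5}     {i}            {i}      $\{i\} \cup (\mathit{Out}(4) - \{i\})$                 $\mathit{In}(2) \cup \mathit{In}(5)$
5      {6}       {i}            {}       $\mathit{Out}(5) - \{i\}$                              $\mathit{In}(6)$
6      {6,Exit}  {t4,t5,i}      {a,i}    $\{a,i\} \cup (\mathit{Out}(6) - \{t4,t5,i\})$         $\mathit{In}(6) \cup \mathit{In}(\mathit{Exit})$
Exit   {}        {}             {}       $\{a\}$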
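The equations in the table can be solved by repeated substitution, starting from empty sets. The following is a minimal Python sketch rather than the handout's own algorithm; the dictionaries `SUCC`, `DEF`, `USE` and the function `live_variables` are illustrative names, with the block data copied from the table.

```python
SUCC = {1: [2], 2: [3], 3: [3, 4], 4: [2, 5], 5: [6], 6: [6, "Exit"]}
DEF = {1: {"i"}, 2: {"j"}, 3: {"t1", "t2", "t3", "j"},
       4: {"i"}, 5: {"i"}, 6: {"t4", "t5", "i"}}
USE = {1: set(), 2: set(), 3: {"a", "i", "j"},
       4: {"i"}, 5: set(), 6: {"a", "i"}}

def live_variables(succ, defs, uses, live_at_exit):
    blocks = list(defs)
    live_in = {b: set() for b in blocks}
    live_in["Exit"] = set(live_at_exit)      # In(Exit) is fixed, not recomputed
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:                           # stop when a full pass changes nothing
        changed = False
        for b in reversed(blocks):           # backwards: successors before predecessors
            out = set().union(*(live_in[s] for s in succ[b]))
            new_in = uses[b] | (out - defs[b])
            if out != live_out[b] or new_in != live_in[b]:
                live_out[b], live_in[b] = out, new_in
                changed = True
    return live_in, live_out

live_in, live_out = live_variables(SUCC, DEF, USE, {"a"})
```

Visiting blocks in reverse order is only a convenience here. Any visiting order reaches the same solution, since the sets start empty and only grow.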
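These equations have more than one solution. For example:

(1) $\mathit{In}(6) = \mathit{Out}(6) = \{a, i\}$, and $\mathit{In}(\mathit{Exit}) = \{a\}$.
(2) $\mathit{In}(6) = \mathit{Out}(6) = \{a, i, t3\}$, and $\mathit{In}(\mathit{Exit}) = \{a\}$.
...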
Of these, (1) is the least. In fact, it can be shown that every solution will contain (1).
Data flow analysis is formulated in terms of finding the least (or sometimes, the greatest) solution to a set of simultaneous equations. The flow equations can be written as $X = F(X)$, where $X$ is a vector of Ins and Outs. Solutions $X$ such that $X = F(X)$ are fixed points of $F$. The smallest $X$ such that $X = F(X)$ is called the least fixed point of $F$.
Partial Orders
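Let $U$ be a finite set, and let $D = \mathcal{P}(U)$, i.e. the powerset of $U$. Let $D^n = D \times D \times \cdots \times D$ ($n$ times), i.e. an $n$-dimensional cartesian space over $\mathcal{P}(U)$. We can define a partial order $\sqsubseteq$ among vectors of sets such that $X \sqsubseteq X'$ if, and only if, for all components of the vector, $X_i \subseteq X'_i$.

$(D^n, \sqsubseteq)$ is a complete lattice with $\bot = (\emptyset, \ldots, \emptyset)$ as the least element and $\top = (U, \ldots, U)$ as the greatest element. Vectors $X^{(0)}, X^{(1)}, \ldots, X^{(i)}$ are called a chain if $X^{(0)} \sqsubseteq X^{(1)} \sqsubseteq \cdots \sqsubseteq X^{(i)}$.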
Monotone Functions
Let $F : D^n \to D^n$ (i.e. a function from $D^n$ to $D^n$). A function $F$ is monotone over the partial order $\sqsubseteq$ if, for every $X$ and $X'$ such that $X \sqsubseteq X'$, we have $F(X) \sqsubseteq F(X')$.
Note the definition of monotonicity. It says the function returns smaller values if it is given smaller argument values. It is not necessary that the returned values be smaller than the argument values!
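For a small universe, monotonicity can be checked by brute force over all pairs of subsets. The sketch below is illustrative only: the universe `U` and the functions `transfer` and `complement` are made-up examples, with `transfer` shaped like a live-variable equation.

```python
from itertools import combinations

U = frozenset({"a", "b", "c"})

def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_monotone(f, universe):
    """Check that X subset-of Y implies f(X) subset-of f(Y), for all pairs."""
    subsets = powerset(universe)
    return all(f(x) <= f(y) for x in subsets for y in subsets if x <= y)

def transfer(x):
    # Same shape as In = use U (Out - def), with use = {a} and def = {b}.
    return frozenset({"a"}) | (x - {"b"})

def complement(x):
    # Set complement reverses the order, so it is not monotone.
    return U - x

transfer_is_monotone = is_monotone(transfer, U)
complement_is_monotone = is_monotone(complement, U)
```

Note that `transfer(frozenset())` is `{a}`, which is larger than its argument, yet `transfer` is still monotone. This is the distinction the note above draws.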
It is easy to see that the flow equations for live variable analysis define a monotone function. There is a simple way to show the existence of fixed points, and to compute the least and greatest fixed points of a monotone function. Tarski-Knaster Theorem: Given a complete lattice $L$ and a monotone function $G : L \to L$, the fixed points of $G$ form a complete lattice. Consequently, there exist both least and greatest fixed points.
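Consider the sequence $X^{(0)}, X^{(1)}, \ldots, X^{(i)}, \ldots$, where $X^{(0)} = \bot$ and $X^{(i+1)} = F(X^{(i)})$.

Since $\bot$ is the least element, $X^{(0)} \sqsubseteq X^{(1)}$. If $X^{(i)} \sqsubseteq X^{(i+1)}$, then by monotonicity $F(X^{(i)}) \sqsubseteq F(X^{(i+1)})$, i.e. $X^{(i+1)} \sqsubseteq X^{(i+2)}$. So the sequence is a chain.

Since all chains over $\sqsubseteq$ are finite, consider the last element of the chain, $X^{(n)}$. The chain does not grow beyond it, so $X^{(n)} = F(X^{(n)})$.

So, $X^{(n)}$ is a fixed point of $F$.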
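Consider the chain $X^{(0)}, X^{(1)}, \ldots, X^{(i)}, \ldots, X^{(n)}$, where $X^{(0)} = \bot$ and $X^{(i+1)} = F(X^{(i)})$. Let $Y$ be any fixed point of $F$, i.e. $F(Y) = Y$.

$X^{(0)} = \bot \sqsubseteq Y$. If $X^{(i)} \sqsubseteq Y$, then by monotonicity $X^{(i+1)} = F(X^{(i)}) \sqsubseteq F(Y) = Y$.

Hence, by induction, $X^{(i)} \sqsubseteq Y$ for all elements of the chain. In particular, $X^{(n)} \sqsubseteq Y$: $X^{(n)}$ is at least as small as any fixed point $Y$ of $F$, and hence is the least fixed point.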
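The construction used in this argument translates directly into a loop. A minimal Python sketch follows, assuming $F$ is monotone over a finite lattice (otherwise the loop need not terminate); the two-equation system `F` is a made-up example, not one from the handout.

```python
def least_fixed_point(f, bottom):
    """Iterate X(0) = bottom, X(i+1) = f(X(i)) until X(n) == f(X(n))."""
    x = bottom
    while True:
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt

# Hypothetical system over U = {a, b, c}:
#   X1 = {a} U X2
#   X2 = {b} U (X1 - {a})
def F(x):
    x1, x2 = x
    return (frozenset({"a"}) | x2, frozenset({"b"}) | (x1 - {"a"}))

BOTTOM = (frozenset(), frozenset())
lfp = least_fixed_point(F, BOTTOM)
```

The vector `({a, b, c}, {b, c})` is also a fixed point of this `F`, and it contains `lfp` componentwise, as the argument above predicts.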
Consider the sequence $X^{(0)}, X^{(1)}, \ldots, X^{(i)}, \ldots, X^{(n)}$, where $X^{(0)} = \top$ and $X^{(i+1)} = F(X^{(i)})$.

Note the starting point of this sequence: the greatest element in the lattice. By an argument similar to the one we used for the least fixed point, $X^{(n)}$ can be shown to be the greatest fixed point of $F$.
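The same loop, started from the top element, descends to the greatest fixed point. This is a minimal Python sketch under the same assumptions (a monotone $F$ over a finite lattice); the universe `U` and the system `F` are made-up examples.

```python
U = frozenset({"a", "b", "c"})

# Hypothetical system over U:
#   X1 = {a} U X2
#   X2 = {b} U (X1 - {a})
def F(x):
    x1, x2 = x
    return (frozenset({"a"}) | x2, frozenset({"b"}) | (x1 - {"a"}))

def greatest_fixed_point(f, top):
    """Iterate X(0) = top, X(i+1) = f(X(i)) until nothing changes."""
    x = top
    while True:
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt

TOP = (U, U)
gfp = greatest_fixed_point(F, TOP)
```

Only the starting point differs from the least-fixed-point iteration. The loop body is identical.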
Other Analyses
Reaching Definitions
An assignment of the form x = e for some expression e is said to define x. A definition of x at statement s1 reaches another statement s2 if there is some control flow path from s1 to s2 such that there is no other definition of x on the path from s1 to s2.

Let In(s) be the set of all definitions that reach s. Let Out(s) be the set of all definitions that reach all the immediate successors of s. Then

$$\mathit{Out}(s) = \mathit{gen}(s) \cup (\mathit{In}(s) - \mathit{kill}(s))$$

where gen(s) is the set of definitions generated by s, and kill(s) is the set of definitions with the same lhs variables as those in s.

$$\mathit{In}(s) = \bigcup_{t \in \mathit{pred}(s)} \mathit{Out}(t)$$
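These equations can be solved by forward iteration from empty sets. The sketch below uses a made-up five-statement program, not one from the handout; definitions are named by their statement number, and `PRED`, `DEFINES` and `reaching_definitions` are hypothetical names.

```python
# Hypothetical program:
#   1: x = 0
#   2: y = 1
#   3: x = x + y
#   4: if x < 10 goto (2)
#   5: z = x
PRED = {1: [], 2: [1, 4], 3: [2], 4: [3], 5: [4]}
DEFINES = {1: "x", 2: "y", 3: "x", 5: "z"}   # statement -> variable it defines

def reaching_definitions(pred, defines):
    stmts = sorted(pred)
    gen = {s: {s} if s in defines else set() for s in stmts}
    # kill(s): every definition of the variable that s defines (including s itself,
    # which gen(s) adds back).
    kill = {s: {d for d, v in defines.items() if v == defines[s]}
            if s in defines else set() for s in stmts}
    rd_in = {s: set() for s in stmts}
    rd_out = {s: set() for s in stmts}
    changed = True
    while changed:
        changed = False
        for s in stmts:                      # forwards: predecessors first
            new_in = set().union(*(rd_out[t] for t in pred[s]))
            new_out = gen[s] | (new_in - kill[s])
            if new_in != rd_in[s] or new_out != rd_out[s]:
                rd_in[s], rd_out[s] = new_in, new_out
                changed = True
    return rd_in, rd_out

rd_in, rd_out = reaching_definitions(PRED, DEFINES)
```

In this example, definition 1 of x reaches statement 3 on the first loop iteration only, and is killed there by definition 3.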
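Live Variables: In and Out are the smallest sets such that

$$\mathit{In}(s) = \mathit{use}(s) \cup (\mathit{Out}(s) - \mathit{def}(s))$$

$$\mathit{Out}(s) = \bigcup_{t \in \mathit{succ}(s)} \mathit{In}(t)$$

Reaching Definitions: In and Out are the smallest sets such that

$$\mathit{In}(s) = \bigcup_{t \in \mathit{pred}(s)} \mathit{Out}(t)$$

$$\mathit{Out}(s) = \mathit{gen}(s) \cup (\mathit{In}(s) - \mathit{kill}(s))$$

The form of the equations is identical, and they can be computed using the same procedure, except:

Live Variables are best computed backwards through the flow graph (information goes from successors to predecessors).
Reaching Definitions are best computed forwards through the flow graph (information goes from predecessors to successors).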
Available Expressions
An expression e is available at statement s if, on every path that reaches s, there is some statement s' where e is evaluated, and no variable used in e is redefined between s' and s. Let In(s) be the set of all expressions available immediately before s is evaluated. Let Out(s) be the set of all expressions available immediately after s is evaluated. Then

$$\mathit{Out}(s) = \mathit{gen}(s) \cup (\mathit{In}(s) - \mathit{kill}(s))$$

where gen(s) is the set of all expressions evaluated in s, and kill(s) is the set of all expressions that use the lhs variables defined in s.

$$\mathit{In}(s) = \bigcap_{t \in \mathit{pred}(s)} \mathit{Out}(t)$$

In and Out are the greatest sets that satisfy the above equations.
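Because the greatest solution is wanted, the iteration starts from the full set of expressions and shrinks. The following is a minimal Python sketch on a made-up program with a loop; it assumes that nothing is available at the entry statement, and `PRED`, `GEN`, `KILL` and `available_expressions` are hypothetical names.

```python
# Hypothetical program:
#   1: t = a + b
#   2: x = c * 2
#   3: if x > 0 goto (2)
#   4: u = a + b
#   5: a = 0
PRED = {1: [], 2: [1, 3], 3: [2], 4: [3], 5: [4]}
GEN = {1: {"a+b"}, 2: {"c*2"}, 3: set(), 4: {"a+b"}, 5: set()}
KILL = {1: set(), 2: set(), 3: set(), 4: set(), 5: {"a+b"}}

def available_expressions(pred, gen, kill, entry):
    stmts = sorted(pred)
    all_exprs = set().union(*gen.values())
    # Start from the top element: every expression assumed available everywhere.
    av_in = {s: set(all_exprs) for s in stmts}
    av_out = {s: set(all_exprs) for s in stmts}
    av_in[entry] = set()                     # assumption: nothing available at entry
    changed = True
    while changed:
        changed = False
        for s in stmts:
            if s == entry:
                new_in = set()
            else:
                new_in = set(all_exprs)
                for t in pred[s]:
                    new_in &= av_out[t]      # intersection over all predecessors
            new_out = gen[s] | (new_in - kill[s])
            if new_in != av_in[s] or new_out != av_out[s]:
                av_in[s], av_out[s] = new_in, new_out
                changed = True
    return av_in, av_out

av_in, av_out = available_expressions(PRED, GEN, KILL, entry=1)
```

Starting the same loop from empty sets would, on this example, settle on a smaller solution in which a+b is missing from In(4), even though a+b is computed on every path to statement 4. That loss of precision is why the greatest solution is the one of interest here.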