Introduction To Data Flow Analysis
I. Introduction
II. Example: Reaching definition analysis
III. Example: Liveness analysis
IV. A General Framework
(Theory in next lecture)
I. Compiler Organization
Flow Graph
Basic block = a maximal sequence of consecutive instructions s.t.
  - flow of control only enters at the beginning
  - flow of control can only leave at the end
    (no halting or branching except perhaps at end of block)
Flow Graphs
  - Nodes: basic blocks
  - Edges: control flow between basic blocks
Statically: finite program
Dynamically: can have infinitely many possible execution paths
Data flow analysis abstraction:
  - For each point in the program: combine the information from all dynamic instances of that program point.
Example of a data flow question:
  - Which definition defines the value used in statement b = a?
Carnegie Mellon CS243: Intro to Data Flow 5 M. Lam
Reaching Definitions
[Figure: example flow graph]
B0:  d0: y = 3
     d1: x = 10
     d2: y = 11
     if e
B1:  d3: x = 1
     d4: y = 2
B2:  d5: z = x
     d6: x = 4
Every assignment is a definition.
A definition d reaches a point p if there exists a path from the point immediately following d to p such that d is not killed (overwritten) along that path.
Problem statement
  - For each point in the program, determine whether each definition in the program reaches that point.
  - A bit vector per program point, vector length = #defs
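The bit-vector representation can be sketched as follows. This is only an illustration: the names DEFS, to_bits and from_bits are hypothetical, and a Python int stands in for the bit vector, with bit i set when definition d_i reaches the point.

```python
# Definitions d0..d6 of the example; bit i of a vector stands for definition d_i.
DEFS = ["d0", "d1", "d2", "d3", "d4", "d5", "d6"]

def to_bits(reaching):
    """Encode a set of definition names as an int used as a bit vector."""
    bits = 0
    for name in reaching:
        bits |= 1 << DEFS.index(name)
    return bits

def from_bits(bits):
    """Decode a bit vector back into an ordered list of definition names."""
    return [d for i, d in enumerate(DEFS) if bits & (1 << i)]

# Assumption: nothing reaches the entry of B0, so at its end only
# d1 (x = 10) and d2 (y = 11) reach; d2 overwrites the y defined by d0.
out_b0 = to_bits({"d1", "d2"})
```

With this encoding, set union becomes bitwise OR and set difference becomes AND with the complement, which is why one vector per program point is cheap to maintain.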
Build a flow graph (nodes = basic blocks, edges = control flow)
Set up a set of equations between in[b] and out[b] for all basic blocks b
  - Effect of code in basic block: transfer function f_b relates in[b] and out[b], for the same b
  - Effect of flow of control: relates out[b1] and in[b2] if b1 and b2 are adjacent
Effects of a Statement
in[B0]
  d0: y = 3
  d1: x = 10
  d2: y = 11
out[B0]
f_s: a transfer function of a statement abstracts the execution with respect to the problem of interest.
For a statement s (d: x = y + z):
  out[s] = f_s(in[s]) = Gen[s] ∪ (in[s] - Kill[s])
  - Gen[s]: definitions generated: Gen[s] = {d}
  - Propagated definitions: in[s] - Kill[s], where Kill[s] = set of all other defs to x in the rest of the program
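A minimal sketch of the statement-level transfer function, using Python sets for in[s] and out[s]. The table DEFS (definition label to assigned variable) is read off the example blocks, and the helper names gen, kill and transfer_stmt are hypothetical.

```python
# Definition label -> variable it assigns, as read from the example blocks.
DEFS = {"d0": "y", "d1": "x", "d2": "y", "d3": "x",
        "d4": "y", "d5": "z", "d6": "x"}

def gen(d):
    """Gen[s] for the statement labelled d: just the definition itself."""
    return {d}

def kill(d):
    """Kill[s]: every other definition of the same variable in the program."""
    return {e for e, var in DEFS.items() if var == DEFS[d] and e != d}

def transfer_stmt(d, in_set):
    """out[s] = Gen[s] U (in[s] - Kill[s])."""
    return gen(d) | (in_set - kill(d))
```

For instance, transfer_stmt("d2", {"d0", "d1"}) drops d0 (another definition of y), keeps d1, and adds d2.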
Transfer function of a statement s:
  out[s] = f_s(in[s]) = Gen[s] ∪ (in[s] - Kill[s])
Transfer function of a basic block B: composition of the transfer functions of the statements in B. For B consisting of d0 followed by d1:
  out[B] = f_B(in[B]) = f_d1(f_d0(in[B]))
         = Gen[d1] ∪ ((Gen[d0] ∪ (in[B] - Kill[d0])) - Kill[d1])
         = (Gen[d1] ∪ (Gen[d0] - Kill[d1])) ∪ (in[B] - (Kill[d0] ∪ Kill[d1]))
         = Gen[B] ∪ (in[B] - Kill[B])
  - Gen[B]: locally exposed definitions (available at end of bb)
  - Kill[B]: set of definitions killed by B
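The composition can be checked mechanically. The sketch below folds the statements of a block into a single (Gen[B], Kill[B]) pair and compares that with applying the statement transfer functions one at a time. DEFS is read off the example blocks; block_gen_kill, transfer_block and transfer_stmtwise are hypothetical helper names.

```python
# Definition label -> variable it assigns, as read from the example blocks.
DEFS = {"d0": "y", "d1": "x", "d2": "y", "d3": "x",
        "d4": "y", "d5": "z", "d6": "x"}

def kill(d):
    """All other definitions of the variable assigned by d."""
    return {e for e, var in DEFS.items() if var == DEFS[d] and e != d}

def block_gen_kill(block):
    """Fold the statements in order into Gen[B] and Kill[B]."""
    gen_b, kill_b = set(), set()
    for d in block:
        gen_b = {d} | (gen_b - kill(d))   # later statements kill earlier gens
        kill_b |= kill(d)                 # Kill[B] is the union of the kills
    return gen_b, kill_b

def transfer_block(block, in_set):
    """out[B] = Gen[B] U (in[B] - Kill[B])."""
    gen_b, kill_b = block_gen_kill(block)
    return gen_b | (in_set - kill_b)

def transfer_stmtwise(block, in_set):
    """Reference: apply each statement's transfer function in sequence."""
    cur = set(in_set)
    for d in block:
        cur = {d} | (cur - kill(d))
    return cur

B0 = ["d0", "d1", "d2"]
```

Kill[B] may contain a definition that is also in Gen[B] (here d2, killed by d0); this is harmless because Gen[B] is added back after the subtraction.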
Join node: a node with multiple predecessors
Meet operator (∧): ∪
  in[b] = out[p1] ∪ out[p2] ∪ ... ∪ out[pn], where p1, ..., pn are all predecessors of b
Cyclic Graphs
Equations still hold:
  out[b] = f_b(in[b])
  in[b] = out[p1] ∪ out[p2] ∪ ... ∪ out[pn], where p1, ..., pn are all predecessors of b
Find: fixed point solution
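One common way to find a fixed point is to start every set empty and re-apply the equations until nothing changes. The sketch below assumes that simple round-robin strategy; it is not necessarily the algorithm presented in the lecture. The edges, including the back edge B2 -> B1 that makes the graph cyclic, are hypothetical, and the Gen/Kill sets are derived by hand from the example blocks.

```python
def reaching_definitions(blocks, preds, gen, kill):
    """Iterate the in/out equations until no out[b] changes (a fixed point)."""
    in_ = {b: set() for b in blocks}
    out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # Meet: union of out[p] over all predecessors p of b.
            in_[b] = set().union(*(out[p] for p in preds[b]))
            # Transfer: out[b] = Gen[b] U (in[b] - Kill[b]).
            new_out = gen[b] | (in_[b] - kill[b])
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return in_, out

# Hypothetical cyclic graph: B0 -> B1, B0 -> B2, B1 -> B2, B2 -> B1.
BLOCKS = ["B0", "B1", "B2"]
PREDS = {"B0": [], "B1": ["B0", "B2"], "B2": ["B0", "B1"]}
GEN = {"B0": {"d1", "d2"}, "B1": {"d3", "d4"}, "B2": {"d5", "d6"}}
KILL = {"B0": {"d0", "d2", "d3", "d4", "d6"},
        "B1": {"d0", "d1", "d2", "d6"},
        "B2": {"d1", "d3"}}

rd_in, rd_out = reaching_definitions(BLOCKS, PREDS, GEN, KILL)
```

The loop terminates because the sets only grow from the empty initial value and there are finitely many definitions.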
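A variable v is live at a point p if the value of v is used along some path in the flow graph starting at p. Otherwise, the variable is dead.
Problem statement
  - For each basic block, determine whether each variable is live in that block.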
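Liveness runs backward: information flows from a block's successors to the block itself. A minimal sketch, assuming the usual equations in[b] = Use[b] ∪ (out[b] - Def[b]) and out[b] = union of in[s] over successors s. The edges B0 -> B1, B0 -> B2, B1 -> B2 are an assumption, and the Use/Def sets are derived by hand from the statements of the reaching-definitions example.

```python
def live_variables(blocks, succs, use, defs):
    """Backward iteration until no in[b] changes."""
    in_ = {b: set() for b in blocks}
    out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in reversed(blocks):  # visiting successors first converges faster
            out[b] = set().union(*(in_[s] for s in succs[b]))
            new_in = use[b] | (out[b] - defs[b])
            if new_in != in_[b]:
                in_[b] = new_in
                changed = True
    return in_, out

BLOCKS = ["B0", "B1", "B2"]
SUCCS = {"B0": ["B1", "B2"], "B1": ["B2"], "B2": []}   # assumed edges
# Use[b]: variables read in b before any assignment to them in b.
USE = {"B0": {"e"}, "B1": set(), "B2": {"x"}}
# Def[b]: variables assigned in b.
DEF = {"B0": {"x", "y"}, "B1": {"x", "y"}, "B2": {"x", "z"}}

live_in, live_out = live_variables(BLOCKS, SUCCS, USE, DEF)
```

Under these assumptions x is live on exit from B0 and B1 because B2 reads it, while y is never read and so is dead everywhere.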
Example
IV. Framework
                         Reaching Definitions              Live Variables
Domain                   Sets of definitions               Sets of variables
Direction                forward:                          backward:
                           out[b] = f_b(in[b])               in[b] = f_b(out[b])
                           in[b] = ∪ out[p], p in pred(b)    out[b] = ∪ in[s], s in succ(b)
Transfer function        f_b(x) = Gen_b ∪ (x - Kill_b)     f_b(x) = Use_b ∪ (x - Def_b)
Boundary condition       out[entry] = ∅                    in[exit] = ∅
Initial interior points  out[b] = ∅                        in[b] = ∅
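The two columns differ only in direction and in which sets play the Gen and Kill roles, so a single solver can cover both. The sketch below is one possible way to express that and covers only problems whose meet is union. The names solve and gen_kill are hypothetical, and the edges B0 -> B1, B0 -> B2, B1 -> B2 are assumed. A backward problem is treated as a forward problem on the reversed graph.

```python
def gen_kill(g, k):
    """Build a transfer function f(x) = g U (x - k)."""
    return lambda x: g | (x - k)

def solve(blocks, edges, transfer, forward=True):
    """Union-meet solver. Returns (before, after): for a forward problem these
    are (in, out); for a backward problem they are (out, in)."""
    if not forward:
        edges = [(dst, src) for src, dst in edges]   # reverse the graph
    before = {b: set() for b in blocks}
    after = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            before[b] = set().union(*(after[p] for p, q in edges if q == b))
            new = transfer[b](before[b])
            if new != after[b]:
                after[b], changed = new, True
    return before, after

BLOCKS = ["B0", "B1", "B2"]
EDGES = [("B0", "B1"), ("B0", "B2"), ("B1", "B2")]    # assumed edges

# Reaching definitions (forward); Gen/Kill derived from the example blocks.
RD = {"B0": gen_kill({"d1", "d2"}, {"d0", "d2", "d3", "d4", "d6"}),
      "B1": gen_kill({"d3", "d4"}, {"d0", "d1", "d2", "d6"}),
      "B2": gen_kill({"d5", "d6"}, {"d1", "d3"})}
rd_in, rd_out = solve(BLOCKS, EDGES, RD, forward=True)

# Live variables (backward); Use/Def derived from the same statements.
LV = {"B0": gen_kill({"e"}, {"x", "y"}),
      "B1": gen_kill(set(), {"x", "y"}),
      "B2": gen_kill({"x"}, {"x", "z"})}
live_out, live_in = solve(BLOCKS, EDGES, LV, forward=False)
```

Initialising every set to the empty set covers both the boundary condition and the interior points for these two analyses; an analysis with a different boundary value would need an extra parameter.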