Data Flow 2


Data Flow Analysis 2

15-411 Compiler Design

Nov. 3, 2005
Recall: Data Flow Analysis
• A framework for proving facts about programs

• Reasons about lots of little facts

• Little or no interaction between facts

▪ Works best on properties about how the program computes

• Based on all paths through the program

▪ including infeasible paths
Recall: Data Flow Equations
Let s be a statement:

• succ(s) = {immediate successor statements of s}

• pred(s) = {immediate predecessor statements of s}

• In(s) = flow at program point just before executing s

• Out(s) = flow at program point just after executing s

• In(s) = ∩s′ ∊ pred(s) Out(s′) (Must)

• Out(s) = Gen(s) ∪ (In(s) – Kill(s)) (Forward)

• Note these are also called transfer functions

Gen(s) = set of facts true after s that weren’t true before s


Kill(s) = set of facts no longer true after s
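
The two equations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: facts are strings naming expressions, the statement `a = b + c` and its Gen/Kill sets are hypothetical, and the helper names `meet_must` and `transfer` are not from the slides.

```python
def meet_must(pred_outs):
    """In(s) for a must analysis: intersect Out(s') over all predecessors s'."""
    pred_outs = [frozenset(o) for o in pred_outs]
    if not pred_outs:
        return frozenset()  # assumption: no facts hold when there is no predecessor
    result = pred_outs[0]
    for out in pred_outs[1:]:
        result &= out
    return result


def transfer(gen, kill, in_facts):
    """Out(s) = Gen(s) ∪ (In(s) – Kill(s))."""
    return frozenset(gen) | (frozenset(in_facts) - frozenset(kill))


# Hypothetical statement s: a = b + c, with two predecessors
gen_s = {"b+c"}    # b+c is available after s
kill_s = {"a*d"}   # expressions that read a are no longer available
in_s = meet_must([{"a*d", "e-f", "b+c"}, {"a*d", "e-f"}])
out_s = transfer(gen_s, kill_s, in_s)
print(sorted(in_s))   # ['a*d', 'e-f']
print(sorted(out_s))  # ['b+c', 'e-f']
```

Only facts that hold on both incoming edges survive the intersection, which is what makes this a must analysis.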
Data Flow Questions

Will it eventually terminate?

How efficient is data flow analysis?

How accurate is the result?


Data Flow Facts and Lattices

Typically, data flow facts form a lattice

Example: available expressions

[Diagram: lattice of available-expression facts, with “top” and “bottom” elements labeled]
Partial Orders

• A partial order is a pair (P, ≤) such that

▪ ≤ ⊆ P × P
▪ ≤ is reflexive: x ≤ x
▪ ≤ is anti-symmetric: x ≤ y and y ≤ x implies x = y
▪ ≤ is transitive: x ≤ y and y ≤ z implies x ≤ z
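
For a finite set, the three properties can be checked by brute force. The sketch below is illustrative only; `is_partial_order` and `powerset` are hypothetical helper names, and the three relations tested are chosen as examples.

```python
from itertools import combinations, product


def is_partial_order(elems, leq):
    """Check reflexivity, anti-symmetry and transitivity of leq on elems."""
    elems = list(elems)
    reflexive = all(leq(x, x) for x in elems)
    antisymmetric = all(
        x == y for x, y in product(elems, repeat=2) if leq(x, y) and leq(y, x)
    )
    transitive = all(
        leq(x, z)
        for x, y, z in product(elems, repeat=3)
        if leq(x, y) and leq(y, z)
    )
    return reflexive and antisymmetric and transitive


def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    ]


subset_order = is_partial_order(powerset({"a", "b", "c"}), lambda x, y: x <= y)
divisibility = is_partial_order([1, 2, 3, 6], lambda x, y: y % x == 0)
near = is_partial_order([1, 2, 3], lambda x, y: abs(x - y) <= 1)
print(subset_order, divisibility, near)  # True True False
```

The last relation fails because 1 and 2 are each related to the other without being equal.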
Lattices
• A partial order is a lattice if ⊓ and ⊔ are defined so that

▪ ⊓ is the meet or greatest lower bound operation

▪ x ⊓ y ≤ x and x ⊓ y ≤ y

▪ If z ≤ x and z ≤ y then z ≤ x ⊓ y

▪ ⊔ is the join or least upper bound operation

▪ x ≤ x ⊔ y and y ≤ x ⊔ y

▪ If x ≤ z and y ≤ z, then x ⊔ y ≤ z
Lattices (cont.)
A finite partial order is a lattice if meet and join exist for
every pair of elements

A finite lattice has unique elements ⊥ (bottom) and ⊤ (top) such that

x ⊓ ⊥ = ⊥    x ⊔ ⊥ = x

x ⊓ ⊤ = x    x ⊔ ⊤ = ⊤

In a lattice

x ≤ y iff x ⊓ y = x

x ≤ y iff x ⊔ y = y
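
These identities can be sanity-checked on a small powerset lattice, where ≤ is ⊆, meet is ∩, join is ∪, bottom is the empty set and top is the whole set. The function name `check_powerset_lattice` and the three-element universe are assumptions made for this sketch.

```python
from itertools import combinations


def check_powerset_lattice(universe):
    """Return True if the bottom/top and ordering identities hold for (2^S, ⊆)."""
    universe = frozenset(universe)
    items = sorted(universe)
    elems = [
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    ]
    bottom, top = frozenset(), universe
    for x in elems:
        # x ⊓ ⊥ = ⊥ and x ⊔ ⊥ = x
        if (x & bottom) != bottom or (x | bottom) != x:
            return False
        # x ⊓ ⊤ = x and x ⊔ ⊤ = ⊤
        if (x & top) != x or (x | top) != top:
            return False
        for y in elems:
            leq = x <= y
            # x ≤ y iff x ⊓ y = x iff x ⊔ y = y
            if leq != ((x & y) == x) or leq != ((x | y) == y):
                return False
    return True


holds = check_powerset_lattice({"a", "b", "c"})
print(holds)  # True
```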
Useful Lattices
• (2^S, ⊆) forms a lattice for any set S.
▪ 2^S is the powerset of S (set of all subsets)

• If (S, ≤) is a lattice, so is (S, ≥)

▪ i.e., lattices can be flipped

• The lattice for constant propagation

▪ Diagram: ⊤ at the top, the integer constants (… 1 2 3) side by side in the middle, ⊥ at the bottom
▪ Note: the order on integers is different from the order in the lattice
Forward Must Data Flow Algorithm
Out(s) = Gen(s) for all statements s

W = {all statements} (worklist)

repeat

Take s from W

In(s) = ∩s′ ∊ pred(s) Out(s′)

temp = Gen(s) ∪ (In(s) – Kill(s))

if (temp != Out(s)) {

Out(s) = temp

W = W ∪ succ(s)

}

until W = ∅
Monotonicity

• A function f on a partial order is monotonic if

x ≤ y implies f(x) ≤ f(y)

• Easy to check that operations to compute In and Out are monotonic

▪ In(s) = ∩s′ ∊ pred(s) Out(s′)
▪ temp = Gen(s) ∪ (In(s) – Kill(s))

• Putting the two together

▪ temp = fs(∩s′ ∊ pred(s) Out(s′))
Termination -- Intuition

• We know the algorithm terminates because

▪ The lattice has finite height
▪ The operations to compute In and Out are monotonic
▪ On every iteration we remove a statement from the worklist and/or move down the lattice
Forward Data Flow (General Case)
Out(s) = Top for all statements s

W := { all statements } (worklist)

Repeat

Take s from W
temp := fs(⊓s′ ∊ pred(s) Out(s′)) (fs monotonic transfer fn)
if (temp != Out(s)) {
Out(s) := temp
W := W ∪ succ(s)
}

until W = ∅
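
The pseudocode above might be rendered in Python roughly as follows. This is a sketch under assumptions, not a reference implementation: the function name `solve_forward`, the explicit `entry_fact` boundary value for the entry statement (the slide initializes every Out(s) to Top and does not discuss the entry), and the four-statement CFG are all hypothetical. The instance shown is available expressions, so meet is intersection and Top is the set of all expressions.

```python
def solve_forward(nodes, succs, transfer, meet, top, entry, entry_fact):
    """Worklist solver for a forward problem; returns (In, Out) per node."""
    preds = {s: [] for s in nodes}
    for s in nodes:
        for t in succs[s]:
            preds[t].append(s)

    out = {s: top for s in nodes}        # Out(s) = Top for all statements s
    in_ = {s: top for s in nodes}
    worklist = list(nodes)               # W := { all statements }
    while worklist:                      # until W is empty
        s = worklist.pop()               # take s from W
        if s == entry:
            in_[s] = entry_fact          # assumption: boundary value at entry
        else:
            acc = top
            for p in preds[s]:
                acc = meet(acc, out[p])  # meet over predecessors
            in_[s] = acc
        temp = transfer(s, in_[s])
        if temp != out[s]:
            out[s] = temp
            for t in succs[s]:           # W := W ∪ succ(s)
                if t not in worklist:
                    worklist.append(t)
    return in_, out


# Hypothetical CFG with a loop: 1 -> 2, 2 -> 3, 3 -> 2, 2 -> 4
succs = {1: [2], 2: [3, 4], 3: [2], 4: []}
gen = {1: {"a+b"}, 2: set(), 3: {"a*b"}, 4: {"a+b"}}
kill = {1: set(), 2: set(), 3: set(), 4: set()}
all_exprs = frozenset({"a+b", "a*b"})

in_, out = solve_forward(
    nodes=[1, 2, 3, 4],
    succs=succs,
    transfer=lambda s, facts: frozenset(gen[s]) | (facts - kill[s]),
    meet=lambda x, y: x & y,
    top=all_exprs,
    entry=1,
    entry_fact=frozenset(),
)
print(sorted(in_[4]))  # ['a+b']
print(sorted(out[3]))  # ['a*b', 'a+b']
```

Because every Out(s) starts at Top, the fact `a+b` survives the meet at the loop header (statement 2) even though one of its predecessors lies on the back edge.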
Lattices (P, ≤)
Available expressions
▪ P = sets of expressions
▪ S1 ⊓ S2 = S1 ∩ S2
▪ Top = set of all expressions

Reaching Definitions
▪ P = sets of definitions (assignment statements)
▪ S1 ⊓ S2 = S1 ∪ S2
▪ Top = empty set
Fixpoints -- Intuition

We always start with Top


▪ Every expression is available, no definitions reach this point
▪ Most optimistic assumption
▪ Strongest possible hypothesis

Revise as we encounter contradictions

▪ Always move down in the lattice (with meet)

Result: A greatest fixpoint


Lattices (P, ≤), cont’d

Live variables
▪ P = sets of variables
▪ S1 ⊓ S2 = S1 ∪ S2
▪ Top = empty set

Very busy expressions

▪ P = sets of expressions
▪ S1 ⊓ S2 = S1 ∩ S2
▪ Top = set of all expressions
Forward vs. Backward
Forward:
Out(s) = Top for all s
W := { all statements }
repeat
Take s from W
temp := fs(⊓s′ ∊ pred(s) Out(s′))
if (temp != Out(s)) {
Out(s) := temp
W := W ∪ succ(s)
}
until W = ∅

Backward:
In(s) = Top for all s
W := { all statements }
repeat
Take s from W
temp := fs(⊓s′ ∊ succ(s) In(s′))
if (temp != In(s)) {
In(s) := temp
W := W ∪ pred(s)
}
until W = ∅
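
As an illustration of the backward variant, here is a sketch of live variable analysis, where Top is the empty set and meet is union. The function name `live_variables`, the `use`/`defs` encoding of Gen/Kill, and the six-statement program are assumptions made for the example.

```python
def live_variables(nodes, succs, use, defs):
    """Backward may analysis; returns the live-in set of every statement."""
    preds = {s: [] for s in nodes}
    for s in nodes:
        for t in succs[s]:
            preds[t].append(s)

    live_in = {s: frozenset() for s in nodes}  # In(s) = Top = empty set
    worklist = list(nodes)
    while worklist:
        s = worklist.pop()
        live_out = frozenset()
        for t in succs[s]:
            live_out |= live_in[t]             # meet over successors is union
        temp = frozenset(use[s]) | (live_out - frozenset(defs[s]))
        if temp != live_in[s]:
            live_in[s] = temp
            for p in preds[s]:                 # W := W ∪ pred(s)
                if p not in worklist:
                    worklist.append(p)
    return live_in


# Hypothetical program:
#   1: a = 0          4: a = b * 2
#   2: b = a + 1      5: if a < 9 goto 2
#   3: c = c + b      6: return c
succs = {1: [2], 2: [3], 3: [4], 4: [5], 5: [2, 6], 6: []}
use = {1: set(), 2: {"a"}, 3: {"b", "c"}, 4: {"b"}, 5: {"a"}, 6: {"c"}}
defs = {1: {"a"}, 2: {"b"}, 3: {"c"}, 4: {"a"}, 5: set(), 6: set()}

live_in = live_variables([1, 2, 3, 4, 5, 6], succs, use, defs)
print({s: sorted(v) for s, v in live_in.items()})
```

In this example `c` is live on entry to statement 1, which reflects that it is read before the program ever assigns it.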
Termination Revisited
How many times can we apply this step:
temp := fs(⊓s′ ∊ pred(s) Out(s′))
if (temp != Out(s)) { ... }

Claim: Out(s) only shrinks


▪ Proof: Out(s) starts out as Top
– So temp must be ≤ Top after first step

▪ Assume Out(s′) shrinks for all predecessors s′ of s

▪ Then ⊓s′ ∊ pred(s) Out(s′) shrinks

▪ Since fs monotonic, fs(⊓s′ ∊ pred(s) Out(s′)) shrinks


Termination Revisited (cont’d)
A descending chain in a lattice is a sequence
▪ x0 ⊒ x1 ⊒ x2 ⊒ ...

The height of a lattice is the length of the longest descending chain in the lattice

Then, dataflow must terminate in O(nk) time

▪ n = # of statements in program
▪ k = height of lattice
▪ assumes meet operation takes O(1) time
Least vs. Greatest Fixpoints
Dataflow tradition: Start with Top, use meet
▪ To do this, we need a meet semilattice with top
▪ meet semilattice = meets defined for any set
▪ Computes greatest fixpoint

Denotational semantics tradition: Start with Bottom, use join

▪ Computes least fixpoint
Distributive Data Flow Problems

By monotonicity, we also have

f(x ⊓ y) ≤ f(x) ⊓ f(y)

A function f is distributive if

f(x ⊓ y) = f(x) ⊓ f(y)
Benefit of Distributivity

Joins lose no information


Accuracy of Data Flow Analysis
Ideally, we would like to compute the meet over all paths (MOP) solution:
▪ Let fs be the transfer function for statement s
▪ If p is a path {s1, ..., sn}, let fp = fn;...;f1
▪ Let path(s) be the set of paths from the entry to s
▪ MOP(s) = ⊓p ∊ path(s) fp(Top)

If a data flow problem is distributive, then solving the data flow equations in the standard way yields the MOP solution
What Problems are Distributive?

Analyses of how the program computes


▪ Live variables
▪ Available expressions
▪ Reaching definitions
▪ Very busy expressions

All Gen/Kill problems are distributive
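
This claim can be spot-checked by brute force on a small universe: for every choice of Gen and Kill, the function f(x) = Gen ∪ (x – Kill) should satisfy f(x ⊓ y) = f(x) ⊓ f(y), whether meet is intersection or union. The helper names and the three-fact universe below are assumptions made for the sketch, and an exhaustive check on one small universe is an illustration, not a proof.

```python
import operator
from itertools import combinations


def powerset(universe):
    """All subsets of universe, as frozensets."""
    items = sorted(universe)
    return [
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    ]


def is_distributive(f, meet, elems):
    """Check f(x meet y) == f(x) meet f(y) for all pairs of elements."""
    return all(f(meet(x, y)) == meet(f(x), f(y)) for x in elems for y in elems)


subsets = powerset({"e1", "e2", "e3"})
all_distributive = all(
    is_distributive(lambda x, g=gen, k=kill: g | (x - k), meet, subsets)
    for gen in subsets
    for kill in subsets
    for meet in (operator.and_, operator.or_)  # intersection (must), union (may)
)
print(all_distributive)  # True
```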


A Non-Distributive Example

Constant propagation

In general, analysis of what the program computes is not distributive
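
A small sketch can make the loss of precision concrete. Assume two paths reach the hypothetical statement `z = x + y`, one with x = 1, y = 2 and the other with x = 2, y = 1. The encoding of ⊤ and ⊥ as strings and the names `meet_val`, `meet_env` and `f_add` are assumptions of this example.

```python
TOP, BOT = "top", "bottom"  # top: no information yet; bottom: not a constant


def meet_val(a, b):
    """Meet in the constant propagation lattice."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else BOT


def meet_env(e1, e2):
    """Pointwise meet of two variable environments over the same variables."""
    return {v: meet_val(e1[v], e2[v]) for v in e1}


def f_add(env):
    """Transfer function for the hypothetical statement z = x + y."""
    x, y = env["x"], env["y"]
    if BOT in (x, y):
        z = BOT
    elif TOP in (x, y):
        z = TOP
    else:
        z = x + y
    return {**env, "z": z}


path1 = {"x": 1, "y": 2, "z": TOP}
path2 = {"x": 2, "y": 1, "z": TOP}

meet_after = meet_env(f_add(path1), f_add(path2))  # f(a) ⊓ f(b): per-path, then meet
after_meet = f_add(meet_env(path1, path2))         # f(a ⊓ b): what the equations compute
print(meet_after["z"], after_meet["z"])            # 3 bottom
```

Applying the transfer function per path and then taking the meet finds z = 3, while taking the meet first forgets the values of x and y, so f(a ⊓ b) is strictly below f(a) ⊓ f(b).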
Order Matters
Assume forward data flow problem
▪ Let G = (V, E) be the CFG
▪ Let k be the height of the lattice

If G acyclic, visit in topological order

▪ Visit head before tail of edge

Running time O(|E|)

▪ No matter what size the lattice
Order Matters — Cycles
If G has cycles, visit in reverse postorder
▪ Order from depth-first search

Let Q = max # back edges on cycle-free path

▪ Nesting depth
▪ Back edge is from node to ancestor on DFS tree

Then if ∀x. f(x) ≤ x (sufficient, but not necessary)

▪ Running time is O((Q + 1) |E|)
▪ Note direction of requirement depends on top vs. bottom
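
Reverse postorder and the back edges can both be read off a single depth-first search. The sketch below assumes the CFG is given as a successor map. The function name `dfs_order` and the example graph are hypothetical, and the recursive formulation is only suitable for small graphs.

```python
def dfs_order(succs, entry):
    """Return (reverse postorder, back edges) of the CFG reachable from entry."""
    visited, on_stack = set(), set()
    postorder, back_edges = [], []

    def dfs(n):
        visited.add(n)
        on_stack.add(n)
        for m in succs.get(n, []):
            if m in on_stack:
                back_edges.append((n, m))  # m is an ancestor of n in the DFS tree
            elif m not in visited:
                dfs(m)
        on_stack.remove(n)
        postorder.append(n)                # n finishes after all its descendants

    dfs(entry)
    return postorder[::-1], back_edges


# Hypothetical CFG: a diamond (a -> b, a -> c, b -> d, c -> d) inside a loop d -> a
succs = {
    "entry": ["a"],
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": ["a", "exit"],
    "exit": [],
}
rpo, back_edges = dfs_order(succs, "entry")
print(rpo)         # ['entry', 'a', 'c', 'b', 'd', 'exit']
print(back_edges)  # [('d', 'a')]
```

Ignoring the back edge, every statement appears after all of its predecessors, so a forward analysis that visits statements in this order sees up-to-date inputs everywhere except across the loop.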
Flow-Sensitivity
Data flow analysis is flow-sensitive
▪ The order of statements is taken into account
▪ i.e., we keep track of facts per program point

Alternative: Flow-insensitive analysis

▪ Analysis the same regardless of statement order
▪ Standard example: types
Terminology Review
Must vs. May
▪ (Not always followed in literature)

Forwards vs. Backwards

Flow-sensitive vs. Flow-insensitive

Distributive vs. Non-distributive
