Static Program Analysis: Part 4 - Flow Sensitive Analyses
http://cs.au.dk/~amoeller/spa/
Constant propagation optimization
var x,y,z;
x = 27;
y = input;
z = 2*x+y;
if (x<0) { y=z-3; } else { y=12; }
output y;
[Figure: the lattice of constants, with elements … -3 -2 -1 0 1 2 3 …]
Constraints for constant propagation
• Essentially as for the Sign analysis…
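A minimal Python sketch of how the example above could be evaluated abstractly, assuming a flat lattice of constants. The markers BOT and TOP and the helpers join and abs_binop are hypothetical names chosen for illustration; they do not come from the slides.

```python
import operator

# Hypothetical markers for the flat lattice of constants (assumption).
BOT = "bot"   # no information yet
TOP = "top"   # not a constant: may be any integer

def join(a, b):
    """Least upper bound of two elements of the flat lattice."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP

def abs_binop(op, a, b):
    """Abstract counterpart of a binary operator on lattice elements."""
    if a == BOT or b == BOT:
        return BOT
    if a == TOP or b == TOP:
        return TOP
    return op(a, b)

# Abstract values for the example program.
x = 27                                   # x = 27
y = TOP                                  # y = input
two_x = abs_binop(operator.mul, 2, x)    # 2*x is the constant 54
z = abs_binop(operator.add, two_x, y)    # z = 2*x+y is not a constant
cond = abs_binop(operator.lt, x, 0)      # x<0 is the constant False
# Joining both branches of the if, without pruning the dead branch:
y_after = join(abs_binop(operator.sub, z, 3), 12)
```

Because cond is a known constant, an optimizer could additionally discard the unreachable branch; the sketch only shows the plain join of both branches.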
Liveness analysis
• A variable is live at a program point if its current value
may be read in the remaining execution
A lattice for liveness
A powerset lattice of program variables
The control flow graph
[Figure: control flow graph with nodes x = input, x > 1, y = x/2, y > 3, x = x-y, z > 0, x = x/2, z = z-1, output x]
Setting up
• For every CFG node, v, we have a variable ⟦v⟧:
– the set of program variables that are live
at the program point before v
• Auxiliary definition:
JOIN(v) = ⋃_{w ∈ succ(v)} ⟦w⟧
(w1, w2, …, wk are the successors of v)
Liveness constraints
• Notation: vars(E) = the variables occurring in E
• For the exit node:
⟦exit⟧ = ∅
• For conditions and output:
⟦ if (E) ⟧ = ⟦ output E ⟧ = JOIN(v) ∪ vars(E)
• For assignments:
⟦ x = E ⟧ = (JOIN(v) \ {x}) ∪ vars(E)
• For variable declarations:
⟦ var x1, ..., xn ⟧ = JOIN(v) \ {x1, ..., xn}
• For all other nodes:
⟦v⟧ = JOIN(v)
• The right-hand sides are monotone
since JOIN is monotone, and …
Generated constraints
⟦var x,y,z⟧ = ⟦x=input⟧ \ {x,y,z}
⟦x=input⟧ = ⟦x>1⟧ \ {x}
⟦x>1⟧ = (⟦y=x/2⟧ ∪ ⟦output x⟧) ∪ {x}
⟦y=x/2⟧ = (⟦y>3⟧ \ {y}) ∪ {x}
⟦y>3⟧ = ⟦x=x-y⟧ ∪ ⟦z=x-4⟧ ∪ {y}
⟦x=x-y⟧ = (⟦z=x-4⟧ \ {x}) ∪ {x,y}
⟦z=x-4⟧ = (⟦z>0⟧ \ {z}) ∪ {x}
⟦z>0⟧ = ⟦x=x/2⟧ ∪ ⟦z=z-1⟧ ∪ {z}
⟦x=x/2⟧ = (⟦z=z-1⟧ \ {x}) ∪ {x}
⟦z=z-1⟧ = (⟦x>1⟧ \ {z}) ∪ {z}
⟦output x⟧ = ⟦exit⟧ ∪ {x}
⟦exit⟧ = ∅
Least solution
⟦entry⟧ = ∅
⟦var x,y,z⟧ = ∅
⟦x=input⟧ = ∅
⟦x>1⟧ = {x}
⟦y=x/2⟧ = {x}
⟦y>3⟧ = {x,y}
⟦x=x-y⟧ = {x,y}
⟦z=x-4⟧ = {x}
⟦z>0⟧ = {x,z}
⟦x=x/2⟧ = {x,z}
⟦z=z-1⟧ = {x,z}
⟦output x⟧ = {x}
⟦exit⟧ = ∅
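The least solution can be reproduced with a simple round-robin fixed-point iteration. The following Python sketch is one possible encoding, not the course's implementation: the CFG dictionary, its (successors, reads, writes) layout, and the function name liveness are assumptions made for illustration.

```python
# Each CFG node maps to (successors, variables read, variables written).
# This encoding of the example program is an assumption of the sketch.
CFG = {
    "var x,y,z": (["x=input"], set(), {"x", "y", "z"}),
    "x=input":   (["x>1"], set(), {"x"}),
    "x>1":       (["y=x/2", "output x"], {"x"}, set()),
    "y=x/2":     (["y>3"], {"x"}, {"y"}),
    "y>3":       (["x=x-y", "z=x-4"], {"y"}, set()),
    "x=x-y":     (["z=x-4"], {"x", "y"}, {"x"}),
    "z=x-4":     (["z>0"], {"x"}, {"z"}),
    "z>0":       (["x=x/2", "z=z-1"], {"z"}, set()),
    "x=x/2":     (["z=z-1"], {"x"}, {"x"}),
    "z=z-1":     (["x>1"], {"z"}, {"z"}),
    "output x":  (["exit"], {"x"}, set()),
    "exit":      ([], set(), set()),
}

def liveness(cfg):
    """Iterate the liveness constraints until nothing changes."""
    live = {v: set() for v in cfg}          # start at the bottom element
    changed = True
    while changed:
        changed = False
        for v, (succs, reads, writes) in cfg.items():
            # JOIN(v): union over the successors of v
            join = set().union(*(live[w] for w in succs))
            # (JOIN(v) \ writes) ∪ reads covers all constraint forms
            new = (join - writes) | reads
            if new != live[v]:
                live[v] = new
                changed = True
    return live

live = liveness(CFG)
for node, variables in live.items():
    print(node, sorted(variables))
```

In the resulting solution no set contains both y and z, which can be checked with `all(not {"y", "z"} <= s for s in live.values())`.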
Optimizations
• Variables y and z are never simultaneously live
⇒ they can share the same variable location
• The value assigned in z=z-1 is never read
⇒ the assignment can be skipped
var x,yz;
x = input;
while (x>1) {
yz = x/2;
if (yz>3) x = x-yz;
yz = x-4;
if (yz>0) x = x/2;
}
output x;
• better register allocation
• a few clock cycles saved
Time complexity
(for the naive algorithm)
• With n CFG nodes and k variables:
– the lattice Lⁿ has height k·n
– so there are at most k·n iterations
• Subsets of Vars (the variables in the program)
can be represented as bitvectors:
– each element has size k
– each ∪, \, = operation takes time O(k)
• Each iteration uses O(n) bitvector operations:
– so each iteration takes time O(k·n)
• Total time complexity: O(k²n²)
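The bitvector representation can be sketched in Python, using an integer as a bitvector of length k. The variable order in VARS and the helper names to_bits, to_names and transfer are hypothetical and only serve the illustration.

```python
VARS = ["x", "y", "z"]                         # k = 3 program variables
BIT = {name: 1 << i for i, name in enumerate(VARS)}

def to_bits(names):
    """Encode a set of variable names as a bitvector (an int)."""
    bits = 0
    for name in names:
        bits |= BIT[name]
    return bits

def to_names(bits):
    """Decode a bitvector back into a set of variable names."""
    return {name for name in VARS if bits & BIT[name]}

def transfer(join_bits, reads, writes):
    """Compute (JOIN(v) minus writes) union reads with bitwise operations."""
    return (join_bits & ~to_bits(writes)) | to_bits(reads)

# ⟦x=x-y⟧ = (⟦z=x-4⟧ \ {x}) ∪ {x,y}, with ⟦z=x-4⟧ = {x}
result = transfer(to_bits({"x"}), reads={"x", "y"}, writes={"x"})
print(to_names(result))
```

Set union becomes bitwise or, set difference becomes and-not, and equality is integer comparison, each touching k bits.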
Available expressions analysis
• A (nontrivial) expression is available at a program
point if its current value has already been computed
earlier in the execution
A lattice for available expressions
A reverse powerset lattice of nontrivial expressions
var x,y,z,a,b;
z = a+b;
y = a*b;
while (y > a+b) {
a = a+1;
x = a+b;
}
L = (P({a+b, a*b, y>a+b, a+1}), ⊇)
Reverse powerset lattice
[Figure: Hasse diagram of the reverse powerset lattice over {a+b, a*b, y>a+b, a+1}, with ∅ (the trivial answer) at the top and the full set at the bottom]
The control flow graph
[Figure: control flow graph with nodes var x,y,z,a,b; z=a+b; y=a*b; y>a+b; a=a+1; x=a+b]
Setting up
• For every CFG node, v, we have a variable ⟦v⟧:
– the set of expressions that are available
at the program point after v
Auxiliary functions
• The function X↓x removes all expressions from X
that contain a reference to the variable x
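A small Python sketch of these auxiliary functions. It assumes that expressions are encoded as nested tuples (operator, left, right), that variables are strings and constants are ints, and that exps(E) collects the nontrivial (compound) subexpressions of E. The encoding and the names variables, exps and remove are illustrative assumptions, not definitions from the slides.

```python
def variables(e):
    """Variables referenced by e; tuples are binary operations."""
    if isinstance(e, tuple):
        _, left, right = e
        return variables(left) | variables(right)
    return {e} if isinstance(e, str) else set()

def exps(e):
    """Nontrivial subexpressions of e (assumed definition)."""
    if isinstance(e, tuple):
        _, left, right = e
        return {e} | exps(left) | exps(right)
    return set()          # a variable or constant is trivial

def remove(X, x):
    """X↓x: drop every expression in X that refers to variable x."""
    return {e for e in X if x not in variables(e)}

a_plus_b = ("+", "a", "b")
a_times_b = ("*", "a", "b")
cond = (">", "y", a_plus_b)          # y > a+b

print(exps(cond))                    # contains y>a+b and a+b
print(remove({a_plus_b, cond}, "y")) # only a+b survives
```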
Availability constraints
• For the entry node:
⟦entry⟧ = ∅
• For conditions and output:
⟦ if (E) ⟧ = ⟦ output E ⟧ = JOIN(v) ∪ exps(E)
• For assignments:
⟦ x = E ⟧ = (JOIN(v) ∪ exps(E))↓x
• For any other node v:
⟦v⟧ = JOIN(v)
Generated constraints
⟦entry⟧ = ∅
⟦var x,y,z,a,b⟧ = ⟦entry⟧
⟦z=a+b⟧ = exps(a+b)↓z
⟦y=a*b⟧ = (⟦z=a+b⟧ ∪ exps(a*b))↓y
⟦y>a+b⟧ = (⟦y=a*b⟧ ∩ ⟦x=a+b⟧) ∪ exps(y>a+b)
⟦a=a+1⟧ = (⟦y>a+b⟧ ∪ exps(a+1))↓a
⟦x=a+b⟧ = (⟦a=a+1⟧ ∪ exps(a+b))↓x
⟦exit⟧ = ⟦y>a+b⟧
Least solution
⟦entry⟧ = ∅
⟦var x,y,z,a,b⟧ = ∅
⟦z=a+b⟧ = {a+b}
⟦y=a*b⟧ = {a+b, a*b}
⟦y>a+b⟧ = {a+b, y>a+b}
⟦a=a+1⟧ = ∅
⟦x=a+b⟧ = {a+b}
⟦exit⟧ = {a+b, y>a+b}
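The availability constraints can be solved by fixed-point iteration starting from the bottom of the reverse powerset lattice, which is the set of all expressions. The Python sketch below is one possible encoding under stated assumptions: expressions are plain strings, VARS_OF lists the variables of each expression by hand, and each CFG node is a hypothetical triple (predecessors, expressions evaluated, assigned variable).

```python
# Variables referenced by each nontrivial expression of the example.
VARS_OF = {
    "a+b": {"a", "b"},
    "a*b": {"a", "b"},
    "y>a+b": {"y", "a", "b"},
    "a+1": {"a"},
}
ALL = set(VARS_OF)

# node -> (predecessors, exps(E) of the node, assigned variable or None)
CFG = {
    "entry":         ([], set(), None),
    "var x,y,z,a,b": (["entry"], set(), None),
    "z=a+b":         (["var x,y,z,a,b"], {"a+b"}, "z"),
    "y=a*b":         (["z=a+b"], {"a*b"}, "y"),
    "y>a+b":         (["y=a*b", "x=a+b"], {"y>a+b", "a+b"}, None),
    "a=a+1":         (["y>a+b"], {"a+1"}, "a"),
    "x=a+b":         (["a=a+1"], {"a+b"}, "x"),
    "exit":          (["y>a+b"], set(), None),
}

def available(cfg):
    """Iterate the availability constraints until nothing changes."""
    avail = {v: set(ALL) for v in cfg}     # bottom = all expressions
    changed = True
    while changed:
        changed = False
        for v, (preds, gen, assigned) in cfg.items():
            if preds:
                # JOIN(v): intersection over the predecessors of v
                join = set.intersection(*(avail[w] for w in preds))
            else:
                join = set()               # entry node: nothing available
            new = join | gen
            if assigned is not None:       # apply the ↓x operation
                new = {e for e in new if assigned not in VARS_OF[e]}
            if new != avail[v]:
                avail[v] = new
                changed = True
    return avail

avail = available(CFG)
for node, expressions in avail.items():
    print(node, sorted(expressions))
```

Since the constraint for the exit node is ⟦exit⟧ = ⟦y>a+b⟧, the sketch assigns the exit node the same set as the loop condition.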
Optimizations
• We notice that a+b is available before the loop
• The program can be optimized (slightly):
var x,y,z,a,b,aplusb;
aplusb = a+b;
z = aplusb;
y = a*b;
while (y > aplusb) {
a = a+1;
aplusb = a+b;
x = aplusb;
}
Very busy expressions analysis
• A (nontrivial) expression is very busy if it will definitely
be evaluated before its value changes
An example program
var x,a,b;
x = input;
a = x-1;
b = x-2;
while (x > 0) {
output a*b-x;
x = x-1;
}
output a*b;
Code hoisting
Before:
var x,a,b;
x = input;
a = x-1;
b = x-2;
while (x > 0) {
output a*b-x;
x = x-1;
}
output a*b;
After hoisting a*b out of the loop:
var x,a,b,atimesb;
x = input;
a = x-1;
b = x-2;
atimesb = a*b;
while (x > 0) {
output atimesb-x;
x = x-1;
}
output atimesb;
Setting up
• For every CFG node, v, we have a variable ⟦v⟧:
– the set of expressions that are very busy
at the program point before v
• Auxiliary definition:
JOIN(v) = ⋂_{w ∈ succ(v)} ⟦w⟧
(w1, w2, …, wk are the successors of v)
Very busy constraints
• For the exit node:
⟦exit⟧ = ∅
• For conditions and output:
⟦ if (E) ⟧ = ⟦ output E ⟧ = JOIN(v) ∪ exps(E)
• For assignments:
⟦ x = E ⟧ = JOIN(v)↓x ∪ exps(E)
• For all other nodes:
⟦v⟧ = JOIN(v)
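Applied to the example program with the loop over x, these constraints can be solved by a backward fixed-point iteration. The Python sketch below is an illustration under assumptions: expressions are strings, VARS_OF is written by hand, input is treated as contributing no expressions, and the CFG triples (successors, exps(E), assigned variable) are a hypothetical encoding.

```python
# Variables referenced by each nontrivial expression of the example.
VARS_OF = {
    "x-1": {"x"},
    "x-2": {"x"},
    "x>0": {"x"},
    "a*b": {"a", "b"},
    "a*b-x": {"a", "b", "x"},
}
ALL = set(VARS_OF)

# node -> (successors, exps(E) of the node, assigned variable or None)
CFG = {
    "var x,a,b":    (["x=input"], set(), None),
    "x=input":      (["a=x-1"], set(), "x"),
    "a=x-1":        (["b=x-2"], {"x-1"}, "a"),
    "b=x-2":        (["x>0"], {"x-2"}, "b"),
    "x>0":          (["output a*b-x", "output a*b"], {"x>0"}, None),
    "output a*b-x": (["x=x-1"], {"a*b-x", "a*b"}, None),
    "x=x-1":        (["x>0"], {"x-1"}, "x"),
    "output a*b":   (["exit"], {"a*b"}, None),
    "exit":         ([], set(), None),
}

def very_busy(cfg):
    """Iterate the very busy constraints until nothing changes."""
    busy = {v: set(ALL) for v in cfg}      # bottom = all expressions
    changed = True
    while changed:
        changed = False
        for v, (succs, gen, assigned) in cfg.items():
            if succs:
                # JOIN(v): intersection over the successors of v
                new = set.intersection(*(busy[w] for w in succs))
            else:
                new = set()                # exit node
            if assigned is not None:       # JOIN(v)↓x
                new = {e for e in new if assigned not in VARS_OF[e]}
            new = new | gen                # ∪ exps(E)
            if new != busy[v]:
                busy[v] = new
                changed = True
    return busy

busy = very_busy(CFG)
print(sorted(busy["x>0"]))
```

In this sketch a*b is very busy at the loop condition, which is consistent with computing it once before the loop.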
Reaching definitions analysis
• The reaching definitions for a program point are
those assignments that may define the current
values of variables
A lattice for reaching definitions
The powerset lattice of assignments
L = (P({x=input, y=x/2, x=x-y, z=x-4, x=x/2, z=z-1}), ⊆)
var x,y,z;
x = input;
while (x > 1) {
y = x/2;
if (y>3) x = x-y;
z = x-4;
if (z>0) x = x/2;
z = z-1;
}
output x;
Reaching definitions constraints
• For assignments:
⟦ x = E ⟧ = JOIN(v)↓x ∪ { x = E }
• For all other nodes:
⟦v⟧ = JOIN(v)
• Auxiliary definition:
JOIN(v) = ⋃_{w ∈ pred(v)} ⟦w⟧
(w1, w2, …, wk are the predecessors of v)
• Here X↓x removes all assignments to the variable x from X
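For the example program, the reaching definitions constraints can be solved by a forward fixed-point iteration. The Python sketch below is an illustration only: each definition is identified by its statement text, and the tables DEFS (assigned variable of each definition) and PREDS (predecessors of each CFG node) are a hypothetical hand-written encoding.

```python
# Each assignment (definition) and the variable it assigns.
DEFS = {
    "x=input": "x",
    "y=x/2": "y",
    "x=x-y": "x",
    "z=x-4": "z",
    "x=x/2": "x",
    "z=z-1": "z",
}

# Predecessors of every CFG node of the example program.
PREDS = {
    "var x,y,z": [],
    "x=input":   ["var x,y,z"],
    "x>1":       ["x=input", "z=z-1"],
    "y=x/2":     ["x>1"],
    "y>3":       ["y=x/2"],
    "x=x-y":     ["y>3"],
    "z=x-4":     ["y>3", "x=x-y"],
    "z>0":       ["z=x-4"],
    "x=x/2":     ["z>0"],
    "z=z-1":     ["z>0", "x=x/2"],
    "output x":  ["x>1"],
}

def reaching(preds):
    """Iterate the reaching definitions constraints to a fixed point."""
    rd = {v: set() for v in preds}         # bottom = empty set
    changed = True
    while changed:
        changed = False
        for v, ps in preds.items():
            # JOIN(v): union over the predecessors of v
            new = set().union(*(rd[w] for w in ps))
            if v in DEFS:
                # remove other definitions of the same variable, add v
                new = {d for d in new if DEFS[d] != DEFS[v]} | {v}
            if new != rd[v]:
                rd[v] = new
                changed = True
    return rd

rd = reaching(PREDS)
print(sorted(rd["x>1"]))
```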
Forward vs. backward
• A forward analysis:
– computes information about the past behavior
– examples: available expressions, reaching definitions
• A backward analysis:
– computes information about the future behavior
– examples: liveness, very busy expressions
May vs. must
• A may analysis:
– describes information that is possibly true
– an over-approximation
– examples: liveness, reaching definitions
• A must analysis:
– describes information that is definitely true
– an under-approximation
– examples: available expressions, very busy expressions
Classifying analyses
        forward                   backward
may     reaching definitions      liveness
must    available expressions     very busy expressions
Initialized variables analysis
• Compute for each program point those variables
that have definitely been initialized in the past
• (Called definite assignment analysis in Java and C#)
• forward must analysis
• Reverse powerset lattice of all variables
JOIN(v) = ⋂_{w ∈ pred(v)} ⟦w⟧
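A minimal Python sketch of the initialized variables analysis. The slide text gives only the lattice and JOIN, so the constraints used here are an assumption: nothing is initialized at a node without predecessors, an assignment x = E adds x to JOIN(v), and all other nodes pass JOIN(v) through. The small program encoded in CFG (declare x and y, assign both in one branch and only x in the other, then output x) is hypothetical.

```python
ALL_VARS = {"x", "y"}

# node -> (predecessors, assigned variable or None); hypothetical program:
#   var x,y; if (input) { x = 1; y = 2; } else { x = 3; } output x;
CFG = {
    "var x,y":  ([], None),
    "input":    (["var x,y"], None),       # the branch condition
    "x=1":      (["input"], "x"),
    "y=2":      (["x=1"], "y"),
    "x=3":      (["input"], "x"),
    "output x": (["y=2", "x=3"], None),
}

def initialized(cfg):
    """Forward must analysis on the reverse powerset lattice of variables."""
    init = {v: set(ALL_VARS) for v in cfg}  # bottom = all variables
    changed = True
    while changed:
        changed = False
        for v, (preds, assigned) in cfg.items():
            if preds:
                # JOIN(v): intersection over the predecessors of v
                new = set.intersection(*(init[w] for w in preds))
            else:
                new = set()                 # nothing initialized at the start
            if assigned is not None:
                new = new | {assigned}      # assumed assignment constraint
            if new != init[v]:
                init[v] = new
                changed = True
    return init

init = initialized(CFG)
print(sorted(init["output x"]))
```

Under these assumptions only x is definitely initialized after output x, because y is assigned on just one of the two branches.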