03 Data Flow Analysis Handout

The document outlines an upcoming lecture on intraprocedural data flow analysis. It will cover static analysis techniques to derive information about how data flows through a program during execution. The lecture will define basic concepts like basic blocks, control flow graphs, and data flow frameworks. It will also discuss how to model the flow of data through transfer functions and solve the resulting data flow equations to compute the values at each program point.

CS738: Advanced Compiler Optimizations

Data Flow Analysis

Amey Karkare
[email protected]
http://www.cse.iitk.ac.in/~karkare/cs738
Department of CSE, IIT Kanpur

Agenda

◮ Static analysis and compile-time optimizations
◮ For the next few lectures
  ◮ Intraprocedural Data Flow Analysis
  ◮ Classical Examples
  ◮ Components

Assumptions

◮ Intraprocedural: Restricted to a single function
◮ Input in 3-address format
◮ Unless otherwise specified

3-address Code Format

◮ Assignments
    x = y op z
    x = op y
    x = y
◮ Jump/control transfer
    goto L
    if x relop y goto L
◮ Statements can have label(s)
    L: . . .
◮ Arrays, Pointers and Functions to be added later when needed

Data Flow Analysis

◮ Class of techniques to derive information about flow of data
  ◮ along program execution paths
◮ Used to answer questions such as:
  ◮ whether two identical expressions evaluate to same value
    ◮ used in common subexpression elimination
  ◮ whether the result of an assignment is used later
    ◮ used by dead code elimination

Data Flow Abstraction

◮ Basic Blocks (BB)
  ◮ sequence of 3-address code stmts
  ◮ single entry at the first statement
  ◮ single exit at the last statement
◮ Typically we use “maximal” basic block (maximal sequence of such instructions)

Identifying Basic Blocks

◮ Leader: The first statement of a basic block
  ◮ The first instruction of the program (procedure)
  ◮ Target of a branch (conditional and unconditional goto)
  ◮ Instruction immediately following a branch

Special Basic Blocks

◮ Two special BBs are added to simplify the analysis
  ◮ empty (?) blocks!
◮ Entry: The first block to be executed for the procedure analyzed
◮ Exit: The last block to be executed

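
The three leader rules can be turned into a small partitioning routine. The following is a minimal Python sketch, not part of the original handout: the helper names `find_leaders` and `basic_blocks`, the string encoding of 3-address statements, and the sample program are all assumptions made for illustration.

```python
import re

def find_leaders(code):
    """Return the sorted indices of leader statements in a list of 3-address strings."""
    # Map each label (the "L:" prefix of a statement) to its statement index.
    labels = {}
    for i, stmt in enumerate(code):
        m = re.match(r"\s*(\w+):", stmt)
        if m:
            labels[m.group(1)] = i

    leaders = {0} if code else set()          # rule 1: first instruction
    for i, stmt in enumerate(code):
        m = re.search(r"\bgoto\s+(\w+)", stmt)
        if m:
            leaders.add(labels[m.group(1)])   # rule 2: target of a branch
            if i + 1 < len(code):
                leaders.add(i + 1)            # rule 3: instruction after a branch
    return sorted(leaders)

def basic_blocks(code):
    """Split code into maximal basic blocks, one block per leader."""
    leaders = find_leaders(code)
    bounds = leaders + [len(code)]
    return [code[bounds[k]:bounds[k + 1]] for k in range(len(leaders))]

# Hypothetical sample program: sums 0..n-1 into s.
code = [
    "i = 0",                  # 0: leader (first instruction)
    "L1: if i >= n goto L2",  # 1: leader (target of 'goto L1')
    "s = s + i",              # 2: leader (follows a branch)
    "i = i + 1",              # 3
    "goto L1",                # 4
    "L2: r = s",              # 5: leader (branch target, and follows a branch)
]
blocks = basic_blocks(code)
```

Each block runs from one leader up to, but not including, the next leader, which is what makes the blocks maximal.
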
Data Flow Abstraction

◮ Control Flow Graph (CFG)
  ◮ A rooted directed graph G = (N, E)
  ◮ N = set of BBs
    ◮ including Entry, Exit
  ◮ E = set of edges

CFG Edges

◮ Edge B1 → B2 ∈ E if control can transfer from B1 to B2
  ◮ Fall through
  ◮ Through jump (goto)
◮ Edge from Entry to (all?) real first BB(s)
◮ Edge to Exit from all last BBs
  ◮ BBs containing return
  ◮ Last real BB
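
The edge rules above can be sketched as a function that takes already-formed basic blocks and returns the edge set E. This is an illustrative Python sketch under stated assumptions: `build_cfg` is a hypothetical name, blocks are lists of 3-address strings, a label only matters when it starts a block, and a single Entry edge to the first block is assumed.

```python
import re

def build_cfg(blocks):
    """Return the set of CFG edges over 'Entry', 'Exit' and block indices."""
    # Record which block begins with which label.
    label_block = {}
    for b, stmts in enumerate(blocks):
        m = re.match(r"\s*(\w+):", stmts[0])
        if m:
            label_block[m.group(1)] = b

    edges = set()
    if blocks:
        edges.add(("Entry", 0))                  # Entry -> first real block
    for b, stmts in enumerate(blocks):
        last = stmts[-1]
        jump = re.search(r"\bgoto\s+(\w+)", last)
        is_return = re.match(r"\s*(\w+:\s*)?return\b", last) is not None
        # A goto without an 'if' never falls through.
        unconditional = jump is not None and re.search(r"\bif\b", last) is None
        if jump:
            edges.add((b, label_block[jump.group(1)]))   # through jump
        if is_return or (b == len(blocks) - 1 and not unconditional):
            edges.add((b, "Exit"))               # return, or last real block
        elif not unconditional:
            edges.add((b, b + 1))                # fall through
    return edges

# Hypothetical blocks for a small summation loop.
blocks = [
    ["i = 0"],
    ["L1: if i >= n goto L2"],
    ["s = s + i", "i = i + 1", "goto L1"],
    ["L2: r = s"],
]
edges = build_cfg(blocks)
```

PRED(B) and SUCC(B) can then be read off the edge set by collecting, for each block, the sources and targets of its edges.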

Data Flow Abstraction: Control Flow Graph

◮ Graph representation of paths that program may exercise during execution
◮ Typically one graph per procedure
◮ Graphs for separate procedures have to be combined/connected for interprocedural analysis
  ◮ Later!
  ◮ Single procedure, single flow graph for now.

Data Flow Abstraction: Program Points

◮ Input state/Output state for Stmt
  ◮ Program point before/after a stmt
  ◮ Denoted IN[s] and OUT[s]
  ◮ Within a basic block:
    ◮ Program point after a stmt is same as the program point before the next stmt

Data Flow Abstraction: Program Points

◮ Input state/Output state for BBs
  ◮ Program point before/after a BB
  ◮ Denoted IN[B] and OUT[B]
  ◮ For B1 and B2:
    ◮ if there is an edge from B1 to B2 in CFG, then the program point after the last stmt of B1 may be followed immediately by the program point before the first stmt of B2.

Data Flow Abstraction: Execution Paths

◮ An execution path is of the form
    p_1, p_2, p_3, . . . , p_n
  where p_i → p_{i+1} are adjacent program points in the CFG.
◮ Infinite number of possible execution paths in practical programs.
◮ Paths having no finite upper bound on the length.
◮ Need to summarize the information at a program point with a finite set of facts.

Data Flow Schema

◮ Data flow values associated with each program point
  ◮ Summarize all possible states at that point
◮ Domain: set of all possible data flow values
  ◮ Different domains for different analyses/optimizations

Data Flow Problem

◮ Constraints on data flow values
  ◮ Transfer constraints
  ◮ Control flow constraints
◮ Aim: To find a solution to the constraints
  ◮ Multiple solutions possible
  ◮ Trivial solutions, . . . , Exact solutions
◮ We typically compute approximate solution
  ◮ Close to the exact solution (as close as possible!)
  ◮ Why not exact solution?

Data Flow Constraints: Transfer Constraints

◮ Transfer functions
  ◮ relationship between the data flow values before and after a stmt
◮ forward functions: Compute facts after a statement s from the facts available before s.
  ◮ General form:
      OUT[s] = f_s(IN[s])
◮ backward functions: Compute facts before a statement s from the facts available after s.
  ◮ General form:
      IN[s] = f_s(OUT[s])
◮ f_s depends on the statement and the analysis

Data Flow Constraints: Control Flow Constraints

◮ Relationship between the data flow values of two points that are related by program execution semantics
◮ For a basic block having n statements:
      IN[s_{i+1}] = OUT[s_i], i = 1, 2, . . . , n − 1
◮ IN[s_1], OUT[s_n] to come later

Data Flow Constraints: Notations

◮ PRED(B): Set of predecessor BBs of block B in CFG
◮ SUCC(B): Set of successor BBs of block B in CFG
◮ f ∘ g: Composition of functions f and g
◮ ⨁: An abstract operator denoting some way of combining facts present in a set.

Data Flow Constraints: Basic Blocks

◮ Forward
  ◮ For B consisting of s_1, s_2, . . . , s_n
      f_B = f_{s_n} ∘ . . . ∘ f_{s_2} ∘ f_{s_1}
      OUT[B] = f_B(IN[B])
  ◮ Control flow constraints
      IN[B] = ⨁_{P ∈ PRED(B)} OUT[P]
◮ Backward
      f_B = f_{s_1} ∘ f_{s_2} ∘ . . . ∘ f_{s_n}
      IN[B] = f_B(OUT[B])
      OUT[B] = ⨁_{S ∈ SUCC(B)} IN[S]

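
The order of composition is the only difference between the forward and backward block functions, which a few lines of Python make concrete. This is a sketch under assumptions: the helper names are hypothetical, facts are modelled as frozensets, and each statement function has the gen/kill shape f(x) = (x − kill) ∪ gen, which is one common choice rather than the general case.

```python
from functools import reduce

def make_gen_kill(gen, kill):
    """Statement transfer function f(x) = (x - kill) | gen over frozensets."""
    return lambda x: (x - kill) | gen

def compose_forward(stmt_fns):
    """f_B = f_sn o ... o f_s1: apply the statement functions first to last."""
    return lambda x: reduce(lambda acc, f: f(acc), stmt_fns, x)

def compose_backward(stmt_fns):
    """f_B = f_s1 o ... o f_sn: apply the statement functions last to first."""
    return lambda x: reduce(lambda acc, f: f(acc), reversed(stmt_fns), x)

def combine(facts, op, identity):
    """The abstract combining operator: fold a binary operator over a set of facts."""
    return reduce(op, facts, identity)

# Two hypothetical statements that kill each other's fact.
f1 = make_gen_kill(frozenset({"d1"}), frozenset({"d2"}))
f2 = make_gen_kill(frozenset({"d2"}), frozenset({"d1"}))

forward_out = compose_forward([f1, f2])(frozenset())    # f2(f1(x))
backward_in = compose_backward([f1, f2])(frozenset())   # f1(f2(x))
merged = combine([forward_out, backward_in], frozenset.union, frozenset())
```

Passing the operator and its identity to `combine` keeps the sketch usable for analyses that merge with intersection instead of union.
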
Data Flow Equations

◮ Typical Equation
      OUT[s] = (IN[s] − kill[s]) ∪ gen[s]
  gen[s]: information generated
  kill[s]: information killed
◮ Example:
      a = b*c // generates expression b*c
      c = 5   // kills expression b*c
      d = b*c // is b*c redundant here?

Example Data Flow Analysis

◮ Reaching Definitions Analysis
◮ Definition of a variable x: x = . . . something . . .
◮ Could be more complex (e.g. through pointers, references, implicit)
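
The three-statement example can be traced mechanically with the typical equation. The sketch below is an illustration only: it assumes the facts are available expressions, encodes each statement as a hypothetical `(lhs, expr, operands)` tuple, and uses a helper named `step` that is not from the handout.

```python
def step(available, stmt):
    """Apply OUT[s] = (IN[s] - kill[s]) | gen[s] for available expressions.

    stmt is (lhs, expr, operands); expr is None when the right-hand side
    is not an expression worth tracking (for example a constant).
    """
    lhs, expr, operands = stmt
    # kill[s]: every available expression that reads the variable being assigned
    kill = {(e, ops) for (e, ops) in available if lhs in ops}
    # gen[s]: the computed expression, unless s overwrites one of its own operands
    gen = set()
    if expr is not None and lhs not in operands:
        gen = {(expr, operands)}
    return (frozenset(available) - kill) | gen

program = [
    ("a", "b*c", ("b", "c")),   # a = b*c  generates b*c
    ("c", None, ()),            # c = 5    kills b*c
    ("d", "b*c", ("b", "c")),   # d = b*c  is b*c redundant here?
]

facts = frozenset()
before = []                     # facts holding just before each statement
for stmt in program:
    before.append(facts)
    facts = step(facts, stmt)

# b*c is redundant at the third statement only if it is available just before it.
redundant = ("b*c", ("b", "c")) in before[2]
```

Under this encoding `redundant` is false: the assignment to c removes b*c from the available set, so the third statement has to recompute it.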

Reaching Definitions Analysis

◮ A definition d reaches a point p if
  ◮ there is a path from the point immediately following d to p
  ◮ d is not “killed” along that path
  ◮ “Kill” means redefinition of the left hand side (x in the earlier example)

RD Analysis of a Structured Program

[Figure: statement s1 is "d: x = y + z", with IN(s1) before it and OUT(s1) after it]

      OUT(s1) = (IN(s1) − KILL(s1)) ∪ GEN(s1)
      GEN(s1) = {d}
      KILL(s1) = D_x − {d}, where D_x: set of all definitions of x
      KILL(s1) = D_x? will also work here
                 but may not work in general

RD Analysis of a Structured Program

[Figure: S is the sequence s1 followed by s2, with IN(S) before s1 and OUT(S) after s2]

      GEN(S) = (GEN(s1) − KILL(s2)) ∪ GEN(s2)
      KILL(S) = (KILL(s1) − GEN(s2)) ∪ KILL(s2)
      IN(s1) = IN(S)
      IN(s2) = OUT(s1)
      OUT(S) = OUT(s2)

RD Analysis of a Structured Program

[Figure: S branches to either s1 or s2, with IN(S) before the branch and OUT(S) after the join]

      GEN(S) = GEN(s1) ∪ GEN(s2)
      KILL(S) = KILL(s1) ∩ KILL(s2)
      IN(s1) = IN(s2) = IN(S)
      OUT(S) = OUT(s1) ∪ OUT(s2)

RD Analysis of a Structured Program

[Figure: S is a loop whose body is s1, with IN(S) before the loop and OUT(S) after it]

      GEN(S) = GEN(s1)
      KILL(S) = KILL(s1)
      OUT(S) = OUT(s1)
      IN(s1) = IN(S) ∪ GEN(s1)

RD Analysis is Approximate

[Figure: S branches to either s1 or s2, with IN(S) before the branch and OUT(S) after the join]

◮ Assumption: All paths are feasible.
◮ Example:
      if (true) s1;
      else s2;

      Fact      Computed                  Actual
      GEN(S)  = GEN(s1) ∪ GEN(s2)     ⊇  GEN(s1)
      KILL(S) = KILL(s1) ∩ KILL(s2)   ⊆  KILL(s1)

RD Analysis is Approximate

[Figure: S branches to either s1 or s2, with IN(S) before the branch and OUT(S) after the join]

◮ Thus,
      true GEN(S) ⊆ analysis GEN(S)
      true KILL(S) ⊇ analysis KILL(S)
◮ More definitions computed to be reaching than actually do!
◮ Later we shall see that this is SAFE approximation
  ◮ prevents optimizations
  ◮ but NO wrong optimization

RD at BB level

◮ A definition d can reach the start of a block from any of its predecessors
  ◮ if it reaches the end of some predecessor
      IN(B) = ⋃_{P ∈ PRED(B)} OUT(P)
◮ A definition d reaches the end of a block if
  ◮ either it is generated in the block
  ◮ or it reaches the block and is not killed
      OUT(B) = (IN(B) − KILL(B)) ∪ GEN(B)

Solving RD Constraints

◮ KILL & GEN known for each BB.
◮ A program with N BBs has 2N equations with 2N unknowns.
  ◮ Solution is possible.
  ◮ Iterative approach (on the next slide).

Solving RD Constraints

    for each block B {
        OUT(B) = ∅;
    }
    OUT(Entry) = ∅; // note this for later discussion
    change = true;
    while (change) {
        change = false;
        for each block B other than Entry {
            IN(B) = ⋃_{P ∈ PRED(B)} OUT(P);
            oldOut = OUT(B);
            OUT(B) = (IN(B) − KILL(B)) ∪ GEN(B);
            if (OUT(B) ≠ oldOut) then {
                change = true;
            }
        }
    }

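
The iterative algorithm translates almost line for line into Python. The sketch below is not the handout's code: `reaching_definitions` is a hypothetical name, the GEN and KILL sets are taken from the example table in this handout, and the predecessor map is an assumption inferred from that table's IN/OUT values because the CFG figure did not survive extraction.

```python
def reaching_definitions(blocks, pred, gen, kill):
    """Round-robin iteration; returns (IN, OUT, number of passes)."""
    out = {b: frozenset() for b in blocks}
    in_ = {b: frozenset() for b in blocks}
    out["Entry"] = frozenset()              # OUT(Entry) = empty set
    passes = 0
    change = True
    while change:
        change = False
        passes += 1
        for b in blocks:                    # every block other than Entry
            # IN(B) is the union of OUT(P) over all predecessors P
            in_[b] = frozenset().union(*(out[p] for p in pred[b]))
            new_out = (in_[b] - kill[b]) | gen[b]
            if new_out != out[b]:
                out[b] = new_out
                change = True
    return in_, out, passes

def defs(*nums):
    """Shorthand: defs(1, 2) -> frozenset({'d1', 'd2'})."""
    return frozenset(f"d{n}" for n in nums)

blocks = ["B1", "B2", "B3", "B4"]
# Assumed CFG (inferred from the table, not given in the text):
pred = {"B1": ["Entry"], "B2": ["B1", "B4"], "B3": ["B2"], "B4": ["B2", "B3"]}
gen = {"B1": defs(1, 2, 3), "B2": defs(4, 5), "B3": defs(6), "B4": defs(7)}
kill = {"B1": defs(4, 5, 6, 7), "B2": defs(1, 2, 7), "B3": defs(3), "B4": defs(1, 4)}

IN, OUT, passes = reaching_definitions(blocks, pred, gen, kill)
```

With this block order the loop runs three passes, the last of which changes nothing, which is the stopping condition `change == false`.
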
Reaching Definitions: Example

    BB   GEN            KILL
    B1   {d1, d2, d3}   {d4, d5, d6, d7}
    B2   {d4, d5}       {d1, d2, d7}
    B3   {d6}           {d3}
    B4   {d7}           {d1, d4}

Reaching Definitions: Example

    Pass#  Pt   B1          B2                      B3              B4
    Init   IN   -           -                       -               -
           OUT  ∅           ∅                       ∅               ∅
    1      IN   ∅           d1, d2, d3              d3, d4, d5      d3, d4, d5, d6
           OUT  d1, d2, d3  d3, d4, d5              d4, d5, d6      d3, d5, d6, d7
    2      IN   ∅           d1, d2, d3, d5, d6, d7  d3, d4, d5, d6  d3, d4, d5, d6
           OUT  d1, d2, d3  d3, d4, d5, d6          d4, d5, d6      d3, d5, d6, d7
    3      IN   ∅           d1, d2, d3, d5, d6, d7  d3, d4, d5, d6  d3, d4, d5, d6
           OUT  d1, d2, d3  d3, d4, d5, d6          d4, d5, d6      d3, d5, d6, d7

Reaching Definitions: Bitvectors

a bit for each definition:
    d1 d2 d3 d4 d5 d6 d7

    Pass#  Pt   B1       B2       B3       B4
    Init   IN   -        -        -        -
           OUT  0000000  0000000  0000000  0000000
    1      IN   0000000  1110000  0011100  0011110
           OUT  1110000  0011100  0001110  0010111
    2      IN   0000000  1110111  0011110  0011110
           OUT  1110000  0011110  0001110  0010111
    3      IN   0000000  1110111  0011110  0011110
           OUT  1110000  0011110  0001110  0010111

Reaching Definitions: Bitvectors

◮ Set-theoretic definitions:
      IN(B) = ⋃_{P ∈ PRED(B)} OUT(P)
      OUT(B) = (IN(B) − KILL(B)) ∪ GEN(B)
◮ Bitvector definitions:
      IN(B) = ⋁_{P ∈ PRED(B)} OUT(P)
      OUT(B) = (IN(B) ∧ ¬KILL(B)) ∨ GEN(B)
◮ Bitwise ∨, ∧, ¬ operators

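
The bitvector equations map directly onto integer operations. The sketch below uses plain Python integers as 7-bit vectors with d1 as the leftmost bit; the helper names `bv`, `show`, `meet` and `transfer` are hypothetical, and the mask is needed only because Python integers are unbounded, so `~` alone would set bits beyond the seven in use.

```python
WIDTH = 7                      # one bit per definition d1..d7
MASK = (1 << WIDTH) - 1        # keeps results within WIDTH bits

def bv(text):
    """Parse a bit string such as '1110000' (d1 is the leftmost bit)."""
    return int(text, 2)

def show(value):
    """Format a bitvector back into a WIDTH-character bit string."""
    return format(value, f"0{WIDTH}b")

def meet(outs):
    """IN(B): bitwise OR of OUT(P) over all predecessors P."""
    acc = 0
    for o in outs:
        acc |= o
    return acc

def transfer(in_bits, gen, kill):
    """OUT(B) = (IN(B) AND NOT KILL(B)) OR GEN(B)."""
    return (in_bits & ~kill & MASK) | gen

# Pass 2 for block B2 of the example, using values from the tables:
# OUT(B1) = 1110000 and OUT(B4) = 0010111 after pass 1,
# GEN(B2) = {d4, d5} = 0001100, KILL(B2) = {d1, d2, d7} = 1100001.
in_b2 = meet([bv("1110000"), bv("0010111")])
out_b2 = transfer(in_b2, bv("0001100"), bv("1100001"))
```

Here `show(in_b2)` and `show(out_b2)` reproduce the pass 2 entries for B2 in the bitvector table.
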
Reaching Definitions: Application

Constant Folding

    while changes occur {
        forall the stmts S of the program {
            foreach operand B of S {
                if there is a unique definition of B
                   that reaches S and is a constant C {
                    replace B by C in S;
                    if all operands of S are constant {
                        replace rhs by eval(rhs);
                        mark definition as constant;
                    }
                }
            }
        }
    }

Reaching Definitions: Application

◮ Recall the approximation in reaching definition analysis
      true GEN(S) ⊆ analysis GEN(S)
      true KILL(S) ⊇ analysis KILL(S)
◮ Can it cause the application to infer
  ◮ an expression as a constant when it has different values for different executions?
  ◮ an expression as not a constant when it is a constant for all executions?
◮ Safety? Profitability?

Reaching Definitions: Summary

◮ GEN(B) = {d_x | d_x in B defines variable x and is not followed by another definition of x in B}
◮ KILL(B) = {d_x | B contains some definition of x}
◮ IN(B) = ⋃_{P ∈ PRED(B)} OUT(P)
◮ OUT(B) = (IN(B) − KILL(B)) ∪ GEN(B)
◮ meet (⋀) operator: The operator to combine information coming along different predecessors is ∪
◮ What about the Entry block?

Reaching Definitions: Summary

◮ Entry block has to be initialized specially:
      OUT(Entry) = EntryInfo
      EntryInfo = ∅
◮ A better entry info could be:
      EntryInfo = {x = undefined | x is a variable}
◮ Why?
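
GEN(B) and KILL(B) can be computed in one scan over a block's definitions. The Python sketch below is illustrative and rests on two labelled assumptions: the variable defined by each of d1..d7 is hypothetical (the handout's CFG figure is not available), and KILL(B) is computed with the block's own GEN removed, following the statement-level form KILL(s1) = D_x − {d} rather than the unrestricted set shown in the summary.

```python
def gen_kill(block, all_defs):
    """Compute (GEN, KILL) for one block.

    block: list of (definition id, variable) pairs in program order.
    all_defs: dict mapping every definition id in the procedure to its variable.
    """
    last_def = {}
    for d, x in block:
        last_def[x] = d                 # a later definition of x replaces an earlier one
    gen = frozenset(last_def.values())  # definitions not followed by a redefinition in B
    defined = set(last_def)             # variables that B defines
    kill = frozenset(d for d, x in all_defs.items() if x in defined) - gen
    return gen, kill

# Hypothetical assignment of variables to the seven definitions.
all_defs = {"d1": "i", "d2": "j", "d3": "a", "d4": "i",
            "d5": "j", "d6": "a", "d7": "i"}
block_defs = {
    "B1": [("d1", "i"), ("d2", "j"), ("d3", "a")],
    "B2": [("d4", "i"), ("d5", "j")],
    "B3": [("d6", "a")],
    "B4": [("d7", "i")],
}
summary = {b: gen_kill(ds, all_defs) for b, ds in block_defs.items()}
```

Under these assumptions the computed sets match the GEN and KILL columns of the example table, for instance KILL(B2) = {d1, d2, d7}.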
