Week-4 (Lecture-2)
Reverse Engineering
• Reverse Engineering supports understanding of a system
through identification of the components or artifacts of the
system, discovering relationships between them and
generating abstractions of that information.
1. Data Gathering
2. Knowledge Organization
3. Information Exploration
Reverse Engineering Activities
1. Data Gathering
• Reverse engineering:
• Syntactic analysis is essential for reconstructing a high-level representation of
the code. This representation makes it easier to understand and modify the
code.
• By analyzing the AST or parse tree, reverse engineers can identify control
flow structures, data structures, and relationships between different parts of
the code.
2. Graphing methods
• There are a variety of graphing approaches for program understanding.
• These include, in increasing order of complexity and richness:
• graphing the control flow of the program,
• the data flow of the program,
• and program dependence graphs.
• Static analysis of a program is the analysis of its code without regard to
its execution or input.
• Control flow analysis: which pieces of the code would be executed, and in what sequence.
• Data flow analysis: how information flows within a program and across programs.
Control Flow – Introduction
• Control Flow
• Used to identify the possible paths through the program
• The flow is represented as a directed graph with splits and joins
• Identify loops
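One common way to identify loops in such a directed graph is to search it depth-first and look for back edges: an edge that points to a node still on the search stack closes a cycle. The following is a minimal sketch under stated assumptions, not a definitive implementation. The adjacency-matrix type `Cfg` and the function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 16

/* Hypothetical CFG representation: adjacency matrix over node ids 0..n-1. */
typedef struct {
    int n;
    bool edge[MAX_NODES][MAX_NODES];
} Cfg;

enum { UNVISITED, ON_STACK, DONE };

/* Depth-first search from u; counts edges that lead back to a node on the stack. */
static int dfs_count_back_edges(const Cfg *g, int u, int state[]) {
    int count = 0;
    state[u] = ON_STACK;
    for (int v = 0; v < g->n; v++) {
        if (!g->edge[u][v]) continue;
        if (state[v] == ON_STACK) {
            count++;                      /* back edge u -> v closes a cycle */
        } else if (state[v] == UNVISITED) {
            count += dfs_count_back_edges(g, v, state);
        }
    }
    state[u] = DONE;
    return count;
}

/* Returns the number of back edges reachable from the entry node. */
int count_back_edges(const Cfg *g, int entry) {
    assert(g->n <= MAX_NODES && entry >= 0 && entry < g->n);
    int state[MAX_NODES] = {0};           /* every node starts UNVISITED */
    return dfs_count_back_edges(g, entry, state);
}
```

For a while loop modelled as entry 0, loop header 1, loop body 2 and exit 3 (edges 0->1, 1->2, 2->1, 1->3), the only back edge is 2->1, which corresponds to the loop.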
Intraprocedural analysis:
• The idea of basic blocks is central to constructing a CFG.
• A basic block is a maximal sequence of program statements such that execution
enters at the top of the block and leaves only at the bottom via a conditional or an
unconditional branch statement.
• A basic block is represented with one node in the CFG, and an arc indicates possible
flow of control from one node to another.
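Basic blocks are often found by first marking "leaders", the instructions that start a block: the first instruction, every branch target, and every instruction that follows a branch. The sketch below assumes a hypothetical linear instruction list (`Instr`, `OpKind` and `find_leaders` are illustrative names, not part of any real tool) and shows only the leader-marking step.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_INSTRS 64

/* Hypothetical instruction model: only the branching behaviour matters here. */
typedef enum { OP_PLAIN, OP_BRANCH, OP_COND_BRANCH } OpKind;

typedef struct {
    OpKind kind;
    int target;   /* index of the branch target; ignored for OP_PLAIN */
} Instr;

/* Sets leader[i] for each instruction that starts a basic block.
   Returns the number of basic blocks. */
int find_leaders(const Instr code[], int n, bool leader[]) {
    assert(n >= 0 && n <= MAX_INSTRS);
    for (int i = 0; i < n; i++) leader[i] = false;
    if (n > 0) leader[0] = true;                 /* the first instruction */
    for (int i = 0; i < n; i++) {
        if (code[i].kind == OP_PLAIN) continue;
        assert(code[i].target >= 0 && code[i].target < n);
        leader[code[i].target] = true;           /* a branch target */
        if (i + 1 < n) leader[i + 1] = true;     /* the instruction after a branch */
    }
    int blocks = 0;
    for (int i = 0; i < n; i++) {
        if (leader[i]) blocks++;
    }
    return blocks;
}
```

Each block then runs from a leader up to, but not including, the next leader, and becomes one node of the CFG.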
Control Flow Analysis
Interprocedural analysis:
• Interprocedural analysis is performed by constructing a call graph.
• Call graphs can be static or dynamic. A dynamic call graph is an execution trace of
the program.
• Thus, a dynamic call graph is exact, but it only describes one run of the program.
• On the other hand, a static call graph represents every possible run of the program.
• An approach that avoids the burden of annotations, and can capture
what a procedure actually does as used in a particular program, is
building a control flow graph for the entire program, rather than just
one procedure.
• We add additional edges to the control flow graph. For every call to function g,
we add an edge from the call site to the first instruction of g, and from every
return statement of g to the instruction following that call.
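The edge-adding rule above can be sketched as follows. This is an illustrative sketch only: the `EdgeList` and `Procedure` types and the `link_call` function are hypothetical, instruction ids are assumed to be unique across the whole program, and the instruction following a call is assumed to have id `call_site + 1`.

```c
#include <assert.h>

#define MAX_EDGES 64

typedef struct { int from, to; } Edge;

/* Hypothetical whole-program CFG stored as a flat edge list. */
typedef struct {
    Edge edges[MAX_EDGES];
    int count;
} EdgeList;

static void add_edge(EdgeList *cfg, int from, int to) {
    assert(cfg->count < MAX_EDGES);
    cfg->edges[cfg->count].from = from;
    cfg->edges[cfg->count].to = to;
    cfg->count++;
}

/* Hypothetical summary of a callee: its first instruction and its return statements. */
typedef struct {
    int entry;
    const int *returns;
    int n_returns;
} Procedure;

/* Links one call site to its callee:
   call site -> first instruction of the callee,
   every return of the callee -> instruction following the call. */
void link_call(EdgeList *cfg, int call_site, const Procedure *callee) {
    add_edge(cfg, call_site, callee->entry);
    for (int i = 0; i < callee->n_returns; i++) {
        add_edge(cfg, callee->returns[i], call_site + 1);
    }
}
```

Calling `link_call` once per call site yields the control flow graph for the entire program. Note that a function called from several sites gets return edges to all of them, so the graph also contains paths that enter from one call site and return to another.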
Control Flow Graph
Flow Graphs of various blocks
Control Flow – Example
Control Flow – Code View
• Another example of visualizing the control flow of a program is using a Control
Structure Diagram (CSD).
• It automatically documents the program flow within the source code by adding
indentation with graphical symbols.
• Sequence
• Selection
• Iteration
• Exception Handling
CSD Control Constructs
Data Flow Graph: Data Analysis
• All control edges together form a graph called the Control Flow Graph
(CFG).
• All data edges together form a graph called the Data Flow Graph (DFG).
• A DFD shows what kind of information will be
                              // node  a  b  r   (R = read, W = write)
int max(int a, int b) {       //  1    W  W
    int r;
    if (a > b)                //  2    R  R
        r = a;                //  3    R     W
    else
        r = b;                //  4       R  W
    return r;                 //  5          R
}
Example
• Next, we draw each data dependency.
• A data dependency goes from a node that writes into a variable to
another node that reads from the variable.
• To have a valid dependency, we must identify the correct ‘write’ node
for each ‘read’ node. That is done as follows.
• Start with a node that reads from a variable. For example, node 3 in the
example reads variable a. That read operation is the endpoint of a data
dependency.
• Next, walk backward in the control flow graph until you find a node that
writes the same variable. That is the starting point of a data dependency. For
example, going backward from node 3, we visit node 2, and then node 1. Only
node 1 writes a. Therefore, the data dependency for a goes from 1 to 3.
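The backward walk described above can be sketched for the max example. This is a minimal sketch, not a definitive implementation: the tables hard-code the five numbered nodes, their control flow edges (1->2, 2->3, 2->4, 3->5, 4->5) and their reads and writes, and the names `DataFlow`, `walk_back`, `build_data_flow` and `count_deps` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { VAR_A, VAR_B, VAR_R, N_VARS };
enum { N_NODES = 6 };   /* nodes are numbered 1..5; index 0 is unused */

/* pred[n][p] is true when p is a control flow predecessor of n. */
static const bool pred[N_NODES][N_NODES] = {
    [2] = { [1] = true },
    [3] = { [2] = true },
    [4] = { [2] = true },
    [5] = { [3] = true, [4] = true },
};

/* Read and write sets of each node, taken from the annotated max example. */
static const bool reads[N_NODES][N_VARS] = {
    [2] = { [VAR_A] = true, [VAR_B] = true },
    [3] = { [VAR_A] = true },
    [4] = { [VAR_B] = true },
    [5] = { [VAR_R] = true },
};
static const bool writes[N_NODES][N_VARS] = {
    [1] = { [VAR_A] = true, [VAR_B] = true },
    [3] = { [VAR_R] = true },
    [4] = { [VAR_R] = true },
};

/* dep[v][w][r] is true when node w writes variable v and node r reads that value. */
typedef struct { bool dep[N_VARS][N_NODES][N_NODES]; } DataFlow;

/* Walks backward from node; on each path, stops at the first node that writes var. */
static void walk_back(int node, int var, int reader, bool seen[], DataFlow *out) {
    for (int p = 1; p < N_NODES; p++) {
        if (!pred[node][p] || seen[p]) continue;
        seen[p] = true;                       /* guards against cycles in the CFG */
        if (writes[p][var]) {
            out->dep[var][p][reader] = true;  /* start of the data dependency */
        } else {
            walk_back(p, var, reader, seen, out);
        }
    }
}

void build_data_flow(DataFlow *out) {
    assert(out != NULL);
    memset(out, 0, sizeof *out);
    for (int r = 1; r < N_NODES; r++) {
        for (int v = 0; v < N_VARS; v++) {
            if (!reads[r][v]) continue;       /* only reads end a dependency */
            bool seen[N_NODES] = { false };
            walk_back(r, v, r, seen, out);
        }
    }
}

int count_deps(const DataFlow *df) {
    int n = 0;
    for (int v = 0; v < N_VARS; v++)
        for (int w = 1; w < N_NODES; w++)
            for (int r = 1; r < N_NODES; r++)
                if (df->dep[v][w][r]) n++;
    return n;
}
```

For the max example the walk finds six data edges: a from 1 to 2 and from 1 to 3, b from 1 to 2 and from 1 to 4, and r from 3 to 5 and from 4 to 5. Node 5 has two incoming edges for r because either branch of the if may have written it.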
Example
• Data Flow graph
Class Activity
Definition-Use Pairs
• A def-use (du) pair associates a point in a program where a value is
produced with a point where it is used
• A du pair requires a definition-clear path from the definition to the use.
• If, instead, another definition of the same variable is present on the path,
then the latter definition kills the former.