
Lecture 7

The document discusses techniques used for reverse engineering in software, focusing on lexical and syntactic analysis as well as program slicing. Lexical analysis breaks down source code into tokens, while syntactic analysis checks the grammatical correctness of code. Program slicing identifies relevant statements affecting specific variables, aiding in debugging and understanding code behavior.


SOFTWARE RE-ENGINEERING

SE-409
TECHNIQUES USED FOR REVERSE ENGINEERING
• Fact-finding and information gathering from the source code are the
keys to the Goal/Models/Tools paradigm, which partitions the reverse
engineering process into three ordered stages: Goals, Models, and
Tools.
• Automated analysis techniques are used to extract information that is
not readily apparent in the source code.
• Well-known analysis techniques that facilitate reverse engineering
include lexical analysis, syntactic analysis, data flow analysis, program
slicing, and visualization.
Lexical Analysis

• Lexical analysis is the process of decomposing the sequence of
characters in the source code into its constituent lexical units (tokens).
• Lexical analysis is the first phase of a compiler. It takes as input the
source code, possibly already modified by language preprocessors, as a
stream of characters.
• The lexical analyzer breaks this stream into a series of tokens,
discarding whitespace and comments in the source code.
Lexical Analysis
• The main task of lexical analysis is to read the input characters of the
code and produce tokens. The lexical analyzer scans the entire source
code of the program, identifying each token one by one.
• If the lexical analyzer finds an invalid token, it generates an error. The
lexical analyzer works closely with the syntax analyzer.
Lexical Analysis
1. ‘Get next token’ is a command sent from the parser to the lexical
analyzer.
2. On receiving this command, the lexical analyzer scans the input until it
finds the next token.
3. It returns the token to the parser.
• The lexical analyzer skips whitespace and comments while creating these
tokens. If an error is present, the lexical analyzer correlates that
error with the source file and line number.
• Lexical analyzers are also used by web browsers, which tokenize and parse
JavaScript, HTML, and CSS in order to format and display a web page.
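The ‘get next token’ loop described in steps 1-3 can be sketched as a small hand-written lexer. This is an illustrative sketch, not any particular compiler's implementation: the token names (NUMBER, ID, ASSIGN, SEMI) and the tiny token set are assumptions chosen for the example.

```python
import re

# Token patterns, tried in order at the current position. Whitespace and
# '//' comments match SKIP and are discarded, mirroring steps 1-3 above.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("SEMI",   r";"),
    ("SKIP",   r"[ \t\n]+|//[^\n]*"),   # whitespace and comments: skipped
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def get_next_token(source, pos):
    """Return (token_kind, lexeme, new_pos), or (None, None, pos) at end."""
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"invalid character at position {pos}")
        if m.lastgroup == "SKIP":       # skip whitespace/comments
            pos = m.end()
            continue
        return m.lastgroup, m.group(), m.end()
    return None, None, pos

def tokenize(source):
    """Repeatedly call get_next_token, as a parser would."""
    tokens, pos = [], 0
    while True:
        kind, lexeme, pos = get_next_token(source, pos)
        if kind is None:
            break
        tokens.append((kind, lexeme))
    return tokens

print(tokenize("a = 10; // set a"))
# -> [('ID', 'a'), ('ASSIGN', '='), ('NUMBER', '10'), ('SEMI', ';')]
```

Note how the comment `// set a` never reaches the token list, which is exactly the behavior described above.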
Lexical Analysis
• Fundamental terms: token, pattern, lexeme.
• A lexical token is a sequence of characters that can be treated as a
unit in the grammar of a programming language, like a word in a
sentence.
• Examples of tokens:
• Type tokens, like the types int, float, char, …
• Punctuation tokens, like the symbols {, }, (, ), …
• Alphabetic tokens, like keywords or identifiers
Example
In int count = 10;, the tokens are:
• int (a type token)
• count (an identifier)
• = (an assignment operator)
• 10 (a number)
• ; (a punctuation token)
• A lexeme is a sequence of characters in the source program that
matches the pattern of some token; it is simply an instance of that token.
• Examples of lexemes:
• For a number token, a lexeme could be 273 or 3.14.
• For a keyword token, the lexemes could be float, if, or return.
Example
In the statement int number = 5;, the lexemes are:
• int
• number
• =
• 5
• ;
• A pattern defines the structure of tokens. It is a rule that the
lexical analyzer uses to recognize the lexemes of a particular token type.

Examples of patterns:
• For a number token, the pattern might be a sequence of digits (e.g., 0-9).
• For a variable token, the pattern could be a sequence of alphabetic
characters followed by digits (e.g., abc123).
• For a keyword token, the pattern might simply match exact words like
if, while, or return.
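Such patterns are naturally expressed as regular expressions. The sketch below is hypothetical and covers only the three token types just described; the pattern set, its ordering (keywords are tried before identifiers), and the helper name `classify` are all assumptions made for this example.

```python
import re

# One regular-expression pattern per token type; '$' forces the whole
# lexeme to match the pattern, not just a prefix.
PATTERNS = {
    "number":     re.compile(r"\d+(\.\d+)?$"),          # e.g. 273, 3.14
    "keyword":    re.compile(r"(if|while|return|float)$"),
    "identifier": re.compile(r"[A-Za-z]+\d*$"),          # e.g. abc123
}

def classify(lexeme):
    """Return the first token type whose pattern matches the lexeme."""
    for kind, pattern in PATTERNS.items():
        if pattern.match(lexeme):
            return kind
    return "unknown"

print(classify("3.14"))    # -> number
print(classify("while"))   # -> keyword
print(classify("abc123"))  # -> identifier
```

Because `while` also fits the identifier pattern, the keyword pattern must be tried first; real lexers resolve such overlaps with explicit priority rules.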
Lexical Analysis (Example 1)
int main()
{
// 2 variables
int a, b;
a = 10;
return 0;
}
What are all the valid tokens?
Lexical Analysis (Example 1)
Answer: 'int' 'main' '(' ')' '{' 'int' 'a' ',' 'b' ';' 'a' '=' '10' ';' 'return'
'0' ';' '}'
These are the valid tokens. Observe that the comment was omitted.

As another example, solve the following statement:

printf("Lexical Analysis");
Lexical Analysis (Example 2)
# This is a python code
x = 10
y = x + 5
print("The result is:", y)
During lexical analysis, the Python interpreter will break down this code into tokens:

Line 1: # This is a python code
• Ignored: comments are discarded by the lexical analyzer.
Tokens in Line 2: x = 10
•x → Identifier (variable name)
•= → Assignment operator
•10 → Numeric literal (integer)
Tokens in Line 3: y = x + 5
•y → Identifier
•= → Assignment operator
•x → Identifier
•+ → Arithmetic operator
•5 → Numeric literal (integer)
Tokens in Line 4: print("The result is:", y)
•print → Identifier (the name of a Python built-in function; in Python 3, print is not a keyword)
•( → Left parenthesis (symbol)
•"The result is:" → String literal
•, → Comma (symbol)
•y → Identifier
•) → Right parenthesis (symbol)
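This token stream can be reproduced with Python's standard-library tokenize module. The sketch below filters out newline and end-of-file tokens only to keep the listing short; that filtering is a presentation choice, not part of the lexer.

```python
import io
import tokenize

# Tokenize the snippet above using Python's own lexer.
code = 'x = 10\ny = x + 5\nprint("The result is:", y)\n'

tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(code).readline)
    # Keep only the "interesting" token types for display.
    if tok.type in (tokenize.NAME, tokenize.OP,
                    tokenize.NUMBER, tokenize.STRING)
]
for kind, text in tokens:
    print(kind, repr(text))
# First few lines printed:
#   NAME 'x'
#   OP '='
#   NUMBER '10'
```

Note that `print` comes back as a NAME token, confirming that Python's lexer treats it as an ordinary identifier rather than a keyword.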
Syntactic Analysis

• Syntactic analysis tells us whether given sentences, or parts of those
sentences, are structured according to the rules of a grammar.
• In simple words, syntactic analysis is the process of analyzing language
against the rules of a formal grammar.
• It analyzes the syntactic structure and checks whether the given input
follows the correct syntax of the programming language.
• In compiler design, the syntax analysis phase comes after the lexical
analysis phase.
Syntactic Analysis

• Syntactic analysis assigns a grammatical structure to text. It is
also known as syntax analysis or parsing.
• The word ‘parsing’ originates from the Latin word ‘pars’, which means
‘part’. Syntactic analysis also deals with the syntax of natural language.
• Consider the following sentence:
Sentence: School go a boy
• The above sentence is not grammatically correct, so it fails to convey
its meaning. Syntactic analysis tells us whether a particular sentence is
grammatically well formed, and hence whether it can convey its logical
meaning.
Difference between Lexical and
Syntactic analysis
1. Every gardener likes the sun.
2. Like sun the every gardener?
In both sentences, all the words are the same, but only the first sentence is
syntactically correct and easily understandable. We cannot make these
distinctions using basic lexical processing techniques.
• In lexical analysis of natural language, the aim is data cleaning and
feature extraction, using techniques such as stemming, lemmatization,
correcting misspelled words, etc.
• In syntactic analysis, by contrast, the target is to find the roles played
by words in a sentence, interpret the relationships between words, and
interpret the grammatical structure of sentences.
Why do you need Syntactic Analyzer?

• Check if the code is valid grammatically


• The syntactical analyzer helps you to apply rules to the code
• Helps you to make sure that each opening brace has a corresponding
closing balance
• Each declaration has a type and that the type must be exists
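As a minimal sketch of these checks, Python's own parser (the standard-library ast module) can serve as a syntactic analyzer: it raises SyntaxError exactly when the input violates the language grammar, for example when an opening parenthesis has no matching close.

```python
import ast

def is_syntactically_valid(source):
    """Return True if the source parses under Python's grammar."""
    try:
        ast.parse(source)   # builds a parse tree, or raises SyntaxError
        return True
    except SyntaxError:
        return False

print(is_syntactically_valid("x = (1 + 2)"))   # balanced: True
print(is_syntactically_valid("x = (1 + 2"))    # unbalanced '(': False
```

Every lexeme in the second snippet is individually valid, so a lexical analyzer alone would not flag it; only the syntactic analyzer detects the missing parenthesis.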
Program Slicing
• The essential idea of program slicing is to identify only those
statements that are of relevance to the slicing criterion – i.e. those
that affect or are affected by some statement.
• A program slice is a portion of a program with an execution behavior
identical to the initial program with respect to a given criterion, but
may have a reduced size.
• Slicing can move in two directions – forwards and backwards.
Backward slicing
• Starts from a variable or a set of variables and identifies the
statements that influence their values. It is useful for understanding
the causes of a particular variable's value.
Example
• Assuming you have a hypothetical code snippet that looks something
like this:
Criterion: S<[3]:sum>
1. a = 10
2. b = 20
3. sum = a + b
4. c = sum * 2
5. d = sum - 5
Based on the example code, the backward slice for sum would be:
[3]->[2]->[1]

1. a = 10
2. b = 20
3. sum = a + b
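The backward slice above can be sketched for straight-line code as a simple definition-use walk. Encoding each statement as a (line, defined variable, used variables) tuple is an assumption made for this example; real slicers compute slices over program dependence graphs and handle control flow too.

```python
# The example program, one (line, defined_var, used_vars) tuple per statement.
STATEMENTS = [
    (1, "a",   set()),        # a = 10
    (2, "b",   set()),        # b = 20
    (3, "sum", {"a", "b"}),   # sum = a + b
    (4, "c",   {"sum"}),      # c = sum * 2
    (5, "d",   {"sum"}),      # d = sum - 5
]

def backward_slice(statements, criterion_line):
    """Walk backwards, pulling in the definitions of every needed variable."""
    slice_lines, needed = set(), set()
    for line, defined, used in reversed(statements):
        if line == criterion_line or defined in needed:
            slice_lines.add(line)
            needed.discard(defined)   # this definition satisfies the need
            needed |= used            # ...but its operands are now needed
    return sorted(slice_lines)

print(backward_slice(STATEMENTS, 3))  # criterion S<[3]:sum> -> [1, 2, 3]
```

Lines 4 and 5 never enter the slice because nothing they define is needed to compute sum, matching the [3]->[2]->[1] result above.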
Forward slicing
• Starts from a statement or a point in the program and identifies the
statements that are influenced by it. It is useful for understanding
the consequences of a particular statement.
Example
Criterion: S<[3]:sum>

• 1. x = 5
• 2. y = 10
• 3. sum = x + y
• 4. result = sum * 2
• 5. final_result = result - 3
The forward slice starting from sum would be: [3]->[4]->[5]
• 3. sum = x + y
• 4. result = sum * 2
• 5. final_result = result - 3
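Forward slicing over the same straight-line encoding is the mirror image of the backward case: a single forward pass that propagates the set of variables influenced by the criterion. As before, the tuple encoding is a simplification assumed for this sketch.

```python
# The example program, one (line, defined_var, used_vars) tuple per statement.
STATEMENTS = [
    (1, "x",            set()),          # x = 5
    (2, "y",            set()),          # y = 10
    (3, "sum",          {"x", "y"}),     # sum = x + y
    (4, "result",       {"sum"}),        # result = sum * 2
    (5, "final_result", {"result"}),     # final_result = result - 3
]

def forward_slice(statements, criterion_line):
    """Collect the criterion statement and everything its value flows into."""
    slice_lines, influenced = set(), set()
    for line, defined, used in statements:
        if line == criterion_line or (used & influenced):
            slice_lines.add(line)
            influenced.add(defined)   # this definition now carries influence
    return sorted(slice_lines)

print(forward_slice(STATEMENTS, 3))  # criterion S<[3]:sum> -> [3, 4, 5]
```

Lines 1 and 2 stay out of the slice: they influence sum but are not influenced by it, which is exactly the backward/forward distinction.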
Why Do We Need Slicing?
• Debugging: focus on the parts of the program relevant to a bug.
• Program understanding: which statements influence this statement?
• Change impact analysis: which parts of a program are affected by a
change? What should be retested?
Reading Assignment
Cover the following TECHNIQUES USED FOR REVERSE ENGINEERING:
• Control Flow Analysis
• Data Flow Analysis
• Visualization
