Data Reverse Engineering (DRE)
Presented by:
Nashrah Tahir 1363
Saira Mehmood 1368
Minahil Ismail 1339
Maria Anees 1336
Table of contents
01 Introduction
02 Purpose of DRE
03 Database Reverse Engineering (DBRE): Methodology
04 Schema Recovery in Data-Oriented Applications
05 Data Structure Extraction
06 Data Structure Conceptualization
01 Introduction
Data Reverse Engineering
“The process of figuring out how data is structured, organized, and
stored in a system, often without having direct access to the
original design or source code.”
Key Goals:
1- Recover Valuable Data.
2- Make Data More Useful.
Two vital aspects of a DRE process
Knowledge acquisition
Knowledge acquisition is a way of learning and gathering information. It
involves steps like collecting, analyzing, organizing, and checking
information to understand it better. This process is important for software
projects, especially when you're working on reverse engineering.
Tentative requirements
Initial requirements that are identified during the reverse engineering
process of an existing (operational) system. Essentially, it’s a way to
make sure the new system can do everything the old system did, and more if
needed.
Documentation
DRE improves the documentation of existing systems, especially when the
original developers are no longer available for advice. Maintenance of
legacy software is assisted by the new documentation.
Purpose of Data Reverse Engineering
Integration
It refers to combining different software systems so they can work together
smoothly.
2 key ways:
● Logical Model as a Prerequisite for Integration
● Logical Model Represents How Software Will Function in Different
Conditions
Data Administration
It is all about organizing, managing, and controlling data within an
organization. DRE helps make this process easier by analyzing existing
systems, understanding how the data works, and providing clear
documentation, which leads to more efficient data management and
integration.
Data Conversion
One needs to understand the logical connection between the old database and
the new one before converting the old data. Data conversion is the
migration of the data instance from the old database to the new one.
Purpose of Data Reverse Engineering
● Key Focus: understanding the database design and structure.
Three Types of Schemas
Physical Schema
● Describes the physical storage of
data on storage media.
● Includes indices, storage
structures, and performance
optimizations.
Importance of Schema Recovery
Understanding Legacy Systems
● Many older systems lack proper documentation.
● Schema recovery helps map out how data is organized and how
components interact.
Facilitates Migration and Integration
● Aids in transferring or converting data when migrating to newer
systems or integrating with other applications.
The objective is to recover the complete logical schema comprising the explicit
constructs expressed in the data structures declaration statements of the program(s) as
well as the implicit constructs buried, essentially, in the procedural statements.
Explicit vs. Implicit Constructs
Two tables, linked by a foreign key, are declared. We can say that this
foreign key is an explicit construct, insofar as we have used a specific
statement to declare it.
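
As a minimal sketch of what an explicit construct looks like in practice, the snippet below declares two tables linked by a foreign key in an in-memory SQLite database and then reads the declaration back from the catalog. The table and column names (customer, orders, cus_code) are assumptions made for illustration and are not taken from the slides; ORDERS is used in place of ORDER because ORDER is a reserved word in SQL.

```python
import sqlite3

# Hypothetical tables; names and columns are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        code TEXT PRIMARY KEY,
        name TEXT
    );
    CREATE TABLE orders (
        ord_id   INTEGER PRIMARY KEY,
        cus_code TEXT REFERENCES customer(code)  -- explicit foreign key
    );
""")

# Because the foreign key was declared with a specific statement,
# it is stored in the catalog and can be recovered directly.
explicit_fks = [
    (row[3], row[2], row[4])  # (from column, referenced table, referenced column)
    for row in conn.execute("PRAGMA foreign_key_list(orders)")
]
print(explicit_fks)
```

If the REFERENCES clause were omitted, the catalog would return nothing for this table, and the link between the two tables would have to be inferred from programs or data instead.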
Implicit Construct
[Figure: process diagram showing the Input and the Output: Logical Schema]
Main processes:
In this methodology, data structures are extracted by means of the following main
processes:
● DMS–DDL text analysis. Data structure declaration statements in a given DDL,
found in the schema scripts and application programs, are analyzed to produce
an approximate logical schema.
● Program analysis. This means analyzing the source code in order to detect integrity
constraints and evidence of additional data structures.
● Data analysis. This means analyzing the files and databases to (i) identify data
structures and their properties, namely, unique fields and functional dependencies
in files, and (ii) test hypotheses such as “could this field be a foreign key to
this file?”
● Schema integration. The analyst is generally presented with several schemas while
processing more than one information source. Each of those multiple schemas
offers a partial view of the data objects. All those partial views are reflected in
the final logical schema via a process for schema integration.
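
The data analysis step can be illustrated with a small sketch that checks a field for uniqueness and tests the hypothesis "could this field be a foreign key to this file?" against sample records. The function names and the sample data are hypothetical. A passing test is only evidence in favour of the hypothesis, since the data at hand may satisfy it by coincidence.

```python
def is_unique(rows, field):
    """Return True if no value of `field` repeats across the records."""
    values = [r[field] for r in rows]
    return len(values) == len(set(values))


def could_be_foreign_key(child_rows, child_field, parent_rows, parent_field):
    """Test the hypothesis on the data: the parent field must be unique and
    every non-null child value must occur among the parent values."""
    if not is_unique(parent_rows, parent_field):
        return False
    parent_values = {r[parent_field] for r in parent_rows}
    return all(
        r[child_field] in parent_values
        for r in child_rows
        if r[child_field] is not None
    )


# Illustrative records only (assumed, not from the slides).
customers = [{"code": "C1", "name": "Ann"}, {"code": "C2", "name": "Bob"}]
orders = [
    {"ord_id": 1, "cus_code": "C1"},
    {"ord_id": 2, "cus_code": "C1"},
    {"ord_id": 3, "cus_code": "C2"},
]

print(is_unique(customers, "code"))
print(could_be_foreign_key(orders, "cus_code", customers, "code"))
```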
1 DMS-DDL text analysis:
The "Files and Records Declaration" represents
the DMS-DDL text. This shows the explicit SQL-like
statements that declare the CUSTOMER and
ORDER tables, along with their fields.
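
A rough sketch of DDL text analysis follows, assuming the declarations are plain SQL CREATE TABLE statements. The DDL text, the field names, and the extract_schema helper are all hypothetical, and the table is called ORDERS because ORDER is a reserved word in SQL. The parser is deliberately naive: it assumes column types contain no commas and ignores constraints, so it yields only an approximate schema.

```python
import re

# Assumed DDL text for illustration; field names are not from the slides.
DDL = """
CREATE TABLE CUSTOMER (CODE CHAR(10), NAME CHAR(30), ADDRESS CHAR(60));
CREATE TABLE ORDERS (ORD_ID INTEGER, CUS_CODE CHAR(10), DETAIL CHAR(100));
"""

TABLE_PATTERN = re.compile(
    r"CREATE\s+TABLE\s+(\w+)\s*\((.*?)\)\s*;",
    re.IGNORECASE | re.DOTALL,
)


def extract_schema(ddl_text):
    """Map each declared table to a dict of {column name: declared type}."""
    schema = {}
    for table, body in TABLE_PATTERN.findall(ddl_text):
        columns = {}
        # Naive split: assumes no commas inside type definitions.
        for part in body.split(","):
            name, col_type = part.strip().split(None, 1)
            columns[name.upper()] = col_type.strip().upper()
        schema[table.upper()] = columns
    return schema


schema = extract_schema(DDL)
for table, columns in schema.items():
    print(table, columns)
```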
2 Program analysis:
The "Procedural Fragments" represents the program analysis. This shows some
pseudocode snippets that describe the operations performed on the data, like accepting the
customer code, reading the customer information, and moving data between the
CUSTOMER and ORDER entities.
Analyzing these program constructs provides additional insights into the implicit
relationships and data flow between the entities.
3 Data analysis:
The "Physical Schema" shows the initial physical data model extracted
through data analysis. This includes the CUSTOMER and ORDER tables along
with their fields and the "ORD-CUS" accessor that links them.
By examining the physical schema and the way the data is organized and
accessed, the analyst can make inferences about potential foreign
key relationships and other implicit constructs.
4 Schema integration:
The "Logical Schema" represents the end result of the
schema integration process. It takes the insights from
the physical schema, procedural fragments, and DDL
analysis to produce a more refined, abstracted logical
data model.
This logical schema consolidates the entities, attributes,
and relationships in a way that better aligns with the
underlying business concepts and requirements.
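
Schema integration can be sketched, under strong simplifying assumptions, as a union of the partial views produced by the earlier steps. The dictionary layout, the integrate function, and the sample views below are hypothetical. Real integration also has to resolve naming and structural conflicts between sources, which this sketch does not attempt.

```python
def integrate(*partial_schemas):
    """Naively merge partial views into one schema.

    Each view is a dict with optional keys:
      "tables":       {table name: set of column names}
      "foreign_keys": set of (child table, child column, parent table, parent column)
    """
    merged = {"tables": {}, "foreign_keys": set()}
    for partial in partial_schemas:
        for table, columns in partial.get("tables", {}).items():
            merged["tables"].setdefault(table, set()).update(columns)
        merged["foreign_keys"] |= partial.get("foreign_keys", set())
    return merged


# Illustrative partial views (assumed): one per information source.
ddl_view = {"tables": {"CUSTOMER": {"CODE", "NAME"}, "ORDER": {"ORD_ID", "CUS_CODE"}}}
program_view = {"foreign_keys": {("ORDER", "CUS_CODE", "CUSTOMER", "CODE")}}
data_view = {
    "tables": {"CUSTOMER": {"CODE", "ADDRESS"}},
    "foreign_keys": {("ORDER", "CUS_CODE", "CUSTOMER", "CODE")},
}

logical = integrate(ddl_view, program_view, data_view)
print(logical)
```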
Output: Logical Schema
06 Data Structure Conceptualization
Data Structure Conceptualization
The process of deriving conceptual schemas from legacy databases.
It focuses on transforming physical data structures into logical and conceptual
schemas for better understanding and further development.