Data Reverse Eng.

Uploaded by

Minahil Ismail

Data Reverse Engineering (DRE)
Presented by:
Nashrah Tahir 1363
Saira Mehmood 1368
Minahil Ismail 1339
Maria Anees 1336
Table of contents
01 Introduction
02 Purpose of DRE
03 Database Reverse Engineering (DBRE): Methodology
04 Schema Recovery in Data-Oriented Applications
05 Data Structure Extraction
06 Data Structure Conceptualization
01 Introduction
Data Reverse Engineering
“The process of figuring out how data is structured, organized, and
stored in a system, often without having direct access to the
original design or source code.”
Key Goals:
1- Recover Valuable Data.
2- Make Data More Useful.
Two vital aspects of a DRE process

Recover data assets that are valuable
● It is about finding and extracting important data from old systems.
● It involves extracting useful data or information from an existing system, particularly when it's poorly documented or outdated.

Reconstitute the recovered data assets to make them more useful
● It is about transforming that data into a form that is useful for current or future needs.
● Once valuable data has been extracted, the next step is to reshape or reformat it to make it more accessible and usable.
02 Purpose of Data Reverse Engineering
Purpose of Data Reverse Engineering

Knowledge acquisition
Knowledge acquisition is a way of learning and gathering information. It involves steps like collecting, analyzing, organizing, and checking information to understand it better. This process is important for software projects, especially when you're working on reverse engineering.

Tentative requirements
Initial requirements that are identified during the reverse engineering process of an existing (operational) system. Essentially, it's a way to make sure the new system can do everything the old system did, and more if needed.

Documentation
DRE improves the documentation of existing systems, especially when the original developers are no longer available for advice. Maintenance of legacy software is assisted by the new documentation.
Purpose of Data Reverse Engineering

Integration
It refers to combining different software systems so they can work together smoothly.
2 key ways:
● Logical Model as a Prerequisite for Integration
● Logical Model Represents How Software Will Function in Different Conditions

Data Administration
It is all about organizing, managing, and controlling data within an organization. DRE helps make this process easier by analyzing existing systems, understanding how the data works, and providing clear documentation, which leads to more efficient data management and integration.

Data Conversion
One needs to understand the logical connection between the old database and the new one before converting the old data. Data conversion is the migration of the data instance from the old database to the new one.
Purpose of Data Reverse Engineering

Software Assessment
Refers to evaluating a software product's quality and performance.

Quality Assessment
The overall quality of a software system can be assessed with DRE techniques, because a flawed design of a persistent data structure is likely to lead to faults in the software system.

Component Reuse
In the context of DRE this concept plays an important role because DRE tools and techniques allow software engineers to access and extract parts or components of existing software (like a database structure or specific modules) that might be useful in a new project.
03 Database Reverse Engineering: Methodology
Database Reverse Engineering
● Definition: The process of recovering the conceptual schema (specifications) of data-oriented applications from their physical implementation.
● Purpose: A systematic approach to understanding and documenting legacy database systems.
● Key Focus Areas:
1. Extracting and reconstructing database structures.
2. Backward tracing from physical schemas to conceptual schemas.
Schema Recovery
● Schema recovery is the process of reconstructing the conceptual or logical schema of a database system from its physical schema or existing structures.
● It is especially useful when original schema documentation is missing or outdated.
● Key Focus: to understand the database design and structure.
Three Types of Schemas

Conceptual Schema
● A high-level, abstract view of the data from the business or user perspective.
● Describes entities, relationships, and attributes.

Logical Schema
● The model of how data is structured in the DBMS.
● Describes tables, keys, and relationships.

Physical Schema
● Describes the physical storage of data on storage media.
● Includes indices, storage structures, and performance optimizations.
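
The split between logical and physical constructs can be seen in a database catalog. The sketch below is only an illustration, assuming an in-memory SQLite database; the table, column, and index names are hypothetical and do not come from the slides. Tables stand for the logical schema, the index for a physical optimization. The conceptual schema (a Customer places Purchases) exists only in the analyst's model, not in the catalog.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (code TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        number INTEGER PRIMARY KEY,
        customer_code TEXT REFERENCES customer(code)
    );
    CREATE INDEX idx_purchase_customer ON purchase(customer_code);
""")

def objects_of_type(connection, kind):
    """List user-defined catalog objects of one kind ('table' or 'index')."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = ? AND name NOT LIKE 'sqlite_%' ORDER BY name",
        (kind,),
    ).fetchall()
    return [name for (name,) in rows]

logical_objects = objects_of_type(conn, "table")    # tables, keys, relationships
physical_objects = objects_of_type(conn, "index")   # access structures
```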
Importance of Schema Recovery
Understanding Legacy Systems
● Many older systems lack proper documentation.
● Schema recovery helps map out how data is organized and how components interact.

Facilitates Migration and Integration
● Aids in transferring or converting data when migrating to newer systems or integrating with other applications.

Improves Maintainability
● Identifies inefficiencies or errors in outdated database structures, helping in optimization and easier maintenance.
● Helps in creating accurate queries and reports by understanding how data is structured.
Common Issues in DBRE
Lack of Documentation
Databases often lack proper or updated documentation.

Weak DBMS Models
Older DBMS can only express limited database structures.

Implicit Structures
Some relationships and constraints are hidden in the application code.

Optimized Structures
Redundant or unnormalized data is added for performance, complicating design.

Awkward Design
Poor designs from inexperienced developers lead to flawed database structures.

Obsolete Constructs
Unused parts of the database remain, adding complexity.
Phases in DBRE process
A DBRE process is based on backward execution of the logical phase and the
physical phase, beginning with the results of the physical phase. The process is
divided into two main phases, namely
Data Structure Extraction
● Extract the current structure of the data from the DDL (Data Definition Language) or host language.
● Focuses on recovering the existing database structure.

Data Structure Conceptualization
● Conceptualize the recovered data structure to form a logical or conceptual schema.
● Describes the semantics and relationships underlying the existing data.
05 Data Structure Extraction
Data Structure Extraction
The complete DMS (data management system) schema, including the structures and constraints, is recovered in this phase.

The objective is to recover the complete logical schema comprising the explicit
constructs expressed in the data structures declaration statements of the program(s) as
well as the implicit constructs buried, essentially, in the procedural statements.
Explicit vs. Implicit Constructs

Explicit Construct
An explicit construct is a component or a property of a data structure that is declared through a specific DDL statement.

Implicit Construct
An implicit construct is a component or a property that holds in the data structure, but that has not been declared explicitly.
Explicit Construct

Two tables, linked by a foreign key, are declared. We can say that this
foreign key is an explicit construct, insofar as we have used a specific
statement to declare it.
Implicit Construct

No foreign keys have been declared, but the example strongly suggests that column OWNER is expected to behave as a foreign key. If we are convinced that this behavior must be taken for an absolute rule, then OWNER is an implicit foreign key.
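
The difference can be sketched with a small SQLite example. This is an illustration under assumptions, not the slide's original figure: only the column name OWNER comes from the slides, while the table names and the helper `declared_foreign_keys` are hypothetical. An explicit foreign key shows up in the catalog; an implicit one does not, so it has to be recovered from other evidence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cid INTEGER PRIMARY KEY, name TEXT);

    -- Explicit construct: the foreign key is declared in the DDL.
    CREATE TABLE orders_explicit (
        oid INTEGER PRIMARY KEY,
        owner INTEGER REFERENCES customer(cid)
    );

    -- Implicit construct: OWNER is meant to reference customer,
    -- but no statement declares it.
    CREATE TABLE orders_implicit (
        oid INTEGER PRIMARY KEY,
        owner INTEGER
    );
""")

def declared_foreign_keys(connection, table):
    """Return (column, referenced_table, referenced_column) for declared FKs."""
    rows = connection.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    # PRAGMA columns: id, seq, table, from, to, on_update, on_delete, match
    return [(row[3], row[2], row[4]) for row in rows]

explicit_fks = declared_foreign_keys(conn, "orders_explicit")
implicit_fks = declared_foreign_keys(conn, "orders_implicit")
```

The second list is empty even though OWNER plays the same role in both tables, which is why program analysis and data analysis are needed.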
Main processes:
In this methodology, data structures are extracted by means of the following main processes:
● DMS-DDL text analysis. Data structure declaration statements in a given DDL, found in the schema scripts and application programs, are analyzed to produce an approximate logical schema.
● Program analysis. This means analyzing the source code in order to detect integrity constraints and evidence of additional data structures.
● Data analysis. This means analyzing the files and databases to (i) identify data structures and their properties, namely, unique fields and functional dependencies in files and (ii) test hypotheses such as “could this field be a foreign key to this file?”
● Schema integration. The analyst is generally presented with several schemas while processing more than one information source. Each of those multiple schemas offers a partial view of the data objects. All those partial views are reflected in the final logical schema via a process of schema integration.
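
The data analysis step can be sketched as a few checks over extracted records. This is a minimal sketch, assuming the records are already loaded as Python dictionaries; the function names and sample data are hypothetical.

```python
def is_unique(rows, field):
    """True if no value of `field` occurs twice (candidate identifier)."""
    values = [row[field] for row in rows]
    return len(values) == len(set(values))

def holds_functional_dependency(rows, lhs, rhs):
    """True if each value of `lhs` always appears with the same `rhs` value."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True

def could_be_foreign_key(child_rows, child_field, parent_rows, parent_field):
    """True if every non-null child value exists among the parent values."""
    parent_values = {row[parent_field] for row in parent_rows}
    return all(
        row[child_field] in parent_values
        for row in child_rows
        if row[child_field] is not None
    )

# Hypothetical sample data.
customers = [
    {"code": "C1", "city": "Lahore"},
    {"code": "C2", "city": "Karachi"},
]
orders = [
    {"number": 1, "owner": "C1"},
    {"number": 2, "owner": "C1"},
    {"number": 3, "owner": "C2"},
]

owner_is_fk_candidate = could_be_foreign_key(orders, "owner", customers, "code")
```

A passing check only shows that the current data does not contradict the hypothesis. The analyst still has to decide whether it is a rule.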
1 DMS-DDL text analysis:
The "Files and Records Declaration" represents the DMS-DDL text. This shows the explicit SQL-like statements that declare the CUSTOMER and ORDER tables, along with their fields.

2 Program analysis:
The "Procedural Fragments" represents the program analysis. This shows some pseudocode snippets that describe the operations performed on the data, like accepting the customer code, reading the customer information, and moving data between the CUSTOMER and ORDER entities.
Analyzing these program constructs provides additional insights into the implicit relationships and data flow between the entities.

3 Data analysis:
The "Physical Schema" shows the initial physical data model extracted through data analysis. This includes the CUSTOMER and ORDER tables along with their fields and the "ORD-CUS" accessor that links them.
By examining the physical schema and the way the data is organized and accessed, the analyst can make inferences about potential foreign key relationships and other implicit constructs.

4 Schema integration:
The "Logical Schema" represents the end result of the schema integration process. It takes the insights from the physical schema, procedural fragments, and DDL analysis to produce a more refined, abstracted logical data model.
This logical schema consolidates the entities, attributes, and relationships in a way that better aligns with the underlying business concepts and requirements.
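
Schema integration can be sketched as merging partial views of the same objects. This is a simplified sketch under assumptions: each information source is represented as a dictionary, and the field names (CODE, NAME, NUMBER, CUS-CODE) are hypothetical, since the slides name only the CUSTOMER and ORDER record types.

```python
def integrate(*partial_schemas):
    """Merge partial schemas into one, keeping each construct once."""
    merged = {}
    for partial in partial_schemas:
        for entity, parts in partial.items():
            target = merged.setdefault(entity, {"fields": [], "foreign_keys": []})
            for kind in ("fields", "foreign_keys"):
                for item in parts.get(kind, []):
                    if item not in target[kind]:
                        target[kind].append(item)
    return merged

# Partial view from DDL text analysis: explicit record types and fields.
from_ddl = {
    "CUSTOMER": {"fields": ["CODE", "NAME"]},
    "ORDER": {"fields": ["NUMBER", "CUS-CODE"]},
}
# Partial view from program analysis: an implicit foreign key.
from_programs = {
    "ORDER": {"foreign_keys": [("CUS-CODE", "CUSTOMER", "CODE")]},
}
# Partial view from data analysis: confirms a field already known.
from_data = {
    "CUSTOMER": {"fields": ["CODE"]},
}

logical_schema = integrate(from_ddl, from_programs, from_data)
```

A real integration also has to resolve conflicts between sources, such as the same field appearing under different names. This sketch only takes the union.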
Output: Logical Schema
06 Data Structure Conceptualization
Data Structure Conceptualization
The process of deriving conceptual schemas from legacy databases.
It focuses on transforming physical data structures into logical and conceptual
schemas for better understanding and further development.

Data Structure Conceptualization involves:


● Detecting and transforming redundancies
● Removing non-conceptual and technical structures
● Discarding DMS-dependent constructs
● Improving schema for simplicity and readability
Purpose of Conceptualization

● Extract business rules and domain semantics.


● Recover high-level conceptual constructs for system reengineering.
● Enable forward engineering by clarifying functional requirements.
Phases
● Basic Conceptualization
● Conceptual Normalization
Basic Conceptualization
Objective: Extract semantic concepts from the logical schema.
Steps:

● Making the Schema Ready:


○ Eliminate technical constructs (e.g., files, access keys).
○ Rename for meaningful representation.
○ Restructure schema before interpretation.
● Schema Untranslation:
○ Identify traces of technical translations in the schema.
○ Replace them with original conceptual constructs.
● Schema De-optimization:
○ Remove optimization constructs that complicate understanding.
Making the schema ready
Steps to Prepare the Schema:
● Remove unnecessary technical constructs.
○ Example: Files, access keys, and physical structures.
● Translate abstract or unclear names to meaningful ones.
● Restructure schema elements for logical grouping and readability
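
The preparation steps can be sketched as a filter plus a rename. This is a minimal sketch, assuming the logical schema is held as a list of constructs tagged with a kind; the kinds, names, and rename mapping are hypothetical.

```python
# Construct kinds treated as purely technical (assumption for this sketch).
TECHNICAL_KINDS = {"file", "access_key"}

def prepare_schema(constructs, renames):
    """Drop technical constructs and give the rest meaningful names."""
    prepared = []
    for construct in constructs:
        if construct["kind"] in TECHNICAL_KINDS:
            continue  # files and access keys carry no conceptual meaning
        renamed = dict(construct)  # copy, so the input is left untouched
        renamed["name"] = renames.get(construct["name"], construct["name"])
        prepared.append(renamed)
    return prepared

raw_schema = [
    {"kind": "record_type", "name": "CUS"},
    {"kind": "field", "name": "CUS-NM"},
    {"kind": "access_key", "name": "CUS-IDX"},
    {"kind": "file", "name": "CUSFILE"},
]
name_map = {"CUS": "Customer", "CUS-NM": "Name"}

ready_schema = prepare_schema(raw_schema, name_map)
```

In practice the rename mapping comes from the analyst, often with help from domain experts. It cannot be derived from the schema alone.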
Schema Untranslation
Definition:
● Reverse technical translations in the schema.
● Restore original conceptual constructs.
Purpose:
● Identify how logical schema was derived from initial concepts.
● Replace technical representations with clear conceptual elements.
Schema De-optimization
Definition:
● Deconstructing logical schema optimizations that were added for technical efficiency.
Why?
● Optimized schemas are harder to understand.
Action:
● Replace optimization-specific constructs with simpler, conceptual alternatives.
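
One common optimization is a redundant copy of a parent field stored in a child record to avoid a lookup. The sketch below assumes this case only; the field names and helpers are hypothetical. It checks whether the copy can always be derived through the foreign key and, if so, removes it.

```python
def is_redundant_copy(child_rows, fk_field, copied_field,
                      parent_rows, parent_key, parent_field):
    """True if the copied field always equals the parent's value."""
    lookup = {row[parent_key]: row[parent_field] for row in parent_rows}
    return all(
        lookup.get(row[fk_field]) == row[copied_field] for row in child_rows
    )

def drop_field(rows, field):
    """Return new rows without the given field."""
    return [{k: v for k, v in row.items() if k != field} for row in rows]

# Hypothetical sample data: owner_name duplicates customer name.
customers = [
    {"code": "C1", "name": "Ali"},
    {"code": "C2", "name": "Sara"},
]
orders = [
    {"number": 1, "owner": "C1", "owner_name": "Ali"},
    {"number": 2, "owner": "C2", "owner_name": "Sara"},
]

if is_redundant_copy(orders, "owner", "owner_name", customers, "code", "name"):
    deoptimized_orders = drop_field(orders, "owner_name")
else:
    deoptimized_orders = orders
```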
Conceptual Normalization
Objective: Restructure schema for simplicity and expressiveness.
Qualities Achieved:
● Simplicity
● Readability
● Minimality
● Extensibility
● Expressiveness
Key Transformations in Conceptual Normalization
● Replace some entity types with relationship types.
● Convert entity types into attributes where appropriate.
● Make is-a relationships explicit (e.g., "Car is-a Vehicle").
● Standardize names for consistency and clarity.
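
The first transformation can be sketched as follows. The rule used here is a simplifying assumption, not a rule stated in the slides: a table whose columns are all foreign keys, with at least two of them, is treated as a relationship type rather than an entity type. Table and column names are hypothetical.

```python
def classify(tables):
    """Split tables into entity types and relationship types."""
    entities, relationships = [], []
    for name, table in tables.items():
        fk_columns = {fk["column"] for fk in table["foreign_keys"]}
        is_pure_link = (
            len(table["foreign_keys"]) >= 2
            and set(table["columns"]) == fk_columns
        )
        if is_pure_link:
            targets = tuple(fk["references"] for fk in table["foreign_keys"])
            relationships.append((name, targets))
        else:
            entities.append(name)
    return entities, relationships

tables = {
    "STUDENT": {"columns": ["student_id", "name"], "foreign_keys": []},
    "COURSE": {"columns": ["course_id", "title"], "foreign_keys": []},
    "ENROLMENT": {
        "columns": ["student_id", "course_id"],
        "foreign_keys": [
            {"column": "student_id", "references": "STUDENT"},
            {"column": "course_id", "references": "COURSE"},
        ],
    },
}

entities, relationships = classify(tables)
```

Here ENROLMENT becomes a many-to-many relationship between STUDENT and COURSE instead of a third entity type.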
Advantages of Conceptual Schema
● Provides natural language representation of business rules.
● Ensures the schema is in 3NF (normalized).
● Eliminates transitive dependencies for a simpler understanding.
● Enhances forward development by aligning with functional requirements.
Thanks!
Do you have any questions?
