Chapter 6 - Distributed Database Management Systems
COURSE OUTLINE
Evolution of DDBMS
Advantages and Disadvantages of DDBMS
Distributed Processing and Distributed Database
Characteristics of DDBMS
DDBMS Components
Level of Data and Process Distribution
Distributed Database Transparency
Distribution Transparency
Distributed Database Design
MUHAMMAD HAMIZ MOHD RADZI
OBJECTIVES
At the end of this lesson, you should be able to:
Describe the centralized DBMS, its features, and its problems
Describe the evolution of DDBMS
Explain the advantages and disadvantages of DDBMS
Explain distributed processing and distributed databases
Explain the characteristics of DDBMS
Describe the components of DDBMS
Explain SPSD, MPSD, MPMD
[Figure: Application program 3 (with data semantics)]
The database is located at the server, and processing is split between the server and the client to reduce data traffic on the network.
However, this design has a few limitations that make a centralized DBMS undesirable.
Hence, the concept of the DDBMS came into the picture.
In a DDBMS, the database is fragmented into several parts that are allocated to several sites.
Even though the fragments are stored at several sites, the DDBMS treats the database as a single logical database.
A centralized DBMS required that corporate data be stored at a single central site, with data access provided through dumb terminals.
[Figure: Site 3, Site 4, and Site 5 connected through a communication network]
PROBLEMS WITH CENTRALIZED DB
Performance degradation as the number of remote sites grew
Possible availability problem: if the site with the database goes down, there can be
no data access.
DDB CONCEPT
Instead of having one, centralized database, we are going to spread the data out
among various cities on the distributed network, each of which has its own
computer and data storage facilities.
Location transparency - The user just issues the query, and the result is returned.
It is not necessary to know where on the network the data being sought is located.
DDBMS
A distributed database (DDB) is a collection of multiple, logically interrelated
databases distributed over a computer network.
Local autonomy - Paris employees, for example, can take responsibility for Table F:
its security, backup and recovery, and concurrency control.
The result would then be sent to the site that issued the query.
Application interface to interact with the end user, application programs, and other DBMSs
within the distributed database.
Validation to analyze data requests for syntax correctness.
Transformation to decompose complex requests into atomic data request components.
Query optimization to find the best access strategy. (Which database fragments must be
accessed by the query, and how must data updates, if any, be synchronized?)
Mapping to determine the data location of local and remote fragments.
I/O interface to read or write data from or to permanent local storage.
Formatting to prepare the data for presentation to the end user or to an
application program.
Backup and recovery to ensure the availability and recoverability of the database
in case of a failure.
Transaction management to ensure that the data moves from one consistent state to
another.
This activity includes the synchronization of local and remote transactions as well as
transactions across multiple distributed segments.
Must perform all the functions of a centralized DBMS
Must handle all necessary functions imposed by the distribution of data and
processing
Performance transparency
Heterogeneity transparency
Supported by a distributed data dictionary (DDD) which contains the description of the entire
database as seen by the DBA
The DDD is itself distributed and replicated at the network nodes
Suppose a user wants to find all employees with a birth date before January 1, 1940.
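Fragmentation transparency (the user queries one logical EMPLOYEE table and does not need to know it is fragmented):
SELECT * FROM EMPLOYEE WHERE EMP_DOB < '01-JAN-1940';
Location transparency (the user must name the fragments E1, E2, and E3, but not the sites where they are stored):
SELECT * FROM E1 WHERE EMP_DOB < '01-JAN-1940' UNION SELECT * FROM E2 …
UNION SELECT * FROM E3…;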
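The difference between the two transparency levels can be tried out locally. The following is a minimal sketch, not a real distributed setup: it uses Python's built-in sqlite3 module, keeps the three fragments E1, E2, and E3 as ordinary tables in one in-memory database, and uses a view named EMPLOYEE to stand in for the global table that a DDBMS would present. The columns EMP_NUM and EMP_NAME, the sample rows, and the ISO date format (SQLite compares dates as text) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Three horizontal fragments of EMPLOYEE. In a real DDBMS each fragment
# would be stored at a different site; here they are local tables.
for frag in ("E1", "E2", "E3"):
    cur.execute(
        f"CREATE TABLE {frag} ("
        "EMP_NUM INTEGER PRIMARY KEY, EMP_NAME TEXT, EMP_DOB TEXT)"
    )

# Hypothetical sample rows, one per fragment (dates in ISO format).
cur.execute("INSERT INTO E1 VALUES (1, 'Ana', '1935-06-01')")
cur.execute("INSERT INTO E2 VALUES (2, 'Ben', '1952-03-15')")
cur.execute("INSERT INTO E3 VALUES (3, 'Cy', '1938-11-30')")

# Location transparency only: the user has to name every fragment.
location_level = cur.execute(
    "SELECT * FROM E1 WHERE EMP_DOB < '1940-01-01' "
    "UNION SELECT * FROM E2 WHERE EMP_DOB < '1940-01-01' "
    "UNION SELECT * FROM E3 WHERE EMP_DOB < '1940-01-01' "
    "ORDER BY EMP_NUM"
).fetchall()

# Fragmentation transparency: a view hides the fragments, so the user
# queries EMPLOYEE as if it were a single table.
cur.execute(
    "CREATE VIEW EMPLOYEE AS "
    "SELECT * FROM E1 UNION ALL SELECT * FROM E2 UNION ALL SELECT * FROM E3"
)
fragmentation_level = cur.execute(
    "SELECT * FROM EMPLOYEE WHERE EMP_DOB < '1940-01-01' ORDER BY EMP_NUM"
).fetchall()

print(location_level)
print(fragmentation_level)
```

Both queries return the same rows; only the amount of knowledge required from the user differs.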
Data replication:
Data allocation:
Vertical fragmentation:
Division of a relation into attribute (column) subsets
Each fragment must contain the primary key
Mixed fragmentation:
Combination of horizontal and vertical strategies
HORIZONTAL FRAGMENTATION EXAMPLE
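VERTICAL FRAGMENTATION EXAMPLE
Two separate areas of the company, the SERVICE dept and the COLLECTIONS dept, use different fields of the table in
their daily activities. Because the split is by field (column), the table is fragmented vertically.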
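Because the two departments use different fields, the split is made by column, which is a vertical fragmentation. The sketch below illustrates it with Python's built-in sqlite3 module. The CUSTOMER column names, the fragment names CUST_SERVICE and CUST_COLLECT, and the sample rows are hypothetical; the point is only that each fragment carries the primary key, so a join rebuilds the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical CUSTOMER table; column names are assumptions.
cur.execute(
    "CREATE TABLE CUSTOMER ("
    "CUS_NUM INTEGER PRIMARY KEY, CUS_NAME TEXT, CUS_ADDRESS TEXT, "
    "CUS_LIMIT REAL, CUS_BAL REAL)"
)
cur.executemany(
    "INSERT INTO CUSTOMER VALUES (?, ?, ?, ?, ?)",
    [
        (10, "Alpha Sdn Bhd", "12 Main St.", 3500.0, 2700.0),
        (11, "Beta Trading", "321 Sunset Blvd.", 6000.0, 1200.0),
    ],
)

# Vertical fragments: each keeps the primary key CUS_NUM plus the
# columns one department needs.
cur.execute(
    "CREATE TABLE CUST_SERVICE AS "
    "SELECT CUS_NUM, CUS_NAME, CUS_ADDRESS FROM CUSTOMER"
)
cur.execute(
    "CREATE TABLE CUST_COLLECT AS "
    "SELECT CUS_NUM, CUS_LIMIT, CUS_BAL FROM CUSTOMER"
)

# Reconstruction: join the fragments on the shared primary key.
rebuilt = cur.execute(
    "SELECT s.CUS_NUM, s.CUS_NAME, s.CUS_ADDRESS, c.CUS_LIMIT, c.CUS_BAL "
    "FROM CUST_SERVICE s JOIN CUST_COLLECT c ON s.CUS_NUM = c.CUS_NUM "
    "ORDER BY s.CUS_NUM"
).fetchall()
original = cur.execute("SELECT * FROM CUSTOMER ORDER BY CUS_NUM").fetchall()

print(rebuilt == original)
```

Without CUS_NUM in both fragments there would be no reliable way to match the two halves of each customer row.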
MIXED FRAGMENTATION OF THE CUSTOMER TABLE
Reconstruction
Relational operators (select, project, join…) exist to reconstruct the original table from its
fragments
Horizontal fragmentation - Union
Vertical fragmentation - Join
Hybrid fragmentation - combination of relational operators
Disjointness
Remove duplicate data
Horizontal fragmentation - a row cannot appear in more than one fragment.
Vertical fragmentation - non-key attributes cannot appear in more than one fragment.
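The reconstruction and disjointness rules for horizontal fragments can be checked with a few lines of plain Python. This is a sketch under stated assumptions: rows are dictionaries, the fragmenting attribute "state" and the key "id" are hypothetical, and the helper names are made up for illustration.

```python
def horizontal_fragments(rows, attr):
    """Group rows into fragments by the value of one attribute."""
    frags = {}
    for row in rows:
        frags.setdefault(row[attr], []).append(row)
    return frags


def is_disjoint(frags, pk):
    """True if no primary-key value appears in more than one fragment row."""
    seen = set()
    for frag_rows in frags.values():
        for row in frag_rows:
            if row[pk] in seen:
                return False
            seen.add(row[pk])
    return True


def reconstruct(frags, pk):
    """Union of all fragments, sorted by key for easy comparison."""
    merged = [row for frag_rows in frags.values() for row in frag_rows]
    return sorted(merged, key=lambda r: r[pk])


# Hypothetical customer rows, already sorted by id.
customers = [
    {"id": 1, "name": "Ana", "state": "TN"},
    {"id": 2, "name": "Ben", "state": "FL"},
    {"id": 3, "name": "Cy", "state": "TN"},
]

frags = horizontal_fragments(customers, "state")
print(is_disjoint(frags, "id"))
print(reconstruct(frags, "id") == customers)
```

A row copied into a second fragment would make is_disjoint return False, which is the duplicate data the disjointness rule is meant to remove.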
DATA REPLICATION
Storage of data copies at multiple sites served by a computer network
Un-replicated database:
Stores each database fragment at a single site
No duplicate database fragments
Database size, usage frequency and costs (performance, overhead, management)
influence the decision to replicate
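The trade-off behind the decision to replicate can be sketched in plain Python. The example assumes three fragments E1 to E3 and three hypothetical sites A, B, and C; it compares an un-replicated allocation with one that stores two copies of each fragment, counting both the copies stored (the overhead) and the fragments still reachable when a site goes down (the availability). The function names are illustrative only.

```python
def available_fragments(allocation, down_sites):
    """Fragments that still have at least one copy at a working site."""
    down = set(down_sites)
    return sorted(
        frag for frag, sites in allocation.items() if set(sites) - down
    )


def total_copies(allocation):
    """Number of fragment copies stored across all sites."""
    return sum(len(sites) for sites in allocation.values())


# Un-replicated: each fragment is stored at exactly one site.
unreplicated = {"E1": ["A"], "E2": ["B"], "E3": ["C"]}

# Replicated (hypothetical layout): two copies of every fragment.
replicated = {"E1": ["A", "B"], "E2": ["B", "C"], "E3": ["C", "A"]}

for name, allocation in (("un-replicated", unreplicated),
                         ("replicated", replicated)):
    print(name, total_copies(allocation),
          available_fragments(allocation, {"A"}))
```

With site A down, the un-replicated layout loses E1, while the replicated layout keeps every fragment reachable at the cost of storing twice as many copies.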
DATA ALLOCATION
Deciding where to locate data: Data distribution over a computer network is achieved
through data partition, data replication, or a combination of both
Allocation strategies:
Fundamentals of Database Management Systems, Mark L. Gillenson, 2nd Edition, 2012, John
Wiley.