Topic 7 DDBMS
Distributed Database Management Systems
[Figure: Site 1, Site 2, Site 3, Site 4]
Centralized database and single processing
Advantages
• Data are located near the greatest demand site
• Faster data access and processing
• Growth facilitation
• Improved communications
• Reduced operating costs
• User-friendly interface
• Less danger of a single-point failure
• Processor independence
Disadvantages
• Complexity of management and control
• Technological difficulty
• Security
• Lack of standards
• Increased storage and infrastructure requirements
• Increased training cost
• Costs incurred due to the requirement of duplicated infrastructure
Distributed Database Transparency
❖ Distribution transparency : the user does not need to know where data are located
or whether they are replicated or partitioned
❖ Transaction transparency : a transaction can update data at several network sites
while preserving data integrity. Ensures the transaction will be either entirely completed or aborted
❖ Failure transparency : ensures system continues to operate in the event of a node
failure (other nodes pick up lost functionality)
❖ Performance transparency : allows system to perform as if it were a centralized
DBMS. No performance degradation due to use of a network or platform differences.
Ensures system will find the most cost-effective path to access remote data.
❖ Heterogeneity transparency : allows the integration of several different local
DBMSs under a common schema
Distribution Transparency
❖ Allows management of a physically dispersed database as though it were a
centralized database
❖ Supported by a distributed data dictionary (DDD), which contains the description
of the entire database as seen by the DBA. The DDD is itself distributed and
replicated at the network nodes
❖ There are three levels of distribution transparency: fragmentation transparency,
location transparency and local mapping transparency
1. Fragmentation transparency – user does not need to know if a database is partitioned;
fragment names and/or fragment locations are not needed
2. Location transparency – fragment name is required but location is not required
3. Local mapping transparency – user must specify fragment name and location
As in the example shown on the previous slide, where the EMPLOYEE table is divided into the fragments E1, E2 and E3:
Suppose we want to find all employees with a birthdate prior to Jan 1, 1940
Fragmentation transparency:
SELECT * FROM EMPLOYEE WHERE EMP_DOB < '01-JAN-1940';
Location transparency:
SELECT * FROM E1 WHERE EMP_DOB < '01-JAN-1940'
UNION
SELECT * FROM E2 WHERE EMP_DOB < '01-JAN-1940'
UNION
SELECT * FROM E3 WHERE EMP_DOB < '01-JAN-1940';
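The two queries above can be tried on a single machine. The following is a minimal sketch, not a real DDBMS: one in-memory SQLite database stands in for all sites, a view named EMPLOYEE imitates fragmentation transparency, and the column names EMP_NUM and EMP_NAME, the sample rows, and the ISO date format (used instead of '01-JAN-1940' so that text comparison works in SQLite) are all assumptions made for illustration.

```python
import sqlite3

# One in-memory database stands in for three sites (illustrative assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical fragments E1, E2, E3 of EMPLOYEE, all with the same columns.
fragments = {
    "E1": [(1, "Ann", "1935-04-12"), (2, "Bob", "1962-09-30")],
    "E2": [(3, "Cho", "1939-12-31")],
    "E3": [(4, "Dee", "1940-01-01"), (5, "Eli", "1928-07-04")],
}
for name, data in fragments.items():
    cur.execute(
        f"CREATE TABLE {name} (EMP_NUM INTEGER PRIMARY KEY, EMP_NAME TEXT, EMP_DOB TEXT)"
    )
    cur.executemany(f"INSERT INTO {name} VALUES (?, ?, ?)", data)

# The view hides the fragments from the user.
cur.execute(
    """
    CREATE VIEW EMPLOYEE AS
    SELECT * FROM E1 UNION ALL SELECT * FROM E2 UNION ALL SELECT * FROM E3
    """
)

CUTOFF = "1940-01-01"  # ISO dates compare correctly as text in SQLite

# Fragmentation transparency: neither fragment names nor locations are given.
via_view = cur.execute(
    "SELECT EMP_NUM FROM EMPLOYEE WHERE EMP_DOB < ? ORDER BY EMP_NUM", (CUTOFF,)
).fetchall()

# Location transparency: fragment names are given, locations are not.
via_fragments = cur.execute(
    """
    SELECT EMP_NUM FROM E1 WHERE EMP_DOB < :d
    UNION
    SELECT EMP_NUM FROM E2 WHERE EMP_DOB < :d
    UNION
    SELECT EMP_NUM FROM E3 WHERE EMP_DOB < :d
    ORDER BY EMP_NUM
    """,
    {"d": CUTOFF},
).fetchall()

print(via_view)
print(via_fragments)
```

Both queries return the same employees. The only difference is how much the user has to know about the way the table is split.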
Data replication
Data allocation
Vertical fragmentation:
◦ Division of a relation into attribute (column) subsets
◦ Each fragment must contain the primary key
Mixed fragmentation:
◦ Combination of horizontal and vertical strategies
Example:
Horizontal Fragmentation of the CUSTOMER Table by State
Vertically Fragmented Table Contents
Two separate areas in the company, the SERVICE dept and the COLLECTIONS dept, use different fields of the table in
their daily activities
Mixed Fragmentation of the CUSTOMER Table
The table is divided horizontally by the three states, and within each state it is fragmented vertically by
department
Table Content After the Mixed Fragmentation Process
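The three strategies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names (CUS_NUM, CUS_NAME, CUS_STATE, CUS_LIMIT, CUS_BAL), the three state codes, the sample rows, and the split of columns between the SERVICE and COLLECTIONS departments are hypothetical and do not come from the slides.

```python
# Hypothetical CUSTOMER rows; names, states and amounts are illustrative only.
CUSTOMER = [
    {"CUS_NUM": 10, "CUS_NAME": "Customer A", "CUS_STATE": "TN", "CUS_LIMIT": 3500.0, "CUS_BAL": 2700.0},
    {"CUS_NUM": 11, "CUS_NAME": "Customer B", "CUS_STATE": "FL", "CUS_LIMIT": 6000.0, "CUS_BAL": 1200.0},
    {"CUS_NUM": 12, "CUS_NAME": "Customer C", "CUS_STATE": "TN", "CUS_LIMIT": 4000.0, "CUS_BAL": 3500.0},
    {"CUS_NUM": 13, "CUS_NAME": "Customer D", "CUS_STATE": "GA", "CUS_LIMIT": 5000.0, "CUS_BAL": 0.0},
]


def horizontal(rows, column):
    """Horizontal fragmentation: group whole rows by the value of one column."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments


def vertical(rows, key, attributes):
    """Vertical fragmentation: keep the primary key plus a subset of columns."""
    return [{col: row[col] for col in (key, *attributes)} for row in rows]


# Assumed column usage per department.
DEPARTMENT_COLUMNS = {
    "SERVICE": ("CUS_NAME", "CUS_STATE"),
    "COLLECTIONS": ("CUS_LIMIT", "CUS_BAL"),
}

by_state = horizontal(CUSTOMER, "CUS_STATE")

# Mixed fragmentation: split by state first, then by department within each state.
mixed = {
    (state, dept): vertical(rows, "CUS_NUM", cols)
    for state, rows in by_state.items()
    for dept, cols in DEPARTMENT_COLUMNS.items()
}

for name, fragment in sorted(mixed.items()):
    print(name, fragment)
```

Every vertical fragment carries CUS_NUM, which is what later allows the original rows to be rebuilt with a join.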
Correctness Rules
Completeness
◦ Any data item must be covered by at least one fragment
Reconstruction
◦ There exist relational operators (select, project, join…) to reconstruct the original table from
the fragments
◦ Horizontal fragmentation - Union
◦ Vertical fragmentation - Join
◦ Hybrid fragmentation - combination of relational operators
Disjointness
◦ Data should not be duplicated across fragments
◦ Horizontal fragmentation - a row cannot appear in more than one fragment.
◦ Vertical fragmentation – non-key attributes cannot appear in more than one fragment.
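The three rules can be checked mechanically. The sketch below assumes a tiny table with a primary key K and two non-key attributes A and B; the function names and sample data are hypothetical and only illustrate union-based and join-based reconstruction and the two disjointness conditions.

```python
def reconstruct_horizontal(fragments):
    """Union: concatenate the row fragments."""
    return [row for frag in fragments for row in frag]


def reconstruct_vertical(fragments, key):
    """Join: merge column fragments row by row on the primary key."""
    joined = {}
    for frag in fragments:
        for row in frag:
            joined.setdefault(row[key], {}).update(row)
    return list(joined.values())


def same_rows(a, b, key):
    """Order-insensitive comparison, used to check completeness and reconstruction."""
    by_key = lambda row: row[key]
    return sorted(a, key=by_key) == sorted(b, key=by_key)


def disjoint_horizontal(fragments, key):
    """No row (identified by its key) appears in more than one fragment."""
    keys = [row[key] for frag in fragments for row in frag]
    return len(keys) == len(set(keys))


def disjoint_vertical(fragments, key):
    """No non-key attribute appears in more than one fragment."""
    seen = set()
    for frag in fragments:
        columns = (set(frag[0]) - {key}) if frag else set()
        if columns & seen:
            return False
        seen |= columns
    return True


TABLE = [
    {"K": 1, "A": "a1", "B": "b1"},
    {"K": 2, "A": "a2", "B": "b2"},
    {"K": 3, "A": "a3", "B": "b3"},
]
h_frags = [TABLE[:1], TABLE[1:]]
v_frags = [
    [{"K": r["K"], "A": r["A"]} for r in TABLE],
    [{"K": r["K"], "B": r["B"]} for r in TABLE],
]

print(same_rows(reconstruct_horizontal(h_frags), TABLE, "K"))
print(same_rows(reconstruct_vertical(v_frags, "K"), TABLE, "K"))
print(disjoint_horizontal(h_frags, "K"), disjoint_vertical(v_frags, "K"))
```

If a fragmentation is incomplete, the reconstructed table lacks rows or columns and the `same_rows` check fails.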
Data Replication
❖ Storage of data copies at multiple sites served by a computer network
❖ Fragment copies can be stored at several sites to serve specific information requirements
❖ Can enhance data availability and response time
❖ Can help to reduce communication and total query costs
❖ Imposes additional processing overhead:
◦ The system must decide which copy to read when a query is submitted
◦ All copies must be updated when a write occurs
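The read and write overhead can be illustrated with a deliberately naive sketch. It assumes a write-all, read-one policy with no failure handling, locking or networking; the class name `ReplicatedFragment`, its methods and the site names are hypothetical, and a real DDBMS would use a proper replica control protocol.

```python
class ReplicatedFragment:
    """Toy model of one fragment copied to several sites (illustrative only)."""

    def __init__(self, sites):
        # One independent key/value store per site.
        self.copies = {site: {} for site in sites}

    def write(self, key, value):
        """A write must reach every copy; returns how many copies were updated."""
        for copy in self.copies.values():
            copy[key] = value
        return len(self.copies)

    def read(self, key, local_site=None):
        """Read one copy: the local site's if it holds one, else the first site's."""
        site = local_site if local_site in self.copies else next(iter(self.copies))
        return site, self.copies[site][key]


fragment = ReplicatedFragment(["A", "B", "C"])
updated = fragment.write("x", 1)      # one logical write, three physical writes
local = fragment.read("x", "B")       # served by the local copy at site B
remote = fragment.read("x", "Z")      # site Z has no copy, so site A answers
print(updated, local, remote)
```

Reads get cheaper as more sites hold a copy, while every write costs one update per copy. That trade-off is the overhead described in the list above.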
Data Replication Scenarios
❖ Un-replicated database:
◦ Stores each database fragment at a single site
◦ No duplicate database fragments
❖ Database size, usage frequency and costs (performance, overhead, management)
influence the decision to replicate
Data Allocation
Deciding where to locate data: data distribution over a computer network is achieved
through data partitioning, data replication, or a combination of both
Allocation strategies:
❖ Centralized data allocation
◦ Entire database is stored at one site