Week 12- Distributed Databases
Objectives
• Concepts.
• Advantages and disadvantages of distributed
databases.
• Functions and architecture for a DDBMS.
• Distributed database design.
Concepts
Distributed Database
A logically interrelated collection of shared data (and a
description of this data), physically distributed over a
computer network.
Distributed DBMS
Software system that permits the management of the
distributed database and makes the distribution transparent
to users.
Concepts
• Collection of logically-related shared data.
• Data split into fragments.
• Fragments may be replicated.
• Fragments/replicas allocated to sites.
• Sites linked by a communications network.
• Data at each site is under control of a DBMS.
• DBMSs handle local applications autonomously.
• Each DBMS participates in at least one global
application.
[Figure: Distributed DBMS]
Distributed Processing
A centralized database that can be accessed over a
computer network.
Types of DDBMS
• Homogeneous DDBMS
• Heterogeneous DDBMS
Homogeneous DDBMS
• All sites use same DBMS product.
• Much easier to design and manage.
• Approach provides incremental growth and allows
increased performance.
Heterogeneous DDBMS
• Sites may run different DBMS products, with
possibly different underlying data models.
• Occurs when sites have implemented their own
databases and integration is considered later.
• Translations required to allow for:
• Different hardware.
• Different DBMS products.
• Typical solution is to use gateways.
Functions of a DDBMS
• Expect DDBMS to have at least the functionality of a
DBMS.
• Also to have following functionality:
• Extended communication services.
• Extended Data Dictionary.
• Distributed query processing.
• Extended concurrency control.
• Extended recovery services.
[Figure: Components of a DDBMS]
Distributed Database Design
• Three key issues:
• Fragmentation,
• Allocation,
• Replication.
Fragmentation
• Definition and allocation of fragments carried out
strategically to achieve:
• Locality of Reference.
• Improved Reliability and Availability.
• Improved Performance.
• Balanced Storage Capacities and Costs.
• Minimal Communication Costs.
Data Allocation
• Four alternative strategies regarding placement of
data:
• Centralized,
• Partitioned (or Fragmented),
• Complete Replication,
• Selective Replication.
Data Allocation
Centralized: Consists of single database and DBMS
stored at one site with users distributed across the
network.
Partitioned: Database partitioned into disjoint
fragments, each fragment assigned to one site.
Complete Replication: Consists of maintaining
complete copy of database at each site.
Selective Replication: Combination of partitioning,
replication, and centralization.
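The four strategies can be pictured as a map from each fragment to the sites that hold a copy. The following is a minimal sketch under stated assumptions: the site names, the fragment names, the round-robin assignment, and all function names are hypothetical and not taken from the slides, and the selective policy shown is only one possible way to combine the other strategies.

```python
def centralized(fragments, sites):
    """Whole database at one site (assumed here to be the first site)."""
    return {f: [sites[0]] for f in fragments}


def partitioned(fragments, sites):
    """Disjoint fragments, each assigned to exactly one site (round-robin, by assumption)."""
    return {f: [sites[i % len(sites)]] for i, f in enumerate(fragments)}


def complete_replication(fragments, sites):
    """A complete copy of every fragment at every site."""
    return {f: list(sites) for f in fragments}


def selective_replication(fragments, sites, replicated):
    """Replicate only the named fragments; the rest stay at a single site."""
    placement = partitioned(fragments, sites)
    for f in replicated:
        placement[f] = list(sites)
    return placement


# Hypothetical sites and fragments, for illustration only.
SITES = ["site_A", "site_B", "site_C"]
FRAGMENTS = ["P1", "P2", "S1"]

for strategy in (centralized, partitioned, complete_replication):
    print(strategy.__name__, strategy(FRAGMENTS, SITES))
print("selective_replication", selective_replication(FRAGMENTS, SITES, ["S1"]))
```

In this sketch a fragment that is read at many sites but rarely updated would be a natural candidate for the `replicated` list.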
Why Fragment?
• Usage
• Applications work with views rather than entire relations.
• Efficiency
• Data is stored close to where it is most frequently used.
• Data that is not needed by local applications is not stored.
Why Fragment?
• Parallelism
• With fragments as unit of distribution, transaction can be
divided into several subqueries that operate on fragments.
• Security
• Data not required by local applications is not stored and so
not available to unauthorized users.
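The parallelism point can be sketched in Python: one global query is split into a subquery per fragment, the subqueries run concurrently, and the partial results are merged. This is a simplified sketch under stated assumptions: the sample rows, the site names, and the functions `subquery` and `global_query` are hypothetical, and a thread pool on one machine stands in for sites on a network.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fragments of PropertyForRent, one per site.
fragments_by_site = {
    "site_A": [
        {"propertyNo": "PA14", "type": "House", "rent": 650},
        {"propertyNo": "PG21", "type": "House", "rent": 600},
    ],
    "site_B": [
        {"propertyNo": "PL94", "type": "Flat", "rent": 400},
        {"propertyNo": "PG4", "type": "Flat", "rent": 350},
    ],
}


def subquery(rows, min_rent):
    """Local subquery: property numbers in one fragment with rent >= min_rent."""
    return [r["propertyNo"] for r in rows if r["rent"] >= min_rent]


def global_query(fragments, min_rent):
    """Run one subquery per fragment concurrently, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda rows: subquery(rows, min_rent), fragments.values())
        return sorted(no for part in partials for no in part)


print(global_query(fragments_by_site, 400))
```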
Why Fragment?
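• Disadvantages
• Performance: a global application that needs data from several fragments at different sites may run slower.
• Integrity: integrity control is harder when data and functional dependencies are fragmented across sites.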
Types of Fragmentation
• Four types of fragmentation:
• Horizontal,
• Vertical,
• Mixed,
• Derived.
Horizontal Fragmentation
• Consists of a subset of the tuples of a relation.
• Defined using Selection operation of relational
algebra:
σ_p(R)
• For example:
P1 = σ_{type='House'}(PropertyForRent)
P2 = σ_{type='Flat'}(PropertyForRent)
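A minimal Python sketch of horizontal fragmentation by selection on type. Only the relation name and the type attribute come from the slides; the sample rows, the other attributes, and the helper `select` are assumptions made for illustration. The sketch also checks that the fragments are disjoint and that their union gives back the original relation.

```python
# Hypothetical sample rows for PropertyForRent.
property_for_rent = [
    {"propertyNo": "PA14", "type": "House", "rent": 650},
    {"propertyNo": "PL94", "type": "Flat", "rent": 400},
    {"propertyNo": "PG4", "type": "Flat", "rent": 350},
    {"propertyNo": "PG21", "type": "House", "rent": 600},
]


def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# Horizontal fragments: each holds a subset of the tuples.
p1 = select(property_for_rent, lambda r: r["type"] == "House")
p2 = select(property_for_rent, lambda r: r["type"] == "Flat")

# Reconstruction: the union of the fragments is the original relation.
reconstructed = p1 + p2
same_tuples = sorted(r["propertyNo"] for r in reconstructed) == sorted(
    r["propertyNo"] for r in property_for_rent
)
print(len(p1), len(p2), same_tuples)
```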
Vertical Fragmentation
• Consists of a subset of attributes of a relation.
• Defined using Projection operation of relational
algebra:
Π_{a1, ..., an}(R)
• For example:
S1 = Π_{staffNo, position, sex, DOB, salary}(Staff)
S2 = Π_{staffNo, fName, lName, branchNo}(Staff)
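A minimal Python sketch of vertical fragmentation by projection. The attribute lists follow the slide; the sample Staff rows and the helpers `project` and `natural_join` are assumptions made for illustration. Both fragments keep the key staffNo, so joining them on it rebuilds the original relation.

```python
# Hypothetical sample rows for Staff; the attribute names follow the slide.
staff = [
    {"staffNo": "SG37", "fName": "Ann", "lName": "Beech", "position": "Assistant",
     "sex": "F", "DOB": "1960-11-10", "salary": 12000, "branchNo": "B003"},
    {"staffNo": "SL21", "fName": "John", "lName": "White", "position": "Manager",
     "sex": "M", "DOB": "1945-10-01", "salary": 30000, "branchNo": "B005"},
]


def project(relation, attributes):
    """Relational projection: keep only the named attributes of every tuple."""
    return [{a: row[a] for a in attributes} for row in relation]


def natural_join(left, right, key):
    """Join two fragments on a shared key attribute."""
    index = {row[key]: row for row in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


# Vertical fragments: each holds a subset of the attributes plus the key.
s1 = project(staff, ["staffNo", "position", "sex", "DOB", "salary"])
s2 = project(staff, ["staffNo", "fName", "lName", "branchNo"])

# Reconstruction: join the fragments on the shared key.
rebuilt = natural_join(s1, s2, "staffNo")
print(rebuilt == staff)
```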
Mixed Fragmentation
• Consists of a horizontal fragment that is vertically
fragmented, or a vertical fragment that is horizontally
fragmented.
• Defined using Selection and Projection operations of
relational algebra:
S21 = σ_{branchNo='B003'}(S2)
S22 = σ_{branchNo='B005'}(S2)
S23 = σ_{branchNo='B007'}(S2)
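A minimal Python sketch of the mixed case: the vertical fragment S2 (staffNo, fName, lName, branchNo) is split horizontally by branchNo. The rows of S2 and the helper `select_by_branch` are hypothetical; only the attribute names and branch numbers come from the slides.

```python
# Hypothetical rows of the vertical fragment S2.
s2 = [
    {"staffNo": "SG37", "fName": "Ann", "lName": "Beech", "branchNo": "B003"},
    {"staffNo": "SG14", "fName": "David", "lName": "Ford", "branchNo": "B003"},
    {"staffNo": "SL21", "fName": "John", "lName": "White", "branchNo": "B005"},
    {"staffNo": "SA9", "fName": "Mary", "lName": "Howe", "branchNo": "B007"},
]


def select_by_branch(relation, branch_no):
    """Selection on branchNo, applied to an already projected fragment."""
    return [row for row in relation if row["branchNo"] == branch_no]


# Mixed fragments: horizontal fragments of a vertical fragment.
s21 = select_by_branch(s2, "B003")
s22 = select_by_branch(s2, "B005")
s23 = select_by_branch(s2, "B007")

print(len(s21), len(s22), len(s23))
```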
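Derived Fragmentation
• A horizontal fragment that is based on the horizontal fragmentation of a parent relation.
• Defined using Semijoin operation of relational algebra:
R_i = R ⋉_F S_i, 1 ≤ i ≤ w
where w is the number of horizontal fragments defined on S and F is the join condition.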
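A minimal Python sketch of the semijoin behind a derived fragment, R_i = R ⋉_F S_i. It assumes that R is a Staff relation, that the S_i are horizontal fragments of a Branch relation, and that F is equality on branchNo. The rows and the helper `semijoin` are hypothetical.

```python
def semijoin(r, s, attr):
    """Tuples of r that match at least one tuple of s on attr; only r's attributes are kept."""
    keys = {row[attr] for row in s}
    return [row for row in r if row[attr] in keys]


# Hypothetical child relation R (Staff).
staff = [
    {"staffNo": "SG37", "branchNo": "B003"},
    {"staffNo": "SG14", "branchNo": "B003"},
    {"staffNo": "SL21", "branchNo": "B005"},
    {"staffNo": "SA9", "branchNo": "B007"},
]

# Hypothetical horizontal fragments S_1..S_w of the parent relation (Branch), here w = 3.
branch_fragments = [
    [{"branchNo": "B003", "city": "Glasgow"}],
    [{"branchNo": "B005", "city": "London"}],
    [{"branchNo": "B007", "city": "Aberdeen"}],
]

# One derived fragment of Staff per fragment of Branch.
derived = [semijoin(staff, s_i, "branchNo") for s_i in branch_fragments]
print([len(fragment) for fragment in derived])
```

Each staff tuple ends up in the fragment that corresponds to its branch, so related Staff and Branch data can be placed at the same site.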