Week 12- Distributed Databases

The document discusses the concepts, advantages, and design of Distributed Database Management Systems (DDBMS), highlighting the differences between homogeneous and heterogeneous systems. It covers key topics such as fragmentation, allocation, and replication strategies, as well as the importance of interoperability and open database access. Additionally, it outlines a methodology for designing distributed databases, emphasizing the need for careful consideration of system topology and transaction analysis.

Uploaded by

hayaiman719

DISTRIBUTED DBMS – CONCEPTS AND DESIGN

Fiaz Majeed (University of Gujrat)



Objectives
• Concepts.
• Advantages and disadvantages of distributed
databases.
• Functions and architecture for a DDBMS.
• Distributed database design.

Concepts
Distributed Database
A logically interrelated collection of shared data (and a
description of this data), physically distributed over a
computer network.

Distributed DBMS
Software system that permits the management of the
distributed database and makes the distribution transparent
to users.

Concepts
• Collection of logically-related shared data.
• Data split into fragments.
• Fragments may be replicated.
• Fragments/replicas allocated to sites.
• Sites linked by a communications network.
• Data at each site is under control of a DBMS.
• DBMSs handle local applications autonomously.
• Each DBMS participates in at least one global
application.

Distributed DBMS

Distributed Processing
A centralized database that can be accessed over a
computer network.

Types of DDBMS
• Homogeneous DDBMS
• Heterogeneous DDBMS

Homogeneous DDBMS
• All sites use same DBMS product.
• Much easier to design and manage.
• Approach provides incremental growth and allows
increased performance.

Heterogeneous DDBMS
• Sites may run different DBMS products, with
possibly different underlying data models.
• Occurs when sites have implemented their own
databases and integration is considered later.
• Translations required to allow for:
    • Different hardware.
    • Different DBMS products.
• Typical solution is to use gateways.

Open Database Access and Interoperability


• Open Group formed a Working Group to provide
specifications that will create a database
infrastructure environment where there is:
    • Common SQL API that allows client applications
      to be written that do not need to know vendor of
      DBMS they are accessing.
    • Common database protocol that enables DBMS
      from one vendor to communicate directly with
      DBMS from another vendor without the need for a
      gateway.
    • A common network protocol that allows
      communications between different DBMSs.

Open Database Access and Interoperability


• Most ambitious goal is to find a way to enable a
transaction to span DBMSs from different vendors
without use of a gateway.

Multidatabase System (MDBS)


DDBMS in which each site maintains complete
autonomy.
• DBMS that resides transparently on top of existing
database and file systems and presents a single
database to its users.
• Allows users to access and share data without
requiring physical database integration.

Functions of a DDBMS
• Expect DDBMS to have at least the functionality of a
DBMS.
• Also to have following functionality:
    • Extended communication services.
    • Extended Data Dictionary.
    • Distributed query processing.
    • Extended concurrency control.
    • Extended recovery services.

Reference Architecture for DDBMS


• Due to diversity, no accepted architecture equivalent
to ANSI/SPARC 3-level architecture.
• A reference architecture consists of:
    • Set of global external schemas.
    • Global conceptual schema (GCS).
    • Fragmentation schema and allocation schema.
    • Set of schemas for each local DBMS conforming to
      3-level ANSI/SPARC.

Reference Architecture for DDBMS



Reference Architecture for MDBS


• In a DDBMS, the GCS is the union of all local
conceptual schemas (LCSs).
• In an MDBS, the GCS of a tightly coupled system
involves integration of either parts of the LCSs or
the local external schemas.

Components of a DDBMS

Distributed Database Design


• Three key issues:
    • Fragmentation,
    • Allocation,
    • Replication.

Distributed Database Design


Fragmentation
Relation may be divided into a number of sub-relations,
which are then distributed.
Allocation
Each fragment is stored at site with “optimal” distribution.
Replication
Copy of fragment may be maintained at several sites.

Fragmentation
• Definition and allocation of fragments carried out
strategically to achieve:
    • Locality of Reference.
    • Improved Reliability and Availability.
    • Improved Performance.
    • Balanced Storage Capacities and Costs.
    • Minimal Communication Costs.

Data Allocation
• Four alternative strategies regarding placement of
data:
    • Centralized,
    • Partitioned (or Fragmented),
    • Complete Replication,
    • Selective Replication.

Data Allocation
Centralized: Consists of single database and DBMS
stored at one site with users distributed across the
network.
Partitioned: Database partitioned into disjoint
fragments, each fragment assigned to one site.
Complete Replication: Consists of maintaining
complete copy of database at each site.
Selective Replication: Combination of partitioning,
replication, and centralization.
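
The four strategies can be pictured as different maps from fragments to the sites that hold a copy. The following is a minimal sketch only: the function `allocate`, the round-robin placement, the fragment names and the site names are all illustrative assumptions, not part of the original slides, and selective replication is modelled simply as a partitioned layout plus extra copies of chosen fragments.

```python
from typing import Dict, List, Optional, Set


def allocate(
    fragments: List[str],
    sites: List[str],
    strategy: str,
    extra_copies: Optional[Dict[str, Set[str]]] = None,
) -> Dict[str, Set[str]]:
    """Return a map from fragment name to the set of sites holding a copy."""
    if not sites:
        raise ValueError("at least one site is required")
    if strategy == "centralized":
        # Everything is stored at a single site (arbitrarily, the first one).
        return {f: {sites[0]} for f in fragments}
    if strategy == "partitioned":
        # Disjoint fragments, each assigned to exactly one site (round-robin here).
        return {f: {sites[i % len(sites)]} for i, f in enumerate(fragments)}
    if strategy == "complete_replication":
        # A full copy of every fragment at every site.
        return {f: set(sites) for f in fragments}
    if strategy == "selective_replication":
        # Start from a partitioned layout, then add copies of selected fragments.
        placement = {f: {sites[i % len(sites)]} for i, f in enumerate(fragments)}
        for fragment, more_sites in (extra_copies or {}).items():
            placement[fragment] |= more_sites
        return placement
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical fragments and sites, for illustration only.
fragments = ["P1", "P2", "S1"]
sites = ["London", "Glasgow"]
selective = allocate(fragments, sites, "selective_replication", {"S1": {"Glasgow"}})
print(selective)
```

In a real design the placement would be chosen from usage statistics rather than round-robin; the sketch only shows how the strategies differ in the number of copies per fragment.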

Comparison of Strategies for Data Distribution

Why Fragment?
• Usage
    • Applications work with views rather than entire relations.
• Efficiency
    • Data is stored close to where it is most frequently used.
    • Data that is not needed by local applications is not stored.

Why Fragment?
• Parallelism
    • With fragments as unit of distribution, transaction can be
      divided into several subqueries that operate on fragments.
• Security
    • Data not required by local applications is not stored and so
      not available to unauthorized users.
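
The parallelism point can be illustrated with a small sketch in which one global query is split into a subquery per fragment and the partial results are combined. This is an assumption-laden illustration: the fragments are in-memory lists with made-up rows, the names `subquery` and `global_query` are hypothetical, and a thread pool stands in for sites working concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

# Two horizontal fragments of PropertyForRent (rows are made up for illustration).
P1 = [
    {"propertyNo": "PA14", "type": "House", "rent": 650},
    {"propertyNo": "PG21", "type": "House", "rent": 600},
]
P2 = [
    {"propertyNo": "PL94", "type": "Flat", "rent": 400},
    {"propertyNo": "PG4", "type": "Flat", "rent": 350},
]


def subquery(fragment, max_rent):
    """Local subquery: property numbers in one fragment with rent <= max_rent."""
    return [row["propertyNo"] for row in fragment if row["rent"] <= max_rent]


def global_query(fragments, max_rent):
    """Run one subquery per fragment concurrently, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partial_results = list(pool.map(lambda f: subquery(f, max_rent), fragments))
    return sorted(p for part in partial_results for p in part)


result = global_query([P1, P2], 600)
print(result)
```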

Why Fragment?
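• Disadvantages
    • Performance: global applications that need data from
      several fragments at different sites may run more slowly.
    • Integrity: integrity control may be harder when data and
      functional dependencies are fragmented across sites.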

Types of Fragmentation
• Four types of fragmentation:
    • Horizontal,
    • Vertical,
    • Mixed,
    • Derived.
• Other possibility is no fragmentation:
    • If relation is small and not updated frequently, may be
      better not to fragment relation.

Horizontal and Vertical Fragmentation



Mixed Fragmentation

Horizontal Fragmentation
• Consists of a subset of the tuples of a relation.
• Defined using Selection operation of relational
algebra:
σ_p(R)

• For example:

P1 = σ_{type=‘House’}(PropertyForRent)
P2 = σ_{type=‘Flat’}(PropertyForRent)
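
As a rough illustration of the two fragments above, the sketch below applies a selection to an in-memory relation and checks that the fragments are disjoint and that their union rebuilds the relation. The rows and the helper `select` are assumptions made for the example, and the disjointness and reconstruction checks are standard correctness conditions rather than something stated on the slide.

```python
# PropertyForRent as a list of tuples-as-dicts (rows are made up for illustration).
property_for_rent = [
    {"propertyNo": "PA14", "type": "House", "rent": 650},
    {"propertyNo": "PL94", "type": "Flat", "rent": 400},
    {"propertyNo": "PG4", "type": "Flat", "rent": 350},
]


def select(relation, predicate):
    """Selection: keep the tuples of the relation that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


# Horizontal fragments defined by a predicate on the 'type' attribute.
p1 = select(property_for_rent, lambda t: t["type"] == "House")
p2 = select(property_for_rent, lambda t: t["type"] == "Flat")

# Disjointness: no tuple appears in both fragments.
keys_p1 = {t["propertyNo"] for t in p1}
keys_p2 = {t["propertyNo"] for t in p2}
is_disjoint = keys_p1.isdisjoint(keys_p2)

# Reconstruction: the union of the fragments gives back the original relation.
by_key = lambda t: t["propertyNo"]
is_complete = sorted(p1 + p2, key=by_key) == sorted(property_for_rent, key=by_key)

print(is_disjoint, is_complete)
```

The reconstruction check only succeeds here because every row has type House or Flat; a relation with further property types would need additional fragments.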

Vertical Fragmentation
• Consists of a subset of attributes of a relation.
• Defined using Projection operation of relational
algebra:
Π_{a1, ..., an}(R)
• For example:
S1 = Π_{staffNo, position, sex, DOB, salary}(Staff)
S2 = Π_{staffNo, fName, lName, branchNo}(Staff)
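
The example keeps staffNo in both fragments, which is what allows Staff to be rebuilt by a join. A minimal sketch of that idea follows; the rows and the helpers `project` and `join_on_key` are illustrative assumptions, and staffNo is assumed to be the primary key of Staff.

```python
# Staff as a list of tuples-as-dicts (rows are made up for illustration).
staff = [
    {"staffNo": "SL21", "fName": "John", "lName": "White", "position": "Manager",
     "sex": "M", "DOB": "1945-10-01", "salary": 30000, "branchNo": "B005"},
    {"staffNo": "SG37", "fName": "Ann", "lName": "Beech", "position": "Assistant",
     "sex": "F", "DOB": "1960-11-10", "salary": 12000, "branchNo": "B003"},
]


def project(relation, attributes):
    """Projection: keep only the named attributes of every tuple."""
    return [{a: t[a] for a in attributes} for t in relation]


def join_on_key(left, right, key):
    """Join two fragments on a shared key attribute (assumed unique per tuple)."""
    index = {t[key]: t for t in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


# Vertical fragments; both retain the key staffNo.
s1 = project(staff, ["staffNo", "position", "sex", "DOB", "salary"])
s2 = project(staff, ["staffNo", "fName", "lName", "branchNo"])

# Reconstruction: joining the fragments on staffNo restores the full tuples.
reconstructed = join_on_key(s1, s2, "staffNo")
print(reconstructed == staff)
```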

Mixed Fragmentation
• Consists of a horizontal fragment that is vertically
fragmented, or a vertical fragment that is horizontally
fragmented.
• Defined using Selection and Projection operations of
relational algebra:

σ_p(Π_{a1, ..., an}(R)) or
Π_{a1, ..., an}(σ_p(R))

Example - Mixed Fragmentation


S1 = Π_{staffNo, position, sex, DOB, salary}(Staff)
S2 = Π_{staffNo, fName, lName, branchNo}(Staff)

S21 = σ_{branchNo=‘B003’}(S2)
S22 = σ_{branchNo=‘B005’}(S2)
S23 = σ_{branchNo=‘B007’}(S2)
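
For a fragment such as S21, the two orders of applying Selection and Projection give the same result, because the predicate on branchNo uses only an attribute that the projection keeps. The sketch below checks this on a few made-up rows; the data and the helpers `select` and `project` are assumptions for illustration.

```python
# Staff as a list of tuples-as-dicts (rows are made up for illustration).
staff = [
    {"staffNo": "SG37", "fName": "Ann", "lName": "Beech", "position": "Assistant",
     "sex": "F", "DOB": "1960-11-10", "salary": 12000, "branchNo": "B003"},
    {"staffNo": "SL21", "fName": "John", "lName": "White", "position": "Manager",
     "sex": "M", "DOB": "1945-10-01", "salary": 30000, "branchNo": "B005"},
    {"staffNo": "SA9", "fName": "Mary", "lName": "Howe", "position": "Assistant",
     "sex": "F", "DOB": "1970-02-19", "salary": 9000, "branchNo": "B007"},
]


def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """Projection: keep only the named attributes of every tuple."""
    return [{a: t[a] for a in attributes} for t in relation]


s2_attributes = ["staffNo", "fName", "lName", "branchNo"]
in_b003 = lambda t: t["branchNo"] == "B003"

# Vertical fragment first, then horizontal split (as in the example above).
s21_project_first = select(project(staff, s2_attributes), in_b003)

# Horizontal split first, then vertical fragment.
s21_select_first = project(select(staff, in_b003), s2_attributes)

print(s21_project_first == s21_select_first)
```

If the predicate referred to an attribute dropped by the projection (salary, say), only the select-first order would be possible.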

Derived Horizontal Fragmentation


• A horizontal fragment that is based on horizontal
fragmentation of a parent relation.
• Ensures that fragments that are frequently joined
together are at same site.
• Defined using Semijoin operation of relational
algebra:

R_i = R ⋉_F S_i, 1 ≤ i ≤ w

Example - Derived Horizontal Fragmentation


S3 = σ_{branchNo=‘B003’}(Staff)
S4 = σ_{branchNo=‘B005’}(Staff)
S5 = σ_{branchNo=‘B007’}(Staff)

Could use derived fragmentation for Property:

P_i = PropertyForRent ⋉_{branchNo} S_i, 3 ≤ i ≤ 5
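
A semijoin keeps the tuples of the left relation that have a match in the right relation, without adding any of the right relation's attributes. The sketch below derives the property fragments from the staff fragments in that way; the rows, the helper `semijoin` and the dictionary layout are illustrative assumptions, with only the join attribute branchNo taken from the example above.

```python
# Minimal Staff and PropertyForRent relations (rows are made up for illustration).
staff = [
    {"staffNo": "SG37", "branchNo": "B003"},
    {"staffNo": "SL21", "branchNo": "B005"},
    {"staffNo": "SA9", "branchNo": "B007"},
]
property_for_rent = [
    {"propertyNo": "PG4", "branchNo": "B003"},
    {"propertyNo": "PG36", "branchNo": "B003"},
    {"propertyNo": "PL94", "branchNo": "B005"},
    {"propertyNo": "PA14", "branchNo": "B007"},
]


def semijoin(r, s, attribute):
    """Semijoin: tuples of r whose value of `attribute` also occurs in s."""
    values_in_s = {t[attribute] for t in s}
    return [t for t in r if t[attribute] in values_in_s]


# Parent fragments S3, S4, S5: Staff split horizontally by branch.
staff_fragments = {
    3: [t for t in staff if t["branchNo"] == "B003"],
    4: [t for t in staff if t["branchNo"] == "B005"],
    5: [t for t in staff if t["branchNo"] == "B007"],
}

# Derived fragments P3, P4, P5: properties that match each staff fragment.
property_fragments = {
    i: semijoin(property_for_rent, s_i, "branchNo")
    for i, s_i in staff_fragments.items()
}

for i, fragment in property_fragments.items():
    print(f"P{i}", [t["propertyNo"] for t in fragment])
```

Each property fragment can then be allocated to the same site as its parent staff fragment, so that joins on branchNo stay local.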



Derived Horizontal Fragmentation


• If relation contains more than one foreign key, need
to select one as parent.
• Choice can be based on fragmentation used most
frequently or fragmentation with better join
characteristics.

Distributed Database Design Methodology


1. Use normal methodology to produce a design for
the global relations.
2. Examine topology of system to determine where
databases will be located.
3. Analyze most important transactions and identify
appropriateness of horizontal/vertical
fragmentation.
4. Decide which relations are not to be fragmented.
5. Examine relations on the one side of relationships
and determine a suitable fragmentation schema.
Relations on the many side may be suitable for
derived fragmentation.
