Distributed Databases

A distributed database system consists of multiple interconnected sites that each store fragments of data. The database is logically split into fragments that can be stored across different sites. This allows for data sharing across sites while also providing local autonomy and increased reliability if a site fails. The key aspects of distributed database management include how the data is fragmented, which fragments are replicated, and where the fragments and replicas are located across sites.

Uploaded by

Soumya Vijoy

DISTRIBUTED DATABASES

Distributed Database

In a distributed database system, the database is stored on several computers.
The computers in a distributed system communicate with one another through various communication media, such as high-speed networks or telephone lines.
The computers in a distributed system are also referred to as sites or nodes.
The system consists of a single logical database that is split into a number of fragments.
Each fragment is stored on one or more computers under the control of a separate DBMS.
A logically interrelated collection of shared data that is physically distributed over a computer network is called a distributed database.

Ex:
A bank has branches all over India, and its head office is in Delhi.
Assume the bank maintains local data at each local branch and a copy of the data of all branches at Delhi.
The data is distributed all over India.
This eases query processing both for the local customers of a branch and for global customers.

[Figure: Bank using distributed processing. Sites shown: Mumbai, Delhi, Chennai, Bangalore, and Agra; one site is labeled Head Office and the others Local Branch.]

A distributed database system consists of a collection of sites connected together via some kind of communications network, in which:
each site is a database system site in its own right;
the sites agree to work together, so that a user at any site can access data anywhere in the network exactly as if the data were all stored at the user's own site.

Distributed DBMS.
Software system that permits the
management of the distributed database and
makes the distribution transparent to users.

Characteristics of DDBMS

Collection of logically-related shared data.

Data split into fragments.
Fragments may be replicated.
Fragments/replicas allocated to sites.
Sites linked by a communications network.
Data at each site is under control of a DBMS.
DBMSs handle local applications autonomously.
Each DBMS participates in at least one global
application.

Advantages of DDBMS
1. Data sharing
If a number of different sites are connected to each other, then a user at one site may be able to access data that is available at another site.
For example, in the distributed banking system, it is possible for a user in one branch to access data in another branch.
2. Local Autonomy
The primary advantage of accomplishing data sharing by means of data distribution is that each site is able to retain a degree of control over data stored locally.

In a centralized system, the database administrator of the central site controls the database.
In a distributed system, there is a global database administrator responsible for the entire system.
A part of these responsibilities is delegated to the local database administrator for each site.
Each local administrator may have a different degree of autonomy, which is often a major advantage of distributed databases.
3. Reliability and Availability
If one site fails in a distributed system, the remaining sites may be able to continue operating.
In particular, if data are replicated at several sites, a transaction needing a particular data item may find it at any of those sites. Thus, the failure of a site does not necessarily imply the shutdown of the system.
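The availability argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `REPLICAS` catalog, the site names, and the function `first_available_site` are hypothetical and do not come from any particular DDBMS.

```python
from typing import Dict, List, Optional, Set

# Hypothetical catalog: the sites that hold a replica of each data item.
REPLICAS: Dict[str, List[str]] = {"account": ["Delhi", "Mumbai", "Chennai"]}


def first_available_site(item: str, failed_sites: Set[str]) -> Optional[str]:
    """Return the first site holding a replica of `item` that has not failed."""
    for site in REPLICAS.get(item, []):
        if site not in failed_sites:
            return site
    return None  # every site holding a replica has failed


# Delhi is down, but the transaction can still read the item elsewhere.
print(first_available_site("account", {"Delhi"}))
```

The item becomes unavailable only when every site holding a replica has failed, which is why replication raises availability.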

5. Modular Growth
New nodes (computers) can be added to the network at any time with little difficulty.
6. Speedup of Query Processing
If a query involves data at several sites, it may be possible to split the query into subqueries that can be executed in parallel by several sites.
Such parallel computation allows for faster processing of a user's query.
In those cases in which data is replicated, queries may be directed by the system to the least heavily loaded sites.
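A minimal sketch of both ideas, assuming each site is simulated by an in-memory list and threads stand in for sites working in parallel. `SITE_DATA`, `local_subquery`, `global_total`, and `least_loaded` are hypothetical names chosen for illustration; the balances are those of the account example used in this deck.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

# Hypothetical per-site data: account balances stored at each branch site.
SITE_DATA: Dict[str, List[int]] = {
    "Hillside": [500, 336, 62],
    "Valleyview": [205, 10000, 1123, 750],
}


def local_subquery(site: str) -> int:
    """Subquery executed at one site: total balance stored there."""
    return sum(SITE_DATA[site])


def global_total() -> int:
    """Split the global query into one subquery per site and combine the results."""
    with ThreadPoolExecutor() as pool:
        partial_sums = list(pool.map(local_subquery, SITE_DATA))
    return sum(partial_sums)


def least_loaded(replica_sites: List[str], load: Dict[str, int]) -> str:
    """Pick the replica site with the smallest current load figure."""
    return min(replica_sites, key=lambda site: load[site])


print(global_total())
print(least_loaded(["Delhi", "Mumbai"], {"Delhi": 12, "Mumbai": 3}))
```

In a real system the subqueries would travel over the network, so the gain from parallelism has to be weighed against communication cost.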

Disadvantages of DDBMSs

Complexity - A distributed database is more complicated to set up and maintain than a central database. Managing and controlling a DDBMS is complex.
Security - Security is weaker because data is held at many different sites. A distributed database provides more flexible access, which increases the chance of security violations, since the database can be accessed from every site within the network.
Lack of Standards - There are no tools or methodologies yet to help users convert a centralized DBMS into a distributed DBMS.
Database Design More Complex - Besides the normal difficulties, the design of a distributed database has to consider fragmentation of data, allocation of fragments to specific sites, and data replication.
Cost - Increased complexity and a more extensive infrastructure mean extra costs.
Lack of Experience - Distributed databases are difficult to work with, and as a young field there is not much readily available experience on proper practice.

Local and global transactions

A local transaction accesses data only at the single site at which the transaction was initiated.
A global transaction either accesses data at a site different from the one at which the transaction was initiated, or accesses data at several different sites.
The ACID properties of a local transaction can be ensured in the same way as for a normal (centralized) transaction. Ensuring the ACID properties of a global transaction is more complex, because several sites take part in its execution.
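The distinction can be written as a small predicate. This is a sketch only; the function name `classify_transaction` and the site names are assumptions made for illustration.

```python
from typing import Set


def classify_transaction(origin_site: str, accessed_sites: Set[str]) -> str:
    """Return 'local' if only the originating site is accessed, else 'global'."""
    if accessed_sites <= {origin_site}:
        return "local"
    return "global"


print(classify_transaction("Delhi", {"Delhi"}))            # local
print(classify_transaction("Delhi", {"Mumbai"}))           # global
print(classify_transaction("Delhi", {"Delhi", "Mumbai"}))  # global
```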

Types of DDBMS

Homogeneous DDBMS
Heterogeneous DDBMS

Homogeneous Distributed Database Systems

All sites have identical software and schema.
They are aware of each other and agree to cooperate in processing user requests.
Goal: provide a view of a single database, hiding details of distribution. It appears to the user as a single system.

Homogeneous Database

Identical DBMSs
All data is managed by the distributed DBMS (no exclusively local data).
All access is through one global schema.
The global schema is the union of all the local schemas.

Heterogeneous Distributed Database Systems

Data is distributed across all the nodes.
Different software/schema on different sites.
Different DBMSs may be used at each node.
Local access is done using the local DBMS and schema.
Remote access is done using the global schema.
Goal: integrate existing databases to provide useful functionality.

Typical Heterogeneous Environment

Non-identical DBMSs

Source: adapted from Bell and Grimson, 1992.

Distributed Database Design

The design of a DDBMS introduces three issues:
How to partition the database into fragments
Which fragments to replicate
Where to locate those fragments and replicas
Fragmentation and replication deal with the first two issues; allocation deals with the third.

Fragmentation
A relation may be divided into a number of sub-relations, which are then distributed.

Allocation
Each fragment is stored at the site with "optimal" distribution.

Replication
A copy of a fragment may be maintained at several sites.

Data Fragmentation

Fragmentation divides the relation r into fragments r1, r2, ..., rn which contain sufficient information to reconstruct relation r.
Three rules must be followed:
Completeness - If a relation R is decomposed into fragments R1, R2, ..., Rn, each data item in R must appear in at least one fragment.
Reconstruction - It must be possible to define a relational operation that will reconstruct R from the fragments.
Disjointness - A data item must appear in only one fragment. The exception is the primary key in vertical fragmentation, which is repeated in every fragment.
For horizontal fragmentation, a data item is a tuple.
For vertical fragmentation, a data item is an attribute.
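For horizontal fragments, where a data item is a tuple, the three rules can be checked mechanically. The sketch below is an illustration under that assumption; `check_horizontal_fragments` is a hypothetical helper and reconstruction is taken to mean set union.

```python
from typing import Dict, List, Set, Tuple

Row = Tuple[str, str, int]  # (account_number, branch_name, balance)


def check_horizontal_fragments(relation: Set[Row],
                               fragments: List[Set[Row]]) -> Dict[str, bool]:
    """Check the three fragmentation rules for a horizontal fragmentation."""
    union: Set[Row] = set().union(*fragments)
    return {
        # every tuple of the relation appears in at least one fragment
        "completeness": relation <= union,
        # the union of the fragments gives back exactly the relation
        "reconstruction": union == relation,
        # no tuple appears in more than one fragment
        "disjointness": sum(len(f) for f in fragments) == len(union),
    }


hillside = {("A-305", "Hillside", 500), ("A-226", "Hillside", 336)}
valleyview = {("A-177", "Valleyview", 205)}
print(check_horizontal_fragments(hillside | valleyview, [hillside, valleyview]))
```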

Types of fragmentation:

Three types of fragmentation:

Horizontal
Vertical
Mixed

Another possibility is no fragmentation:

If a relation is small and not updated frequently, it may be better not to fragment it.

Horizontal fragmentation
Each tuple of r is assigned to one or more fragments.
Example: relation account with the following schema:
Account = (account_number, branch_name, balance)
The account relation can be divided into several different fragments, each of which consists of the tuples of accounts belonging to a particular branch. If the banking system has only two branches, Hillside and Valleyview, then there are two different fragments, one per branch.

We reconstruct the relation r by taking the union of all fragments; that is,
r = r1 ∪ r2 ∪ ... ∪ rn

Horizontal Fragmentation of account Relation

account_number   branch_name   balance
A-305            Hillside      500
A-226            Hillside      336
A-155            Hillside      62

account1 = σ branch_name = "Hillside" (account)

account_number   branch_name   balance
A-177            Valleyview    205
A-402            Valleyview    10000
A-408            Valleyview    1123
A-639            Valleyview    750

account2 = σ branch_name = "Valleyview" (account)
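The selection and union steps of this example can be sketched in Python as follows. Relations are modeled as plain lists of tuples, and `select_branch` is a hypothetical helper standing in for the relational selection operator.

```python
from typing import List, Tuple

Row = Tuple[str, str, int]  # (account_number, branch_name, balance)

account: List[Row] = [
    ("A-305", "Hillside", 500),
    ("A-226", "Hillside", 336),
    ("A-177", "Valleyview", 205),
    ("A-402", "Valleyview", 10000),
    ("A-155", "Hillside", 62),
    ("A-408", "Valleyview", 1123),
    ("A-639", "Valleyview", 750),
]


def select_branch(relation: List[Row], branch: str) -> List[Row]:
    """Selection on branch_name: keep the tuples of one branch."""
    return [row for row in relation if row[1] == branch]


account1 = select_branch(account, "Hillside")
account2 = select_branch(account, "Valleyview")

# Reconstruction: the union of the fragments holds the same tuples as account.
reconstructed = set(account1) | set(account2)
print(reconstructed == set(account))
```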

PROJ1: projects with budgets less than $200,000

PROJ2: projects with budgets greater than or equal to $200,000

Vertical fragmentation:

The schema for relation r is split into several smaller schemas.
All schemas must contain a common candidate key (or superkey) to ensure the lossless-join property.
A special attribute, the tuple-id attribute, may be added to each schema to serve as a candidate key.
We can reconstruct the relation by taking the natural join of all fragments:
r = r1 ⋈ r2 ⋈ r3 ⋈ ... ⋈ rn

Vertical Fragmentation of employee_info Relation

branch_name   customer_name   tuple_id
Hillside      Lowman          1
Hillside      Camp            2
Valleyview    Camp            3
Valleyview    Kahn            4
Hillside      Kahn            5
Valleyview    Kahn            6
Valleyview    Green           7

deposit1 = Π branch_name, customer_name, tuple_id (employee_info)

account_number   balance   tuple_id
A-305            500       1
A-226            336       2
A-177            205       3
A-402            10000     4
A-155            62        5
A-408            1123      6
A-639            750       7

deposit2 = Π account_number, balance, tuple_id (employee_info)
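A short sketch of the projection and natural-join steps, using the rows of deposit1 and deposit2 listed above. Relations are modeled as lists of dictionaries; `project` and `natural_join_on_tuple_id` are hypothetical helpers, and the join is specialized to the shared tuple_id attribute.

```python
from typing import Dict, List

Row = Dict[str, object]

COLUMNS = ["branch_name", "customer_name", "account_number", "balance", "tuple_id"]
_RAW = [
    ("Hillside", "Lowman", "A-305", 500, 1),
    ("Hillside", "Camp", "A-226", 336, 2),
    ("Valleyview", "Camp", "A-177", 205, 3),
    ("Valleyview", "Kahn", "A-402", 10000, 4),
    ("Hillside", "Kahn", "A-155", 62, 5),
    ("Valleyview", "Kahn", "A-408", 1123, 6),
    ("Valleyview", "Green", "A-639", 750, 7),
]
employee_info: List[Row] = [dict(zip(COLUMNS, values)) for values in _RAW]


def project(relation: List[Row], attributes: List[str]) -> List[Row]:
    """Projection: keep only the named attributes of every tuple."""
    return [{a: row[a] for a in attributes} for row in relation]


def natural_join_on_tuple_id(left: List[Row], right: List[Row]) -> List[Row]:
    """Natural join of two fragments whose only common attribute is tuple_id."""
    right_by_id = {row["tuple_id"]: row for row in right}
    return [{**l, **right_by_id[l["tuple_id"]]}
            for l in left if l["tuple_id"] in right_by_id]


deposit1 = project(employee_info, ["branch_name", "customer_name", "tuple_id"])
deposit2 = project(employee_info, ["account_number", "balance", "tuple_id"])
reconstructed = natural_join_on_tuple_id(deposit1, deposit2)
print(reconstructed == employee_info)
```

Because tuple_id is a candidate key in both fragments, each tuple on the left matches exactly one tuple on the right, which is what makes the join lossless.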

Horizontal and Vertical Fragmentation

Mixed Fragmentation
A combination of horizontal and vertical strategies.
It is also called hybrid or nested fragmentation.
It consists of a horizontal fragment that is subsequently vertically fragmented, or a vertical fragment that is then horizontally fragmented.
Mixed fragmentation is defined using the select and project operations of relational algebra.
The original relation can be obtained by join and union operations.
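As a rough illustration, the sketch below first fragments a small relation horizontally by branch and then splits only the Hillside fragment vertically. The sample rows and the helper names `select`, `project`, and `join` are assumptions made for this example.

```python
from typing import Dict, List

Row = Dict[str, object]

ROWS: List[Row] = [
    {"tuple_id": 1, "branch_name": "Hillside", "account_number": "A-305", "balance": 500},
    {"tuple_id": 2, "branch_name": "Hillside", "account_number": "A-226", "balance": 336},
    {"tuple_id": 3, "branch_name": "Valleyview", "account_number": "A-177", "balance": 205},
    {"tuple_id": 4, "branch_name": "Valleyview", "account_number": "A-402", "balance": 10000},
]


def select(relation: List[Row], branch: str) -> List[Row]:
    """Selection: keep the tuples of one branch (horizontal step)."""
    return [row for row in relation if row["branch_name"] == branch]


def project(relation: List[Row], attributes: List[str]) -> List[Row]:
    """Projection: keep only the named attributes (vertical step)."""
    return [{a: row[a] for a in attributes} for row in relation]


def join(left: List[Row], right: List[Row]) -> List[Row]:
    """Natural join on the shared tuple_id attribute."""
    right_by_id = {row["tuple_id"]: row for row in right}
    return [{**l, **right_by_id[l["tuple_id"]]}
            for l in left if l["tuple_id"] in right_by_id]


# Horizontal fragmentation first ...
hillside = select(ROWS, "Hillside")
valleyview = select(ROWS, "Valleyview")
# ... then vertical fragmentation of the Hillside fragment only.
hillside_a = project(hillside, ["tuple_id", "branch_name"])
hillside_b = project(hillside, ["tuple_id", "account_number", "balance"])

# Reconstruction: join the vertical pieces, then take the union with the rest.
reconstructed = join(hillside_a, hillside_b) + valleyview
print(sorted(reconstructed, key=lambda r: r["tuple_id"]) == ROWS)
```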

Advantages of Fragmentation

Horizontal:
allows parallel processing on fragments of a relation
allows a relation to be split so that tuples are located where they are most frequently accessed

Vertical:
allows tuples to be split so that each part of the tuple is stored where it is most frequently accessed
tuple-id attribute allows efficient joining of vertical fragments

Disadvantages:

Performance - may be slower when an application needs data from several fragments stored at different sites.
Integrity - integrity control may be more difficult when the data involved in a constraint is split across fragments at different sites.

Data Replication

The system maintains multiple copies of data, stored at different sites, for faster retrieval and fault tolerance.
Two types of replication:
Full replication
Partial replication

Full replication

Full replication of a relation is the case where the relation is stored at all sites.
Fully redundant databases are those in which every site contains a copy of the entire database.
Can be impractical due to the amount of overhead.

Partial replication

Only some important, frequently used fragments are replicated.
Most DDBMSs are able to handle a partially replicated database well.

Unreplicated database

Stores each database fragment at a single site.
No duplicate database fragments.

Advantages of Replication
Availability: failure of a site containing relation r does not result in unavailability of r if replicas exist.
Parallelism: queries on r may be processed by several nodes in parallel.
Reduced data transfer: relation r is available locally at each site containing a replica of r.

Disadvantages of Replication

Increased cost of updates: each replica of relation r must be updated.
Increased complexity of concurrency control: concurrent updates to distinct replicas may lead to inconsistent data unless special concurrency control mechanisms are implemented.
One solution: choose one copy as the primary copy and apply concurrency control operations on the primary copy.
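One way to picture the primary-copy idea is the sketch below. It is a simplification under loud assumptions: all replicas live in one process, a single lock at the primary stands in for concurrency control, and updates are propagated synchronously. The class `PrimaryCopyItem` is hypothetical and not taken from any real system.

```python
import threading
from typing import Callable, Dict, List


class PrimaryCopyItem:
    """A replicated data item whose updates are serialized at one primary copy."""

    def __init__(self, sites: List[str], primary: str, value: int) -> None:
        if primary not in sites:
            raise ValueError("primary must be one of the replica sites")
        self.primary = primary
        self.copies: Dict[str, int] = {site: value for site in sites}
        # Concurrency control is applied only at the primary copy.
        self._primary_lock = threading.Lock()

    def update(self, change: Callable[[int], int]) -> int:
        """Apply `change` under the primary's lock, then propagate to all replicas."""
        with self._primary_lock:
            new_value = change(self.copies[self.primary])
            for site in self.copies:
                self.copies[site] = new_value  # assumed synchronous propagation
            return new_value

    def read(self, site: str) -> int:
        """Read the copy stored at the given site."""
        return self.copies[site]


balance = PrimaryCopyItem(["Delhi", "Mumbai", "Chennai"], primary="Delhi", value=100)
balance.update(lambda v: v + 50)
print(balance.read("Mumbai"))
```

Because every update passes through the same lock, two concurrent updates cannot be applied to different replicas in different orders, at the price of making the primary site a bottleneck.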

Data allocation

Four alternative strategies regarding placement of data:
Centralized
Partitioned (or Fragmented)
Complete Replication
Selective Replication
Data allocation algorithms consider a variety of factors, such as performance, reliability, availability, storage cost, and communication cost.

Centralized data allocation
The entire database is stored at one site, with users distributed across the network.

Partitioned data allocation
The database is partitioned into disjoint fragments, each fragment assigned to one site.

Complete Replication
Consists of maintaining a complete copy of the database at each site.

Selective Replication
Combination of partitioning, replication, and centralization.
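The four strategies can be compared by looking at the fragment-to-site mapping each one produces. The sketch below is illustrative only: the function `allocate`, the round-robin placement for partitioning, and the choice of the first site as the central one are all assumptions, not a real allocation algorithm (which would weigh the cost factors listed above).

```python
from typing import Dict, Iterable, List


def allocate(strategy: str, fragments: List[str], sites: List[str],
             replicated: Iterable[str] = ()) -> Dict[str, List[str]]:
    """Map each fragment to the list of sites that store it."""
    replicated_set = set(replicated)
    # Assumed placement rule for single-copy fragments: round robin over sites.
    round_robin = {f: [sites[i % len(sites)]] for i, f in enumerate(fragments)}
    if strategy == "centralized":
        return {f: [sites[0]] for f in fragments}
    if strategy == "partitioned":
        return round_robin
    if strategy == "complete_replication":
        return {f: list(sites) for f in fragments}
    if strategy == "selective_replication":
        return {f: list(sites) if f in replicated_set else round_robin[f]
                for f in fragments}
    raise ValueError(f"unknown strategy: {strategy}")


fragments = ["account1", "account2"]
sites = ["Delhi", "Mumbai"]
for name in ("centralized", "partitioned", "complete_replication"):
    print(name, allocate(name, fragments, sites))
print("selective_replication",
      allocate("selective_replication", fragments, sites, replicated=["account1"]))
```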

Reference Architecture for DDBMS

Due to the diversity of distributed DBMSs, there is no universally accepted architecture equivalent to the ANSI/SPARC three-level architecture.
A reference architecture consists of:
Set of global external schemas.
Global conceptual schema (GCS).
Fragmentation schema and allocation schema.
Set of schemas for each local DBMS, conforming to the three-level ANSI/SPARC architecture.
Some levels may be missing, depending on the levels of transparency supported.

Global conceptual schema
The global conceptual schema is a logical description of the whole database as if it were not distributed.
In a DDBMS, the GCS is the union of all local conceptual schemas.
Fragmentation and allocation schemas
The fragmentation schema is a description of how the data is to be logically partitioned.
The allocation schema is a description of where the data is to be located.
Local schemas
Each local DBMS has its own set of schemas.
The local mapping schema maps fragments in the allocation schema into external objects in the local database.
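To make the roles of the two schemas concrete, the sketch below resolves a request against the global relation in two steps: the fragmentation schema says which fragment holds the data, and the allocation schema says which sites store that fragment. The dictionary layout, the site names, and the function `sites_for` are hypothetical.

```python
from typing import Dict, List

# Fragmentation schema (assumed layout): relation -> fragment -> branch_name it covers.
FRAGMENTATION_SCHEMA: Dict[str, Dict[str, str]] = {
    "account": {"account1": "Hillside", "account2": "Valleyview"},
}

# Allocation schema (assumed layout): fragment -> sites that store a copy of it.
ALLOCATION_SCHEMA: Dict[str, List[str]] = {
    "account1": ["Site_H"],
    "account2": ["Site_V", "Site_H"],
}


def sites_for(relation: str, branch_name: str) -> List[str]:
    """Find the sites that can answer a query on one branch of a global relation."""
    for fragment, branch in FRAGMENTATION_SCHEMA.get(relation, {}).items():
        if branch == branch_name:
            return ALLOCATION_SCHEMA.get(fragment, [])
    return []  # no fragment covers this branch


print(sites_for("account", "Valleyview"))
```

A user who queries only the global relation name never sees either mapping, which is the kind of transparency the reference architecture is meant to provide.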

Components of a DDBMS

Local DBMS (LDBMS) component - It has its own local system catalog that stores information about the data held at that site.
Data communications (DC) component - The software that enables all sites to communicate with each other.
Global System Catalog (GSC) - The GSC holds information specific to the distributed nature of the system, such as the fragmentation and allocation schemas.
Distributed DBMS (DDBMS) component - The controlling unit of the entire system.
