DDBMS

The document discusses Distributed Database Management Systems (DDBMS), highlighting the shift from centralized to distributed databases that enhance reliability, availability, and performance. It outlines the complexities of system design, including data fragmentation, replication, and the need for effective concurrency control and recovery mechanisms. Key components such as server, client, and communication software are detailed, along with techniques for ensuring global consistency and serializability across distributed transactions.

Uploaded by

mwendikimaiga21

• In a centralised database system, all system components reside at a single computer site. The components include the data, the DBMS software and the associated secondary storage devices, such as disks for on-line database storage and tapes for backup. A centralised database can be accessed remotely via terminals connected to the site.
• In recent years there has been a rapid trend towards distributing computer systems over multiple sites that are interconnected via a communication network.
• A distributed database is a collection of data that belongs logically to the same system but is physically spread over the sites of a computer network. Advantages of a distributed database system include:
• Distributed nature of some database applications: some companies have locations at different sites, so the database serves both users local to a site and global users, e.g. at headquarters.
• Increased reliability and availability: reliability refers to the probability that a system is up at a particular moment, while availability refers to the probability that the system is continuously available during a time interval. When one site fails, the others can continue operating.
• Data sharing while maintaining some measure of local control.
• Improved performance.
Distribution leads to increased complexity in system design and implementation. To achieve the potential advantages above, the following additional functions have to be provided:
• The ability to access remote sites and transmit queries and data among the various sites via a communication network
• The ability to keep track of the data distribution and replication in the DBMS catalog
• The ability to devise execution strategies for queries and transactions that access data from more than one site
• The ability to decide which copy of a replicated data item to access
• The ability to maintain the consistency of copies of a replicated data item
• The ability to recover from individual site crashes and from new types of failures, such as failure of a communication link
In a typical DDBMS it is common to divide the software modules into three levels, namely:
(a) The server software, which is responsible for local data management at a site, much like a centralised DBMS.
(b) The client software, which is responsible for most of the distribution functions. It accesses data distribution information from the DDBMS catalog and processes all requests that require access to more than one site.
(c) The communication software (sometimes in conjunction with a distributed OS), which provides the communication primitives that the client uses to transmit commands and data among the various sites.
• The client is responsible for generating a distributed execution plan for queries and transactions. It ensures the consistency of replicated copies of a data item by employing distributed concurrency control techniques, and it performs global recovery when certain sites fail.
• The client also hides the details of data distribution from the user: it enables the user to write global queries and transactions as though the database were centralised, without specifying the sites at which the data referenced in the query resides. This property is called distribution transparency.
Techniques used in distributed database design include:
(a) Data fragmentation:
This is where decisions are made about which site should store which portions of the database. Before deciding how to distribute the data, the logical units of the database that are to be distributed must be determined.
The types of fragmentation are:
(i) Horizontal fragmentation, where a horizontal fragment of a relation is a subset of the tuples in that relation, e.g. we may store the database information relating to each department at the computer site of that department. For the relation employee we define three horizontal fragments by specifying a condition on an attribute, i.e. (DNO=4), (DNO=5) and (DNO=1).
(ii) Vertical fragmentation, which keeps only certain attributes of the relation, e.g. we may fragment the employee relation into two: the first fragment includes personal information and the second fragment includes work-related information.
(iii) Mixed fragmentation, which intermixes vertical and horizontal fragmentation.
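The fragmentation types above can be illustrated with a minimal Python sketch. The EMPLOYEE rows, the attribute names other than DNO, and the helper functions are all hypothetical and chosen only for illustration. The sketch also assumes that every vertical fragment carries the key attribute (here SSN) so that the original relation can be rebuilt by a join.

```python
# Hypothetical EMPLOYEE relation; only the DNO attribute comes from the notes.
EMPLOYEE = [
    {"SSN": "001", "NAME": "A", "ADDRESS": "Site-1 Road", "SALARY": 100, "DNO": 4},
    {"SSN": "002", "NAME": "B", "ADDRESS": "Site-2 Road", "SALARY": 200, "DNO": 5},
    {"SSN": "003", "NAME": "C", "ADDRESS": "Site-2 Lane", "SALARY": 300, "DNO": 5},
    {"SSN": "004", "NAME": "D", "ADDRESS": "Site-3 Road", "SALARY": 400, "DNO": 1},
]


def horizontal_fragment(relation, predicate):
    """Return the subset of tuples that satisfy the fragment condition."""
    return [row for row in relation if predicate(row)]


def vertical_fragment(relation, attributes, key="SSN"):
    """Keep only the listed attributes, plus the key (assumed) for rejoining."""
    columns = [key] + [a for a in attributes if a != key]
    return [{c: row[c] for c in columns} for row in relation]


def rejoin(left, right, key="SSN"):
    """Rebuild full tuples from two vertical fragments by joining on the key."""
    by_key = {row[key]: row for row in right}
    return [{**row, **by_key[row[key]]} for row in left]


# (i) Horizontal: one fragment per department condition (DNO=4), (DNO=5), (DNO=1).
dept_fragments = {
    dno: horizontal_fragment(EMPLOYEE, lambda row, d=dno: row["DNO"] == d)
    for dno in (4, 5, 1)
}

# (ii) Vertical: personal information versus work-related information.
personal = vertical_fragment(EMPLOYEE, ["NAME", "ADDRESS"])
work = vertical_fragment(EMPLOYEE, ["SALARY", "DNO"])

# (iii) Mixed: a vertical fragment taken from one horizontal fragment.
dept5_names = vertical_fragment(dept_fragments[5], ["NAME"])
```

In this sketch the horizontal fragments are disjoint and together cover the whole relation, and `rejoin(personal, work)` returns the original tuples.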
(b) Data replication and allocation
If a fragment is stored at more than one site, it is said to be replicated. Full replication is the most extreme case, where the whole database is replicated at every site in the distributed system.
No replication is the other extreme, where each fragment is stored at exactly one site. Between these two extremes there is a wide spectrum of partial replication of the data.
Each fragment, or each copy of a fragment, must be assigned to a particular site in the distributed system. This process is called data allocation or data distribution.
The choice of sites and the degree of replication depend on the performance and availability goals of the system and on the types and frequencies of transactions submitted at each site.
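As a rough illustration of allocation and the replication spectrum, the sketch below records an allocation as a mapping from fragment names to the sites holding a copy and classifies it as no, partial or full replication. The fragment names, site names and the `replication_degree` function are hypothetical.

```python
def replication_degree(allocation, all_sites):
    """Classify an allocation as 'none', 'partial' or 'full' replication.

    allocation maps each fragment name to the sites that store a copy of it.
    """
    copies = [set(sites) for sites in allocation.values()]
    if any(not c for c in copies):
        raise ValueError("every fragment must be allocated to at least one site")
    if all(len(c) == 1 for c in copies):
        return "none"      # each fragment is stored at exactly one site
    if all(c == set(all_sites) for c in copies):
        return "full"      # every fragment is stored at every site
    return "partial"


SITES = ["site_a", "site_b", "site_c"]  # hypothetical site names

no_replication = {"EMP_D4": ["site_a"], "EMP_D5": ["site_b"], "EMP_D1": ["site_c"]}
partial = {"EMP_D4": ["site_a", "site_c"], "EMP_D5": ["site_b"], "EMP_D1": ["site_c"]}
full = {name: list(SITES) for name in no_replication}
```

A real allocation decision would also weigh transaction types and frequencies at each site, which this sketch does not model.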
• DDBMSs differ from one another in several ways, depending on:
(i) The degree of homogeneity: if all servers use identical software and all clients use identical software, the DDBMS is called homogeneous; otherwise it is heterogeneous.
• In a heterogeneous system, one server may be a relational DBMS, another a network DBMS, and others object-oriented or hierarchical. In such a case it is necessary to have a canonical system language and to include language translators in the client to translate sub-queries from the canonical language to the language of each server.
(ii) The degree of distribution transparency:
In a DDBMS the cost of communication among sites is a major factor in distributed query optimization. A key technique in distributed query processing is the semi-join operation, which reduces the number of tuples in a relation before it is transferred to another site.
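The idea behind the semi-join can be sketched as follows. The two relations, their attribute names and the site assignment are assumptions made for illustration: EMPLOYEE is imagined at one site and DEPARTMENT at another. Only the join-column values travel first, and the first site then ships just the tuples that will actually take part in the join.

```python
def project(relation, attribute):
    """Distinct values of one attribute (the small message sent between sites)."""
    return {row[attribute] for row in relation}


def semi_join(relation, attribute, join_values):
    """Keep only tuples whose join attribute appears in join_values."""
    return [row for row in relation if row[attribute] in join_values]


def join(left, right, left_attr, right_attr):
    """Plain nested-loop equi-join, used here to compare results."""
    return [{**l, **r} for l in left for r in right if l[left_attr] == r[right_attr]]


# Hypothetical data: EMPLOYEE stored at site 1, DEPARTMENT stored at site 2.
EMPLOYEE = [
    {"NAME": "A", "DNO": 4},
    {"NAME": "B", "DNO": 5},
    {"NAME": "C", "DNO": 5},
    {"NAME": "D", "DNO": 1},
]
DEPARTMENT = [{"DNUMBER": 5, "DNAME": "Research"}]

# Step 1: site 2 sends only the join-column values to site 1.
dept_numbers = project(DEPARTMENT, "DNUMBER")
# Step 2: site 1 reduces EMPLOYEE before shipping it to site 2.
reduced_employee = semi_join(EMPLOYEE, "DNO", dept_numbers)
# Step 3: site 2 joins the reduced relation with DEPARTMENT.
result = join(reduced_employee, DEPARTMENT, "DNO", "DNUMBER")
```

In this example two tuples are shipped instead of four, and the join result is the same as joining the unreduced relation.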
• DDBMSs that support transparency employ query decomposition, which breaks up a query into sub-queries that can be executed at individual sites. Query decomposition also determines which particular replica is referenced, a process called materializing a replica.
• For vertical fragmentation the attribute list of each fragment is kept in the catalog, and for horizontal fragmentation a condition is kept for each fragment.
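A minimal sketch of how a catalog with per-fragment conditions might drive query decomposition and replica selection is given below. The catalog layout, the fragment and site names, and the rule of preferring a local replica are all assumptions, not a description of any particular DDBMS.

```python
# Hypothetical catalog: each horizontal fragment records its condition (a DNO
# value) and the sites that hold a replica of it.
CATALOG = {
    "EMP_D4": {"dno": 4, "sites": ["site_a", "site_c"]},
    "EMP_D5": {"dno": 5, "sites": ["site_b"]},
    "EMP_D1": {"dno": 1, "sites": ["site_c"]},
}


def decompose(wanted_dnos, catalog, local_site):
    """Return (fragment, site) sub-queries for a query restricted to wanted_dnos.

    wanted_dnos is a set of department numbers, or None for "all departments".
    """
    plan = []
    for name, entry in sorted(catalog.items()):
        if wanted_dnos is not None and entry["dno"] not in wanted_dnos:
            continue  # fragment condition contradicts the query predicate
        # Materialize a replica: prefer a local copy, else take the first listed.
        if local_site in entry["sites"]:
            site = local_site
        else:
            site = entry["sites"][0]
        plan.append((name, site))
    return plan
```

For a query on department 4 issued at `site_c`, the sketch produces a single sub-query against the local replica and skips the other fragments entirely.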
Concurrency Control and Recovery for
Distributed Databases
Concurrency control and recovery in a distributed database must specifically address the following:
• Distributed commit (two-phase commit)
• Distributed deadlock
• Failure of communication
• Failure of individual sites
• Dealing with multiple copies of data items
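The distributed commit item in the list above refers to two-phase commit. A heavily simplified sketch of its core rule is given below: the coordinator commits only if every participant votes to commit. The `Participant` class and `two_phase_commit` function are hypothetical, and the sketch assumes no timeouts, logging, or coordinator and link failures, all of which a real protocol must handle.

```python
class Participant:
    """Stand-in for the local transaction manager at one site (hypothetical)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        """Phase 1: vote on whether the local sub-transaction can commit."""
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return 'commit' only when all participants vote yes, else 'abort'."""
    # Phase 1 (voting): ask every site to prepare and collect the votes.
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(votes) else "abort"
    # Phase 2 (decision): broadcast the same global decision to every site.
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.abort()
    return decision
```

A single "no" vote forces every site to abort, which is what keeps the sites from ending in different states.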
• A distributed transaction accesses data stored at more than one location. Each transaction is divided into a number of sub-transactions, one for each site that has to be accessed. A sub-transaction is represented by an agent.
• Consider a transaction T that prints out the names of all staff, using the fragmentation schema with fragments S1, S2, S21, S22 and S23. The user does not need to know that the data is fragmented: database accesses are based on the global schema, so the user does not need to specify fragment names or data locations.
• Three sub-transactions TS3, TS5 and TS7 represent the agents at sites 3, 5 and 7 respectively. Each sub-transaction prints out the names of the staff at that site.
• The transaction manager co-ordinates transactions on behalf of application programs. It communicates with the scheduler, which is responsible for implementing a particular strategy for concurrency control.
• If a failure occurs during a transaction, the recovery manager ensures that the database is restored to the state it was in before the start of the transaction. The recovery manager also restores the database to a consistent state following a system failure.
• The buffer manager is responsible for the transfer of data between disk storage and main memory.
• In a distributed DBMS, these modules still exist in each local DBMS.
• In addition, there is a global transaction manager, or transaction co-ordinator, at each site to co-ordinate the execution of both the global and the local transactions initiated at that site.
• Inter-site communication is handled by the data communication component.
The procedure to execute a global transaction initiated at site S1 is as follows:
• The transaction co-ordinator TC1 at site S1 divides the transaction into a number of sub-transactions using information held in the global system catalog.
• The data communication component at site S1 sends the sub-transactions to the appropriate sites, say S2 and S3.
• The transaction co-ordinators at sites S2 and S3 co-ordinate these sub-transactions. The results of the sub-transactions are communicated back to TC1 via the data communication component.
• Communication between different local databases and different processors (at the same site or at different sites) takes place through message passing in a communication network.
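The procedure above can be sketched in a few lines of Python. The catalog contents, fragment names and staff names are made up, and an ordinary function call stands in for the data communication component. The sketch is only meant to show the split, dispatch and collect pattern.

```python
# Hypothetical global system catalog: fragment name -> site that stores it.
GLOBAL_CATALOG = {"STAFF_A": "S2", "STAFF_B": "S3"}

# Hypothetical local databases at each site.
SITE_DATA = {
    "S2": {"STAFF_A": ["Ann", "Bob"]},
    "S3": {"STAFF_B": ["Cy"]},
}


def split_transaction(fragments, catalog):
    """TC1: group the fragments by site, giving one sub-transaction per site."""
    subtransactions = {}
    for fragment in fragments:
        subtransactions.setdefault(catalog[fragment], []).append(fragment)
    return subtransactions


def run_subtransaction(site, fragments):
    """Local co-ordinator at a site: read the staff names from its fragments."""
    return [name for f in fragments for name in SITE_DATA[site][f]]


def run_global_transaction(fragments, catalog):
    """Split at the initiating site, run at the remote sites, merge the results."""
    subtransactions = split_transaction(fragments, catalog)
    # The function call below stands in for sending a message to each site.
    results = {
        site: run_subtransaction(site, frags)
        for site, frags in subtransactions.items()
    }
    return sorted(name for names in results.values() for name in names)
```

Calling `run_global_transaction(["STAFF_A", "STAFF_B"], GLOBAL_CATALOG)` returns the combined list of names without the caller naming any site.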
The key issues in concurrency control in a distributed DBMS are:
(a) The degree of distribution of database hardware, software and control determines the complexity of a distributed system. The degree of cooperation among the different processors determines the inter-computer message rate and the complexity of control.
(b) The distributed scheduler has to ensure the consistency of the different local databases, in which replication and multi-version data objects may be used. At the same time it needs to ensure the global consistency of the whole collection of databases. Thus the distributed scheduler is essentially a scheduler of schedulers.
(c) In a distributed system no single site may hold all the global information needed for a global consistency check. One therefore has to obtain information about actions at different sites and then combine it to obtain the global information on consistency. This calls for a central coordinator, or for the coordination job to be assigned to one site. Communication costs between sites, as well as communication delays and failures, must therefore be considered.
(d) Conflict graphs (precedence graphs), locks, timestamps and certifier techniques depend on the fundamental notion of a total ordering of events in time. These techniques must be extended in order to achieve a distributed schedule.
• In a distributed schedule each transaction performs actions at several sites 1, 2, 3, ..., S. The sequence of actions performed by a single transaction at any one site is called a sub-transaction. The sequence of actions performed by different transactions on a database at any one site is called a local schedule.
• Thus when many transactions are performing their sub-transactions at many sites, we have a schedule of many local schedules. To ensure the serializability of a distributed schedule it is necessary that each local schedule be serializable, though this may not be a sufficient condition.
• To achieve multisite global serialization, the local information obtained from the different sites must be combined at a co-ordinating site and checked for multisite, or global, consistency. Since compiling this information is communication intensive, one should look for ways of minimizing the communication overhead.
• Each of the techniques used at a centralized site can be extended to suit a distributed environment. They differ, however, in their communication and computational complexity.
Distributed Serializability
• The concept of serializability can be extended
for the distributed environment to cater for
data distribution. If the schedule of
transaction execution at each site is
serializable, then the global schedule (union of
all local schedules) is also serializable provided
local serialization orders are identical.
• This requires that all sub-transactions appear in the same order in the equivalent serial schedule at all sites.
• Thus, if the sub-transaction of $T_i$ at site $S_1$ is denoted $T_i^1$, it must be ensured that if $T_i^1 < T_j^1$, then $T_i^x < T_j^x$ for all sites $S_x$ at which $T_i$ and $T_j$ have sub-transactions.
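The requirement that every pair of transactions appears in the same relative order at every site can be checked mechanically. The sketch below assumes that each site reports its local serialization order as a list of transaction identifiers. The function name and the input format are hypothetical.

```python
from itertools import combinations


def consistent_serial_orders(local_orders):
    """Return True if no two sites order the same pair of transactions differently.

    local_orders maps a site name to the list of transaction ids in that
    site's equivalent serial order.
    """
    first_seen = {}  # unordered pair {Ti, Tj} -> the transaction seen first
    for order in local_orders.values():
        # combinations() keeps list order, so `earlier` precedes `later` here.
        for earlier, later in combinations(order, 2):
            pair = frozenset((earlier, later))
            if first_seen.setdefault(pair, earlier) != earlier:
                return False  # another site serialized this pair the other way
    return True
```

For example, `{"S1": ["T1", "T2"], "S2": ["T2", "T1"]}` fails the check even though each local order is serial on its own.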
• The solutions to concurrency control in a distributed environment are based on the two main approaches of locking and timestamping.
• Given a set of transactions to be executed concurrently:
(a) Locking guarantees that the concurrent execution is equivalent to some (unpredictable) serial execution of those transactions.
(b) Timestamping guarantees that the concurrent execution is equivalent to a specific serial execution of those transactions, corresponding to the order of the timestamps.
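To make the timestamping guarantee concrete, here is a minimal sketch of the basic timestamp-ordering rule for a single site. It assumes timestamps are positive integers assigned when a transaction starts, and it only reports whether an operation is accepted. Restarting a rejected transaction and generating globally unique timestamps across sites are left out, and the class name is hypothetical.

```python
class TimestampOrdering:
    """Basic timestamp ordering over named data items (single-site sketch)."""

    def __init__(self):
        self.read_ts = {}   # item -> largest timestamp that has read it
        self.write_ts = {}  # item -> largest timestamp that has written it

    def read(self, ts, item):
        """Accept the read unless a younger transaction already wrote the item."""
        if ts < self.write_ts.get(item, 0):
            return False  # the transaction would have to be aborted
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return True

    def write(self, ts, item):
        """Accept the write unless a younger transaction already read or wrote it."""
        if ts < self.read_ts.get(item, 0) or ts < self.write_ts.get(item, 0):
            return False  # the transaction would have to be aborted
        self.write_ts[item] = ts
        return True
```

Because every accepted conflicting operation respects timestamp order, the resulting schedule is equivalent to running the transactions serially in timestamp order.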
Global Serializability Conditions
(a) At each site the local schedule is serializable.
(b) At each site the serialization order of transactions dictated by every other site is not violated. That is, for each pair of conflicting actions among transactions $T_i$ and $T_j$, an action of $T_i$ precedes an action of $T_j$ in any local schedule if and only if $T_i$ precedes $T_j$ in the total ordering of ALL transactions at all sites.
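One way to test the two conditions above is to build a precedence graph for each local schedule and check that both the local graphs and their union are acyclic. The sketch below assumes conflict serializability and a hypothetical schedule format of (transaction, operation, item) triples with operations 'R' and 'W'.

```python
def precedence_edges(schedule):
    """Edges (Ti, Tj): an action of Ti conflicts with a later action of Tj."""
    edges = set()
    for i, (t1, op1, item1) in enumerate(schedule):
        for t2, op2, item2 in schedule[i + 1:]:
            # Two actions conflict if they come from different transactions,
            # touch the same item, and at least one of them is a write.
            if t1 != t2 and item1 == item2 and "W" in (op1, op2):
                edges.add((t1, t2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        if any(visit(nxt) for nxt in graph[node]):
            return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(node) for node in graph)


def globally_serializable(local_schedules):
    """Condition (a): each local graph is acyclic. Condition (b): so is the union."""
    local_edges = [precedence_edges(s) for s in local_schedules.values()]
    if any(has_cycle(edges) for edges in local_edges):
        return False
    return not has_cycle(set().union(*local_edges))
```

Two sites that each order T1 and T2 serially, but in opposite directions, pass condition (a) and fail condition (b), which is why local serializability alone is not sufficient.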
• Example: the example below describes a single-version distributed schedule of two transactions 1 and 2 at two sites 1 and 2 on data objects X and Y.
• The pairs in the matrix are concurrent at the different sites 1 and 2.
• For example 2yW1 1yR1 2xW1
• This means 1 and 2 cannot be ordered either as 1,2 or as 2,1 to obtain a total order.
• Now consider the distributed schedule with the local schedules
