Distributed Database Systems Guide

The document provides an overview of Distributed Database Systems (DDBS), highlighting their integration of database and computer network technologies. It discusses the benefits of DDBS, such as scalability, fault tolerance, and improved performance, as well as architectural models like client/server and peer-to-peer. Additionally, it covers distributed query processing, design strategies, and fragmentation rules essential for effective DDBS implementation.

Uploaded by

katiavilma97

Note: a complete chapter dedicated to this presentation is available in the
course booklet on the Moodle platform.

Distributed Databases
RABAH MOKHTARI

Introduction
Distributed database system (DDBS) technology is the union of two
approaches to data processing: database system and computer network
technologies.
1- Database systems
Database systems have taken us from a paradigm of data processing in
which each application defined and maintained its own data to one in which
the data are defined and administered centrally.
2- Computer network technologies
The technology of computer networks, on the other hand, promotes a mode of
work that goes against all centralization efforts.

Distributed Data Processing
➢ Distributed data processing is a computing model in which data
processing is distributed across multiple computers or nodes in a
network.

➢ The processing can be done in parallel, allowing for faster and more
efficient processing of large amounts of data.

➢ Each node in the network has access to a subset of the data, and the
nodes work together to process the data and generate the desired
output.
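
The model described above can be sketched in a few lines. This is only an illustration, assuming that worker threads stand in for network nodes and that the job is a simple sum; the function names `partition`, `local_sum`, and `distributed_sum` are hypothetical and do not come from the slides.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, num_nodes):
    """Split data into num_nodes subsets (round-robin assignment)."""
    return [data[i::num_nodes] for i in range(num_nodes)]


def local_sum(subset):
    """Work done by one node on its own subset of the data."""
    return sum(subset)


def distributed_sum(data, num_nodes=4):
    subsets = partition(data, num_nodes)
    # Each worker thread stands in for one node of the network.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(local_sum, subsets))
    # Combine the partial results into the desired output.
    return sum(partials)


total = distributed_sum(list(range(1, 101)))
```

Each "node" only ever sees its own subset; the final answer exists only after the partial results are combined.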

Distributed Database system
➢ A distributed database system is a database system whose data is spread
across multiple, geographically distributed computers.

➢ In a distributed database system, the data is partitioned or replicated
across multiple nodes, and the nodes work together to process queries and
transactions from clients.

➢ A DDBS is not a system where, despite the existence of a network, the
database resides at only one node of the network.
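
A minimal sketch of partitioning combined with replication, under stated assumptions: the node names, the CRC32-based hash partitioning, and the "next node on the ring" replica placement are illustrative choices, not something the slides prescribe.

```python
import zlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names


def owner(key, nodes=NODES):
    """Hash partitioning: map a key to the index of its primary node."""
    return zlib.crc32(key.encode("utf-8")) % len(nodes)


def placement(key, replicas=2, nodes=NODES):
    """Primary node plus the next (replicas - 1) nodes on the ring."""
    start = owner(key, nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]


def read(key, stores, down=frozenset()):
    """Read from the first replica that is still up (simple failover)."""
    for node in placement(key):
        if node not in down:
            return stores[node].get(key)
    raise RuntimeError("all replicas of %r are unavailable" % key)


# Write every record to all of its replicas.
stores = {node: {} for node in NODES}
for k, v in [("E1", "J. Doe"), ("E2", "M. Smith")]:
    for node in placement(k):
        stores[node][k] = v

# Simulate a failure of the primary node for key "E1".
primary = placement("E1")[0]
value = read("E1", stores, down={primary})
```

Because each key lives on two nodes, the read still succeeds when the primary is marked as down; with a single copy per key it would fail.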

DDBS benefits
➢ Scalability: Distributed database systems can scale horizontally by adding
more nodes to the network. This allows the system to handle large volumes
of data and high transaction rates.

➢ Fault tolerance: Distributed database systems can continue to operate
even if one or more nodes fail. Data can be replicated across multiple nodes,
so if one node fails, another node holding an up-to-date replica can take
over without loss of data.

➢ Improved performance: By distributing the data and processing across
multiple nodes, distributed database systems can improve performance by
processing queries and transactions in parallel.

Distributed DBMS architecture
➢ The architecture of a system defines its structure.
This means that the components of the system are identified, the
function of each component is specified, and the interrelationships
and interactions among these components are defined.

➢ The specification of the architecture of a system requires
identification of the various modules, with their interfaces and
interrelationships, in terms of the data and control flow through the
system.

ANSI/SPARC Architecture
➢ The ANSI/SPARC architecture is an early milestone in the field of database
systems.

➢ It was proposed in the 1970s by the Standards Planning and Requirements
Committee (SPARC) of the American National Standards Institute (ANSI),
when the field of database management was still in its early stages.

➢ It helped to establish many of the fundamental concepts and principles that
are still used today.

The ANSI/SPARC architecture defines three levels of abstraction for a
database system: external, conceptual, and internal.

➢ External level: This level describes how data is viewed by different users
and groups, and how data is accessed and manipulated by applications. Each
external schema is tailored to meet the specific needs of a particular user or
application.

➢ Conceptual level: This level describes the overall logical structure of the
database. The conceptual schema is independent of any particular application
or user, and is used to ensure that all data in the database is consistent and
integrated.

➢ Internal level: This level describes how data is physically stored and
accessed by the computer system. It defines the storage structures and access
methods used by the DBMS to manage the data.

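
The three levels can be loosely illustrated with an in-memory SQLite database. This is an analogy under stated assumptions, not a formal mapping: the table plays the role of the conceptual schema, a view plays the role of an external schema, and an index stands for an internal-level storage decision. The table name `emp`, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the logical structure shared by all applications.
conn.execute(
    "CREATE TABLE emp (eno TEXT PRIMARY KEY, ename TEXT, title TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [("E1", "J. Doe", "Elect. Eng.", 40000.0),
     ("E2", "M. Smith", "Syst. Anal.", 34000.0)],
)

# External level: a view tailored to one group of users (salary hidden).
conn.execute("CREATE VIEW emp_directory AS SELECT eno, ename, title FROM emp")

# Internal level: an access-path decision that queries never mention.
conn.execute("CREATE INDEX emp_by_title ON emp (title)")

cursor = conn.execute("SELECT * FROM emp_directory ORDER BY eno")
columns = [d[0] for d in cursor.description]
directory = cursor.fetchall()
```

Dropping or adding the index changes nothing in the view or in the queries written against it, which is the kind of independence between levels that the architecture is meant to provide.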
Architectural Models for Distributed DBMSs
The ways in which a distributed DBMS can be architected can be classified
along three dimensions: the autonomy of local systems, their distribution,
and their heterogeneity.

Autonomy
Autonomy refers to the distribution of control, not of data. It indicates the
degree to which individual DBMSs can operate independently.
An autonomous system should satisfy the following requirements:
➢ The local operations of the individual DBMSs are not affected by their
participation in the distributed system.
➢ The manner in which the individual DBMSs process queries and optimize
them should not be affected by the execution of global queries that access
multiple databases.
➢ System consistency or operation should not be compromised when
individual DBMSs join or leave the distributed system.

Distribution
➢ Distribution refers to the distribution of data over multiple sites.

➢ There are two alternative classes: client/server distribution and
peer-to-peer distribution (or full distribution).

Heterogeneity
➢ Heterogeneity refers to differences among the component systems of a
distributed database environment in terms of data models, query languages,
and transaction management protocols.

Client/Server architecture
➢ Client/server DBMSs entered the computing scene at the beginning of the
1990s and have made a significant impact on both DBMS technology and
the way we do computing.

➢ The functions are divided into two classes: server functions and client
functions.

➢ This provides a two-level architecture which makes it easier to manage the
complexity of modern DBMSs and the complexity of distribution.

➢ Many widely used DBMSs follow the client/server architecture. Examples
include Microsoft SQL Server, Oracle Database, MySQL, and PostgreSQL.

Peer-To-Peer architecture
➢ After a decade of popularity of client/server computing, peer-to-peer
systems have made a comeback in the last few years as an alternative
architecture for distributed DBMSs.

➢ Apache Cassandra is a good example of a peer-to-peer DDBMS: it uses an
entirely peer-to-peer architecture.

➢ All nodes in a Cassandra cluster can accept reads and writes.

Distributed query processing
➢ Distributed query processing is the process of executing a database query
that involves data stored on multiple nodes or servers in a distributed
database system.
➢ When a query is submitted, it must be broken down into smaller subqueries
that can be executed on different nodes in parallel.
➢ The results must be combined to form the final result set.
➢ Distributed query processing involves several steps, including query
decomposition, localization of the fragments involved, query optimization,
data transfer, local processing, and result consolidation.
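
The decomposition and consolidation steps above can be sketched as a scatter-gather over fragments. This is a simplified illustration: the fragment contents, the site names, and the relation layout `(eno, pno, resp)` are assumptions, and threads stand in for remote nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fragments of an ASG(eno, pno, resp) relation, one per node.
FRAGMENTS = {
    "site1": [("E1", "P1", "Manager"), ("E2", "P1", "Analyst")],
    "site2": [("E3", "P3", "Consultant"), ("E5", "P2", "Manager")],
}


def subquery(rows, resp):
    """Local processing: select and project on one node's fragment."""
    return [eno for (eno, _pno, r) in rows if r == resp]


def run_query(resp):
    # Decomposition: the same subquery is sent to every fragment.
    with ThreadPoolExecutor(max_workers=len(FRAGMENTS)) as pool:
        partials = list(
            pool.map(lambda rows: subquery(rows, resp), FRAGMENTS.values())
        )
    # Consolidation: union of the partial results. Only the selected
    # ENO values, not whole fragments, would cross the network.
    return sorted(e for part in partials for e in part)


managers = run_query("Manager")
```

Pushing the selection down to each fragment is what keeps the transferred data small; shipping the fragments first and filtering afterwards would return the same answer at a higher communication cost.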

The goal of distributed query processing is to minimize the amount of data
that needs to be transferred between nodes and to maximize parallelism in
the execution of subqueries in order to improve query performance.

Query processing problem
➢ The main function of a relational query processor is to transform a
high-level query (typically, in relational calculus) into an equivalent
lower-level query (typically, in some variation of relational algebra).
➢ The low-level query actually implements the execution strategy for the
query. The transformation must achieve both correctness and efficiency.
➢ Since equivalent execution strategies can lead to very different
consumptions of computer resources, the main difficulty is to select the
execution strategy that minimizes resource consumption.

Query processing problem (Example 1)
Consider the following simple user query: “Find the names of employees who
are managing a project”.

The query is expressed in relational calculus using the SQL syntax
[SQL statement shown on the original slide].

Two equivalent relational algebra queries are correct transformations of the
query above [expressions shown on the original slide].

The second query, which avoids the Cartesian product of EMP and ASG, consumes
far fewer computing resources than the first, and thus should be retained.
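
The difference between the two strategies can be made concrete with a small sketch. The slide's own expressions are not available here, so the schema EMP(ENO, ENAME) and ASG(ENO, PNO, RESP), the condition RESP = "Manager", and the sample tuples are all assumptions. The first function selects from the full Cartesian product; the second selects the manager assignments first and only then joins them with EMP.

```python
# Toy relations; schema and tuples are illustrative assumptions.
EMP = [("E1", "J. Doe"), ("E2", "M. Smith"), ("E3", "A. Lee")]
ASG = [("E1", "P1", "Manager"), ("E2", "P1", "Analyst"), ("E3", "P2", "Manager")]


def strategy_product(emp, asg):
    """Cartesian product first, then selection and projection."""
    product = [(e, a) for e in emp for a in asg]
    names = [e[1] for (e, a) in product if e[0] == a[0] and a[2] == "Manager"]
    return names, len(product)


def strategy_select_join(emp, asg):
    """Selection on ASG first, then a hash join with EMP."""
    managers = [a for a in asg if a[2] == "Manager"]
    by_eno = {e[0]: e[1] for e in emp}
    names = [by_eno[a[0]] for a in managers if a[0] in by_eno]
    return names, len(managers)


# Each call returns the answer and the size of its intermediate result.
names1, size1 = strategy_product(EMP, ASG)
names2, size2 = strategy_select_join(EMP, ASG)
```

Both strategies return the same names, but the intermediate result of the first grows with the product of the two relation sizes, while that of the second is bounded by the number of manager assignments.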

Query processing problem
➢ In a centralized context, query execution strategies can be well expressed
in an extension of relational algebra.
➢ The main role of a centralized query processor is to choose, for a given
query, the best relational algebra query among all equivalent ones.
➢ In a distributed system, relational algebra is not enough to express
execution strategies. It must be supplemented with operators for exchanging
data between sites.
➢ In addition to ordering the relational algebra operators, the distributed
query processor must also select the best sites to process data, and possibly
the way data should be transformed.

Query processing problem (Example 2)
➢ We consider the following query [query shown on the original slide].

➢ We assume that relations EMP and ASG are horizontally fragmented
[fragment definitions shown on the original slide].

➢ Fragments ASG1, ASG2, EMP1, and EMP2 are stored at sites 1, 2, 3, and 4,
respectively, and the result is expected at site 5.
➢ Two equivalent distributed execution strategies for the above query are
possible [strategies shown on the original slide].
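
To see why the choice of strategy matters, here is a rough cost sketch. The two strategies and every number below are assumptions made for illustration, since the slide's own figures are not available: one strategy filters and joins at the sites holding the fragments and ships only the result, the other ships every fragment to the result site first. Costs are counted in abstract units per tuple accessed or transferred.

```python
TUPLE_ACCESS = 1     # assumed cost of reading one tuple
TUPLE_TRANSFER = 10  # assumed cost of shipping one tuple between sites

EMP_SIZE = 400       # assumed total tuples in EMP1 + EMP2
ASG_SIZE = 1000      # assumed total tuples in ASG1 + ASG2
MANAGERS = 20        # assumed ASG tuples satisfying the selection


def cost_ship_selected():
    """Select at the ASG sites, join at the EMP sites, ship the result."""
    select_asg = MANAGERS * TUPLE_ACCESS        # assumes indexed selection
    ship_to_emp = MANAGERS * TUPLE_TRANSFER
    join_at_emp = MANAGERS * TUPLE_ACCESS * 2   # selected tuple + matching EMP tuple
    ship_result = MANAGERS * TUPLE_TRANSFER
    return select_asg + ship_to_emp + join_at_emp + ship_result


def cost_ship_everything():
    """Ship every fragment to the result site and process there."""
    ship_emp = EMP_SIZE * TUPLE_TRANSFER
    ship_asg = ASG_SIZE * TUPLE_TRANSFER
    select_asg = ASG_SIZE * TUPLE_ACCESS        # full scan, no index
    join = EMP_SIZE * MANAGERS * TUPLE_ACCESS   # nested-loop join
    return ship_emp + ship_asg + select_asg + join


cost_a = cost_ship_selected()
cost_b = cost_ship_everything()
```

Under these assumptions the first strategy costs 460 units and the second 23,000, a factor of 50. The exact ratio depends entirely on the assumed sizes and unit costs, but the gap shows why communication cost dominates the comparison.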

Distributed database design
In the design of a distributed DBMS, the distribution of applications involves
two things:
➢ The distribution of the distributed DBMS software, and
➢ The distribution of the application programs that run on it.

Two major strategies have been identified for designing distributed
databases: the top-down approach and the bottom-up approach.

Top-down approach [diagram on the original slide]
Distribution design [diagram on the original slide]

Fragmentation alternatives
Vertical and horizontal fragmentation
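
The two alternatives can be sketched on a toy relation. The relation EMP, its attribute names, the sample rows, and the fragmentation predicate on `eno` are assumptions chosen for illustration. Horizontal fragmentation keeps whole rows that satisfy a predicate; vertical fragmentation keeps a subset of columns and repeats the key so that rows can be joined back together.

```python
# Toy EMP relation as a list of dicts; attribute names are assumptions.
EMP = [
    {"eno": "E1", "ename": "J. Doe", "title": "Elect. Eng."},
    {"eno": "E4", "ename": "J. Miller", "title": "Programmer"},
    {"eno": "E6", "ename": "L. Chu", "title": "Elect. Eng."},
]


def horizontal_fragment(relation, predicate):
    """Subset of rows (all columns) satisfying the predicate."""
    return [row for row in relation if predicate(row)]


def vertical_fragment(relation, attributes, key="eno"):
    """Subset of columns; the key is kept so rows can be re-joined."""
    keep = [key] + [a for a in attributes if a != key]
    return [{a: row[a] for a in keep} for row in relation]


emp1 = horizontal_fragment(EMP, lambda r: r["eno"] <= "E3")
emp2 = horizontal_fragment(EMP, lambda r: r["eno"] > "E3")
names = vertical_fragment(EMP, ["ename"])
titles = vertical_fragment(EMP, ["title"])
```

Here `emp1` and `emp2` split the rows between them, while `names` and `titles` each hold every row but only some of the columns.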

Correctness Rules of Fragmentation
➢ Completeness
➢ Reconstruction
➢ Disjointness
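
As usually stated, completeness requires that every data item of the relation appears in at least one fragment, reconstruction requires that the relation can be rebuilt from its fragments, and disjointness requires that a data item placed in one fragment does not appear in another. The sketch below checks these rules for horizontal fragments only, where reconstruction is a union; for vertical fragments reconstruction is a join on the key, and disjointness applies to non-key attributes. The relation and its tuples are hypothetical.

```python
def is_complete(relation, fragments):
    """Every tuple of the relation appears in at least one fragment."""
    covered = set().union(*fragments)
    return set(relation) <= covered


def is_disjoint(fragments):
    """No tuple appears in more than one fragment."""
    seen = set()
    for fragment in fragments:
        if seen & set(fragment):
            return False
        seen |= set(fragment)
    return True


def reconstruct(fragments):
    """For horizontal fragments, reconstruction is the union."""
    return set().union(*fragments)


ASG = [("E1", "P1"), ("E2", "P1"), ("E5", "P2")]  # hypothetical tuples
asg1 = [t for t in ASG if t[0] <= "E3"]
asg2 = [t for t in ASG if t[0] > "E3"]
fragments = [asg1, asg2]
```

With these fragments, `is_complete(ASG, fragments)` and `is_disjoint(fragments)` both hold, and `reconstruct(fragments)` gives back the tuples of `ASG`. Using complementary predicates such as `<= "E3"` and `> "E3"` is what guarantees the first and third rules by construction.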

Common questions

What are the main benefits of using a distributed database system as opposed to a centralized database system?
Distributed database systems offer several key benefits over centralized systems, including scalability, fault tolerance, and improved performance. These systems can scale horizontally by adding more nodes, allowing them to handle large data volumes and high transaction rates. They are fault tolerant because data is replicated across multiple nodes, enabling the system to continue operating despite node failures. Additionally, by distributing data and processing across nodes, distributed database systems can execute queries and transactions in parallel, which improves overall performance.

What challenges are associated with distributed query processing, and how can these be addressed?
Distributed query processing involves executing queries across multiple nodes, which presents challenges such as locating the fragments involved and transferring data efficiently. These challenges can be addressed by decomposing a query into subqueries that run where the data resides, which maximizes data locality and minimizes data transfer. Query optimization and parallel execution of subqueries further reduce resource consumption and execution time.

How has client/server architecture impacted DBMS technology since its rise in the 1990s?
Client/server architecture has significantly impacted DBMS technology by dividing functions into two levels, client functions and server functions. This structure offloads data-intensive tasks to servers and allows clients to remain lightweight. The architecture made systems easier to manage and influenced the design of DBMSs such as Oracle Database, MySQL, and PostgreSQL.

What does the autonomy of local systems in a distributed DBMS entail, and why is it important?
Autonomy in a distributed DBMS refers to the degree to which individual database management systems operate independently. Local operations should be unaffected by participation in the distributed environment. This autonomy matters because it ensures that local query processing and optimization proceed without being influenced by global queries that access multiple databases, and that system consistency and operation are not compromised when individual DBMSs join or leave the distributed system.

Discuss how the ANSI/SPARC model has influenced modern database systems despite its age.
The ANSI/SPARC model, although developed in the 1970s, significantly influenced modern database systems by establishing foundational concepts and principles that are still used today. Its three levels of abstraction (external, conceptual, and internal) have shaped database design by supporting data independence and by accommodating diverse application requirements.

Why is scalability considered a significant advantage of distributed database systems, and how is it typically achieved?
Scalability is a key advantage of distributed database systems because it allows them to manage growth in data volume and transaction rates. This is typically achieved through horizontal scaling, where additional nodes are added to the network to share the workload, spreading data processing and storage across the system and reducing bottlenecks.

Explain the differences between the external, conceptual, and internal levels of the ANSI/SPARC architecture in database systems.
The ANSI/SPARC architecture describes three levels of abstraction for a database system. The external level defines how data is viewed and accessed by users and applications, with each external schema tailored to specific needs. The conceptual level describes the overall logical structure of the database, ensuring data consistency and integration, while remaining independent of particular applications or users. The internal level describes how data is physically stored and accessed by the system, defining the storage structures and access methods used by the DBMS.

How does peer-to-peer architecture in distributed database management systems differ from client/server architecture?
In peer-to-peer architecture, every node in the network is equal and can act as both client and server, enabling direct interactions without centralized control, which improves scalability and avoids a single point of failure. Conversely, client/server architecture divides functions into server functions (which manage resources and provide services) and client functions (which request services), offering a structured and manageable system in which the server can become a bottleneck if it is overloaded.

In the context of distributed databases, what are vertical and horizontal fragmentation, and what rule ensures data fragmentation correctness?
Vertical fragmentation divides a table into smaller tables that each hold a subset of the columns, while horizontal fragmentation divides a table into subsets of rows. Three correctness rules apply: completeness ensures that no data is lost when the table is fragmented, reconstruction ensures that the original table can be rebuilt from its fragments, and disjointness ensures that data is not duplicated across fragments.

How does fault tolerance in distributed databases enhance system reliability and what mechanisms support this feature?
Fault tolerance enhances reliability in distributed databases by allowing continuous operation despite node failures. It is supported by data replication across multiple nodes, so that if one node fails, another node holding a replica can take over. Redundant data storage and failover are the key mechanisms that keep the database accessible and consistent under failures.


