Lec 10 Distributed Databases System

The document provides an overview of distributed database systems, covering architecture types such as client-server, peer-to-peer, and multi-tier, as well as management techniques like data replication and partitioning. It details key components of a Distributed Database Management System (DDBMS) and discusses distributed query processing and fault tolerance mechanisms. Challenges associated with distributed databases, including coordination complexity and consistency versus availability trade-offs, are also highlighted.

Uploaded by

mhariskhan513

Distributed Databases System

Topics Covered
⚫ Distributed Database Architecture
⚫ Distributed Database Management Techniques
⚫ Data Replication and Partitioning
⚫ Components of Distributed Database
⚫ Distributed Query Processing and Fault Tolerance
Distributed Database
⚫ A distributed database is a collection of multiple,
logically interrelated databases distributed over a
computer network.
⚫ It appears to users as a single database, but the data is
actually stored across multiple physical locations,
which may be geographically dispersed.
⚫ The management system that coordinates and provides
access to this database is called a Distributed Database
Management System (DDBMS).
Distributed Database Architecture
1. Client-Server Architecture
⚫ Description: The system is divided into two
roles—clients (request services) and servers (provide
services).
⚫ Example: Clients send SQL queries to a central server
which processes and returns the result.
⚫ Use Case: Common in traditional DBMS setups with
centralized processing.
Distributed Database Architecture
2. Peer-to-Peer (P2P) Architecture
⚫ Description: All nodes (sites) are equal; each can act as
both client and server.
⚫ Advantages: High availability, fault tolerance, and
scalability.
⚫ Example: Blockchain databases, some NoSQL
systems.
Distributed Database Architecture
3. Multi-Tier Architecture
⚫ Description: Involves three or more
layers—presentation (UI), application (logic), and data
(storage).
⚫ Advantages: Modularity, better security, scalability.
⚫ Use Case: Web applications and enterprise distributed
systems.
Distributed Database Architecture
4. Federated Architecture (Heterogeneous DDB)
⚫ Description: Independent databases are integrated
while retaining autonomy.
⚫ Types:
⚫ Loosely Coupled: Minimal coordination; suitable for
dynamic environments.
⚫ Tightly Coupled: Centralized control over global schema.
⚫ Example: A university integrates data from different
departments.
Distributed Database Architecture
5. Cluster-Based Architecture
⚫ Description: Multiple servers (nodes) work together as a
single database system.
⚫ Advantages: High performance, failover support.
⚫ Example:
1. Apache Cassandra: NoSQL database with a peer-to-peer
clustered architecture used by Netflix, Facebook, etc.
2. Oracle RAC (Real Application Clusters): Traditional
RDBMS with a clustered setup for high availability and
load balancing.
3. Google Spanner: globally distributed, clustered relational
database used internally by Google and also offered as a
cloud service. It supports SQL queries, strong
consistency, and horizontal scaling.
Architecture   Centralized?  Data Control            Scalability  Fault Tolerance  Best Use Case
Client-Server  Yes           Central Server          Moderate     Low              Small distributed systems
Peer-to-Peer   No            Distributed             High         High             Decentralized systems (e.g., P2P sharing)
Multi-tier     Yes           Tiered (DB in backend)  High         Moderate         Enterprise applications
Federated      No            Independent systems     Moderate     Moderate         Integration of multiple DBs
Cluster-based  Semi          Shared/Partitioned      Very High    Very High        High-availability & cloud DBs
Key Components in Architecture
⚫ Distribution Transparency:
⚫ Users perceive the database as a single logical entity, unaware
of the actual physical distribution.
⚫ Includes:
⚫ Location transparency: Users don’t need to know where the data
resides.
⚫ Replication transparency: Users are unaware of data replication.
⚫ Fragmentation transparency: Users don’t see the data fragmentation
(horizontal/vertical).
⚫ Data Independence:
⚫ Logical and physical data independence is maintained, similar
to centralized databases.
⚫ Autonomy:
⚫ Each site can control its own data, providing local autonomy.
Key Components in Architecture
⚫ Concurrency Control:
⚫ Ensures that transactions running simultaneously at
different sites do not conflict with one another.
⚫ Reliability and Availability:
⚫ Distributed systems are more fault-tolerant; if one site
fails, others can continue to operate.
⚫ Scalability:
⚫ Easier to expand the database system by adding more
sites.
Distributed Database Management
Techniques
1. Data Replication
Purpose: Improve data availability and fault tolerance.
Types:
⚫ Master-Slave: One node accepts writes; the others replicate its changes.
⚫ Multi-Master: All nodes can write; need conflict resolution.
Benefits:
⚫ Improved read performance
⚫ Fault tolerance
⚫ Load balancing
Challenges:
⚫ Data consistency
⚫ Synchronization overhead
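A minimal sketch of master-slave replication, assuming in-memory dictionaries stand in for database nodes and that every write is forwarded synchronously. The class names Master and Replica are hypothetical and chosen only for illustration.

```python
class Replica:
    """Read-only copy that applies changes shipped from the master."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)


class Master:
    """Single writer that forwards every write to all replicas."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas

    def write(self, key, value):
        self.data[key] = value
        # Synchronous propagation keeps replicas consistent but adds
        # latency to each write (the synchronization overhead above).
        for replica in self.replicas:
            replica.apply(key, value)


replicas = [Replica(), Replica()]
master = Master(replicas)
master.write("acct:1", 100)
reads = [r.read("acct:1") for r in replicas]
```

Reads can be served by any replica, which is where the improved read performance and load balancing come from. A real system would typically also have to handle replicas that lag behind or are unreachable.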
Distributed Database Management
Techniques
2. Data Partitioning (Sharding)
Definition: Splitting a large database into smaller, faster, more manageable parts
(shards).
Types:
⚫ Horizontal Partitioning: Divide rows.
⚫ Vertical Partitioning: Divide columns.
⚫ Range-based, Hash-based, List-based sharding.
Benefits:
⚫ Scalability
⚫ Faster query performance
⚫ Resource optimization
Challenges:
⚫ Cross-shard queries
⚫ Complex joins
⚫ Data rebalancing
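The hash-based and range-based sharding strategies listed above can be sketched as two routing functions. This is an illustrative sketch only: the function names, the choice of SHA-256, and the example boundaries are assumptions, not part of the lecture.

```python
import bisect
import hashlib


def hash_shard(key, num_shards):
    """Hash-based sharding: a stable hash of the key picks the shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def range_shard(value, upper_bounds):
    """Range-based sharding: upper_bounds is a sorted list of exclusive
    upper limits; values at or above the last bound go to the final shard."""
    return bisect.bisect_right(upper_bounds, value)


# Three shards: ids below 1000, ids 1000-1999, ids 2000 and above.
bounds = [1000, 2000]
shard_for_1500 = range_shard(1500, bounds)
shard_for_user = hash_shard("user42", 4)
```

Hash-based routing spreads keys evenly but scatters neighbouring values, so range scans become cross-shard queries. Range-based routing keeps neighbouring values together but can create hot shards.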
Distributed Database Management
Techniques
3. Allocation
Definition: Allocation refers to the strategy used to distribute data
fragments or entire databases across multiple sites or nodes in a
distributed database system (DDBS). The goal is to optimize
performance, availability, reliability, and cost.
Types:
⚫ Centralized Allocation: All data is stored at a single central site.
⚫ Partitioned (Fragmented) Allocation: Database is divided into
fragments (horizontal, vertical, or mixed) and each fragment is stored
at a different site.
⚫ Replicated Allocation: Copies of the same data are stored at multiple
sites.
Benefits:
⚫ Optimize Performance
⚫ Availability
⚫ Reliability
Distributed Database Management
Techniques
Challenges:
⚫ Data Redundancy and Consistency
⚫ Optimal Data Placement: where to allocate data
fragments to minimize access time, communication
cost, and storage cost.
⚫ Load Balancing
⚫ Network Latency and Failures
⚫ Dynamic Access Patterns
⚫ Scalability
Components of a Distributed DBMS
⚫ Transaction Manager: Ensures consistency and
ACID properties across sites.
⚫ Query Processor: Decomposes queries and routes
subqueries to appropriate sites.
⚫ Communication Manager: Manages communication
between sites.
⚫ Concurrency Control Manager: Ensures correct
concurrent transaction execution.
⚫ Recovery Manager: Handles failures and restores the
system.
Distributed Query Processing and Fault
Tolerance
⚫ Distributed Query Processing is the process of
decomposing a high-level user query into subqueries,
executing them at the appropriate remote sites, and
then assembling the results to present a unified answer
to the user.
⚫ DQP refers to the methods and techniques used to
process a user's database query in a distributed
database system.
⚫ The goal is to execute queries efficiently by
minimizing communication cost, response time, and
resource usage while ensuring correctness and
completeness of results.
Phases of Distributed Query Processing
1. Query Decomposition:
⚫ The high-level SQL query is parsed and transformed
into a relational algebra or logical representation.
⚫ It is analyzed for syntactic and semantic correctness.
2. Data Localization:
⚫ Identify where the required data (relations/fragments)
is stored.
⚫ Convert logical relations into physical fragments based
on fragmentation and allocation information.
Phases of Distributed Query Processing
3. Query Optimization:
⚫ Generate multiple query execution plans (QEPs).
⚫ Select the most cost-effective plan based on:
⚫ Communication cost
⚫ Local processing cost
⚫ Data transfer time
⚫ Join strategies
4. Local Optimization and Execution:
⚫ Each subquery is sent to its corresponding site.
⚫ Local DBMSs optimize and execute their subqueries.
Phases of Distributed Query Processing
5. Result Assembly:
⚫ Subquery results are transferred to a coordinating site.
⚫ Final result is constructed (e.g., through joins, unions,
aggregations).
⚫ Output is returned to the user.
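The plan-selection step of the optimization phase can be illustrated with a toy cost model. This is a sketch under stated assumptions: the Plan fields, the per-byte network cost, and the two candidate plans are all hypothetical numbers chosen for illustration, not measurements.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    bytes_shipped: int    # data moved between sites
    local_cpu_ms: float   # processing time at the sites


def plan_cost(plan, ms_per_byte=0.001):
    """Total cost = communication cost + local processing cost."""
    return plan.bytes_shipped * ms_per_byte + plan.local_cpu_ms


def choose_plan(plans, ms_per_byte=0.001):
    """Pick the candidate plan with the lowest estimated cost."""
    return min(plans, key=lambda p: plan_cost(p, ms_per_byte))


plans = [
    Plan("ship whole fragments to coordinator", 5_000_000, 40.0),
    Plan("filter at each site, ship matches", 200_000, 65.0),
]
best = choose_plan(plans)
```

With these assumed numbers the second plan wins because it ships far less data, even though it does more local work. This reflects why communication cost tends to dominate in distributed query optimization.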
Example
Assume relation Employee is horizontally fragmented across
Site A and Site B:
SELECT name FROM Employee WHERE salary > 50000;

⚫ Query Decomposition: Break the query into:
SELECT name FROM Employee_A WHERE salary > 50000;
SELECT name FROM Employee_B WHERE salary > 50000;
⚫ Execution: Each subquery runs locally at Site A and Site B.
⚫ Result Assembly: Results from both sites are merged and
returned to the user.
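The example above can be tried out with a small sketch that uses two in-memory SQLite databases as stand-ins for Site A and Site B. The sample rows and the helper names make_site and run_distributed are assumptions made for illustration. Each site's local table is simply called Employee and plays the role of Employee_A or Employee_B.

```python
import sqlite3


def make_site(rows):
    """Create an in-memory database holding one horizontal fragment."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employee (name TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO Employee VALUES (?, ?)", rows)
    return conn


# Hypothetical sample data for the two fragments.
site_a = make_site([("Asha", 62000), ("Ben", 41000)])
site_b = make_site([("Chen", 75000), ("Dina", 50000)])

SUBQUERY = "SELECT name FROM Employee WHERE salary > ?"


def run_distributed(sites, threshold):
    """Run the subquery at every site, then merge (union) the results."""
    names = []
    for conn in sites:
        names.extend(row[0] for row in conn.execute(SUBQUERY, (threshold,)))
    return sorted(names)


result = run_distributed([site_a, site_b], 50000)
```

Because the fragments are disjoint, result assembly here is a plain union of the partial results. A join or aggregation would need more work at the coordinating site.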
Fault Tolerance
⚫ Fault Tolerance in distributed databases refers to the
system's ability to continue functioning correctly even
when one or more components fail.
⚫ The main goal is to ensure data integrity, availability,
and system reliability, despite failures in hardware,
software, or the network.
⚫ It ensures that transactions are processed correctly, and
the system can recover automatically or with minimal
manual intervention.
Mechanisms to Achieve Fault Tolerance
1. Replication:
⚫ Data is stored at multiple sites.
⚫ If one site fails, another replica can serve the data.
⚫ Must maintain data consistency through synchronization.
2. Commit Protocols (for transaction atomicity):
⚫ Ensure that either all parts of a distributed transaction commit, or
none do.
⚫ Two-Phase Commit (2PC):
⚫ Phase 1: Coordinator asks all sites to prepare.
⚫ Phase 2: Based on responses, coordinator tells them to commit or
abort.
⚫ Three-Phase Commit (3PC):
⚫ Adds a "pre-commit" phase to reduce uncertainty in the event of
failure.
Mechanisms to Achieve Fault Tolerance
3. Logging and Recovery:
⚫ Write-ahead logs (WALs) record changes before they are applied to the database.
⚫ After failure, logs are used to redo or undo transactions to ensure
consistency.
⚫ Checkpointing periodically saves system state to reduce
recovery time.
4. Failover and Redundancy:
⚫ Automatic switching to a standby system or site when a failure
occurs.
⚫ May involve active-passive (hot standby) or active-active (load
sharing) configurations.
5. Timeouts and Retry Mechanisms:
⚫ Detect failures by expecting timely responses.
⚫ Retry failed communications or redirect requests.
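The two-phase commit protocol from item 2 above can be sketched as follows. This is a simplified illustration, not a complete implementation: the Participant class and its state names are hypothetical, and the sketch leaves out logging, timeouts, and coordinator failure, which a real protocol must handle.

```python
class Participant:
    """One site taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "INIT"

    def prepare(self):
        # Vote yes only if the local part of the transaction can commit.
        self.state = "READY" if self.can_commit else "ABORTED"
        return self.can_commit

    def commit(self):
        self.state = "COMMITTED"

    def abort(self):
        self.state = "ABORTED"


def two_phase_commit(participants):
    # Phase 1: the coordinator asks every site to prepare and collects votes.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if every site voted yes; otherwise abort all.
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMIT"
    for p in participants:
        p.abort()
    return "ABORT"


outcome = two_phase_commit([Participant("A"), Participant("B")])
```

A single "no" vote aborts the transaction at every site, which is how the protocol enforces the all-or-nothing rule.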
Example
A transaction to transfer money between accounts in two different
sites is in progress.
⚫ Site A successfully debits the amount.
⚫ Before Site B can credit the amount, Site B crashes.

Without fault tolerance:

Data inconsistency arises—money is lost.

With fault tolerance:

⚫ The system detects the failure.
⚫ Logs at Site A allow rollback (undo debit), or
⚫ System retries when Site B recovers, or
⚫ Uses a backup replica of Site B to complete the credit.
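The transfer scenario above can be sketched with a per-site undo log plus a retry loop. This is only an illustration under simplifying assumptions: the Site class, its `up` flag, and the use of ConnectionError to signal a crashed site are hypothetical, and everything runs in one process.

```python
class Site:
    """An account at one site, with a simple undo log."""

    def __init__(self, balance, up=True):
        self.balance = balance
        self.up = up
        self.log = []  # undo records: (operation, amount)

    def debit(self, amount):
        self.log.append(("debit", amount))  # log first, then change
        self.balance -= amount

    def credit(self, amount):
        if not self.up:
            raise ConnectionError("site is down")
        self.log.append(("credit", amount))
        self.balance += amount

    def undo_last(self):
        op, amount = self.log.pop()
        self.balance += amount if op == "debit" else -amount


def transfer(src, dst, amount, attempts=2):
    """Debit src, then try to credit dst; roll back if dst stays down."""
    src.debit(amount)
    for _ in range(attempts):
        try:
            dst.credit(amount)
            return True
        except ConnectionError:
            continue  # retry in case the site has recovered
    src.undo_last()  # undo the debit so no money is lost
    return False


site_a = Site(500)
site_b = Site(200, up=False)
completed = transfer(site_a, site_b, 100)
```

Here the transfer fails and the debit at Site A is undone, so both balances are unchanged. Once Site B is back up, the same call completes normally.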
Challenges
⚫ Complex coordination among sites
⚫ Trade-off between consistency and availability (CAP
theorem)
⚫ Maintaining performance under failure scenarios
⚫ Cost of redundant hardware and data replication
