Distributed DBMS
UNIT I
Introduction:
Distributed Data Processing:
Distributed Data Processing (DDP) is a method where computational tasks and data are
divided across multiple interconnected computers (or nodes) in a network. These systems
work together to process data more efficiently than a single centralized system.
In simple terms, rather than storing all data in one place and processing it on one machine,
distributed data processing breaks down tasks and processes them in parallel across
different locations.
Key Concepts:
1. Data Distribution:
o Data is stored across multiple physical locations.
o Can be fragmented (split into parts) or replicated (copies stored in multiple
locations).
2. Parallel Processing:
o Tasks are split and processed simultaneously on multiple nodes.
o This reduces processing time significantly.
3. Coordination:
o A central coordinator or a set of protocols ensures tasks are executed in the
correct sequence.
o Ensures consistency and synchronization across all nodes.
4. Transparency:
o The system hides the complexity of distribution from the user.
o The user feels like they're working with a single system.
Advantages:
Distributed Database Systems (DDBS) –
Each site (or node) in a DDBS is typically connected via a computer network and can
manage its own data independently. However, the sites work together to support query
processing, transaction management, and data consistency across the entire system.
Key Components:
1. Distributed Database:
o The actual data that is distributed over various sites.
o May be partitioned (fragmented) or replicated.
2. Distributed Database Management System (DDBMS):
o Software that manages the distributed database.
Fragmentation: Data is divided into pieces (fragments), each stored at different sites.
Replication: Copies of the same data are stored at multiple sites to ensure availability.
Hybrid: A combination of fragmentation and replication.
Location Transparency: Users don’t need to know the physical location of data.
Replication Transparency: Users are unaware of how many copies of data exist.
Advantages:
Real-World Examples:
Banking Systems: Branches in different cities access and update data stored locally
and globally.
E-commerce: Customer data, orders, and inventory are managed across different
regional servers.
Cloud-Based Services: Like Google, Amazon, and Facebook, which handle massive
data across global data centers.
Promises of DDBSs –
Transparency -
The DDBS hides the complexity of distribution from users. This includes:
Location Transparency: Users can access data without knowing where it's stored.
Replication Transparency: Users don’t need to know how many copies of the data
exist.
Fragmentation Transparency: Users are unaware of how data is split and stored across
sites.
Transaction Transparency: Distributed transactions appear just like local ones.
Failure Transparency: The system recovers from failures without user intervention.
Data replication ensures that data is available from other sites if one fails.
Scalability -
Can handle growing data volumes and increasing user load efficiently.
Performance -
Resource Sharing -
Enables sharing of data and processing power across different departments or locations.
Modularity -
Complicating factors –
Distributed database systems are powerful, but they are also harder to manage than
centralized databases due to the following complications:
1. Network Issues
Latency, packet loss, and network failures can disrupt data transfer between sites.
Performance may degrade due to slow communication links.
4. Concurrency Control
Multiple users accessing distributed data simultaneously increases the risk of conflicts.
Coordinating concurrency control across sites is harder than in a single system.
5. Distributed Transactions
6. Consistency Maintenance
8. System Heterogeneity
Different sites may use different DBMS software, hardware, operating systems, or data
models.
Makes integration and standardization more complex.
9. Failure Handling
Detecting and recovering from partial failures (like one node going down) requires
robust mechanisms.
Ensuring the system stays consistent after a failure is not easy.
Problem areas –
Distributed Query Processing -
Challenge: How to minimize communication cost, access time, and balance load across
sites.
Example: A query that joins tables stored in different cities can be very slow without
optimization.
Distributed Transaction Management -
Challenge: Coordinating commit and rollback over the network, and handling failures during transactions.
Example: A failure during a banking transfer between two branches must not lead to
inconsistent balances.
Concurrency Control -
Problem: Multiple users might try to access/update the same data simultaneously.
Example: Two users updating the same account balance from different locations at the
same time.
Deadlock Management -
Challenge: Detecting deadlocks across the network and resolving them without affecting performance.
Example: Site A waits for Site B, while Site B is waiting for Site A — classic deadlock.
Failure Recovery -
Challenge: Ensuring automatic recovery, logging, and data integrity after a failure.
Security -
Challenge: Securing data during transfer and storage, and managing user access across sites.
Example: One site may have different security policies than another.
Heterogeneity -
Example: One site uses “Customer_ID,” another uses “CID” for the same data.
There are three major models based on how much autonomy each site or node has:
1. Design Autonomy -
Definition:
Each site has the freedom to define its own database schema, data model, and even use a
different DBMS software.
Characteristics:
Sites may use different data models (e.g., relational, object-oriented, NoSQL).
Sites may store data using different naming conventions or structures.
Integration becomes complex due to schema and semantic heterogeneity.
Example:
A hospital stores patient data in a relational database, while a clinic stores data in a NoSQL
document-based database. Both want to share data, but their structures differ.
Pros:
Cons:
2. Communication Autonomy -
Definition:
Each site controls when and how it communicates or participates in the distributed system.
Characteristics:
Example:
In a federated university system, some universities may allow their student databases to be
queried by others only during admission season.
Pros:
Cons:
3. Execution Autonomy
Definition:
Each site can execute its local operations independently without needing to wait for global
decisions.
Characteristics:
Example:
A retail store processes customer billing locally even if it’s part of a large retail chain.
Central servers sync data only at end of day.
Pros:
Cons:
+--------------------+
| Site A (Own DBMS) |
+--------------------+
↑ ↓ Communication Autonomy
+--------------------+
| Site B (Own Schema) |
+--------------------+
Distribution –
Distribution refers to the logical or physical placement of data across multiple
interconnected locations (sites, nodes, or servers) in a Distributed Database System.
Instead of storing all data in a single central database, it is split and stored across different
sites that are connected via a network.
1. Fragmentation
Example: A customer database where customers from the North region are stored in server
A, and those from the South in server B.
2. Replication-
Example: Product inventory data is replicated across warehouses to ensure fast local
access.
3. Hybrid Distribution-
Example: Product data is fragmented by region and replicated in nearby data centers.
Location Transparency: Users don’t need to know where the data is stored.
Replication Transparency: Users are unaware of how many copies exist.
Fragmentation Transparency: Users see the data as one unified dataset.
Client/Server Architecture –
Different sites may use different DBMS products (e.g., Oracle, SQL Server, MySQL).
The systems may differ in data models, query languages, hardware, or operating
systems.
Despite differences, they are connected via a network and work together to process
queries and transactions.
1. Client
2. Server
3. Middleware
Key Features -
Interoperability: Supports multiple DBMSs with different data models and query languages.
Transparency: Users don't need to know the data's physical location or DBMS type.
Scalability: Easy to add new servers or DBMSs.
Fault Tolerance: If one server fails, others can still work.
Example -
All are managed under a heterogeneous DDBMS using client/server architecture. A client
from any branch can:
Advantages -
Peer to peer –
In Peer-to-Peer DDBMS architecture, all nodes (or sites) in the distributed system are
considered equal. Each site:
Decentralized Control: No single site is in charge. Every peer can operate independently.
Autonomy: Each node controls its own data and DBMS.
Heterogeneity: Different peers may use different DBMSs, OSs, hardware, data models, etc.
Collaboration: Peers communicate directly to fulfill distributed queries or transactions.
Components -
1. Peers (Nodes)
o Each peer:
Runs a DBMS (could be different from others).
Stores a portion of the distributed database.
Can query, update, or share data with other peers.
2. Communication Layer
o Handles message passing between peers.
o Ensures that peers can send requests and responses effectively.
3. Query Processing & Translation Mechanism
o Needed to translate a query from one peer’s format to the target peer’s DBMS
format.
Example -
Challenges -
MDBS –
MDBS stands for Multidatabase System.
It's a system that enables users to access and manipulate multiple autonomous databases,
which may be heterogeneous and distributed across different sites.
Key features:
Heterogeneity: Supports different types of DBMSs (e.g., Oracle, MySQL, MongoDB).
Autonomy: Each local DBMS works independently.
Transparency: Users don't see the complexity of distributed and heterogeneous systems.
Non-intrusive: Does not require changes in the local databases.
MDBS Architecture -
MDBS uses a middleware layer (often called a Global Database Management System –
GDBMS) between the user and the local DBMSs.
Components:
1. User/Application
o Sends queries as if querying a single global database.
2. Global Schema
o A unified logical view of all the underlying databases.
o Hides differences in data models, query languages, and schemas.
3. MDBS Middleware (GDBMS)
o Translates global queries to local queries.
o Sends sub-queries to respective local DBMSs.
o Collects, integrates, and sends the final result back to the user.
4. Local DBMSs
o Independent databases with their own DBMSs.
o Handle sub-queries and return results to middleware.
Example Scenario -
Transparency in MDBS:
Location Transparency: Users don’t know where data is stored.
Replication Transparency: Users don’t see duplicated data.
Heterogeneity Transparency: Users don’t deal with different query languages or data models.
Schema Transparency: Users see one global schema, not the individual ones.
Advantages
UNIT II
Data Distribution Alternatives:
1. Centralized
2. Fragmentation
3. Replication
4. Localized Data
5. Hybrid approaches
Key characteristics of localized data:
Autonomy: Each site operates independently.
Simplicity: Easy to manage, no complex synchronization needed.
No Redundancy: Data is not replicated across sites.
Low Communication Overhead: Since data access is local, network usage is minimal.
Example -
If an employee from Mumbai accesses the system, only the Mumbai DB is involved.
Advantages -
Disadvantages -
Use Cases -
Example -
Employee(EmpID, Name, Address, Phone, Salary, Department)
Fragment 1: EmpID, Name, Address
Fragment 2: EmpID, Phone, Salary, Department
EmpID is kept in both fragments as it is the primary key and is needed to reconstruct the
original table using a join.
Improves performance: Users at different locations may only need specific columns.
Enhances security: Sensitive data (like salary) can be kept separate.
Saves bandwidth: Only required attributes are transferred or accessed.
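A minimal sketch of how these two fragments could be materialized (using CREATE TABLE ... AS SELECT; the fragment names match the reconstruction query below):
sql
-- Fragment 1: identifying information (EmpID is the primary key)
CREATE TABLE Fragment1 AS
SELECT EmpID, Name, Address FROM Employee;

-- Fragment 2: the remaining attributes, again carrying EmpID
CREATE TABLE Fragment2 AS
SELECT EmpID, Phone, Salary, Department FROM Employee;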
You can get the original table back by performing a natural join or equijoin on the primary
key (e.g., EmpID):
sql
SELECT *
FROM Fragment1
NATURAL JOIN Fragment2;
Advantages -
Disadvantages -
When to Use -
You want to separate sensitive attributes like salary, login credentials, etc.
Horizontal Fragmentation divides a table by rows (tuples) based on certain conditions. Each
fragment contains a subset of rows, but all columns.
Definition:
Primary Horizontal Fragmentation divides a table based on conditions applied to the table's
own attributes.
Example:
Table: EMPLOYEE
EMPLOYEE(EmpID, Name, Age, Dept, City), with a sample row such as (104, Neha, 35, IT, Bangalore)
Fragmentation:
sql
SELECT * FROM EMPLOYEE WHERE City = 'Mumbai';
sql
SELECT * FROM EMPLOYEE WHERE City != 'Mumbai';
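A minimal sketch of materializing these two fragments and reconstructing the original table (fragment names are illustrative; the UNION-based reconstruction is the one listed in the summary table below):
sql
CREATE TABLE EMPLOYEE_Mumbai AS
SELECT * FROM EMPLOYEE WHERE City = 'Mumbai';

CREATE TABLE EMPLOYEE_Other AS
SELECT * FROM EMPLOYEE WHERE City != 'Mumbai';

-- Reconstruction: the union of the disjoint fragments returns the original rows
SELECT * FROM EMPLOYEE_Mumbai
UNION ALL
SELECT * FROM EMPLOYEE_Other;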
Definition:
Example:
Two tables:
Now, we want to fragment EMPLOYEE so that each employee is in the same site as their
department.
That means:
sql
EMPLOYEE_Fragment_1 = SELECT * FROM EMPLOYEE
WHERE DeptID IN (SELECT DeptID FROM DEPARTMENT WHERE Location = 'Delhi');
sql
EMPLOYEE_Fragment_2 = SELECT * FROM EMPLOYEE
WHERE DeptID IN (SELECT DeptID FROM DEPARTMENT WHERE Location = 'Mumbai');
Disadvantages -
Summary Table -
Type Based On How it Works Query Reconstruction
Prima Conditions on same Filters rows using conditions like UNION of all
ry table City='Delhi' fragments
Deriv Related table's Uses JOIN or IN to select JOIN with related
ed fragmentation matching rows fragments
Hybrid –
This method gives maximum flexibility for optimizing performance, security, and data
access patterns in distributed systems.
✅ Example:
✅ Advantages:
❌ Disadvantages:
When to Use:
When different regions need different rows and different departments need different columns
When applications frequently access certain subsets of both rows and columns
In large, complex distributed systems needing optimization and modularity
General guidelines for fragmentation –
1. Completeness
✅ All data from the original relation must appear in the fragments.
No data should be lost. When we reconstruct the fragments, we must get back the entire
original table.
Example:
If we fragment a STUDENT table into STUDENT_Mumbai and STUDENT_Delhi, together they
must contain all rows.
2. Reconstructability (Recomposability)
✅ It must be possible to rebuild the original relation from the fragments using relational
operations like:
Why?
To ensure the original data view is not lost and can be used if needed by global queries.
3. Disjointness
✅ In horizontal fragmentation, the same row should not appear in multiple fragments.
Note: In vertical fragmentation, attributes may overlap only if the primary key is repeated
(for reconstruction).
4. Minimality
Fragment only as much as is actually needed; each fragment should be justified by a concrete requirement such as:
Query performance
Data locality
Security or privacy
5. Data Locality
6. Application Affinity
Look at how applications use the data—which columns and rows they mostly access.
This ensures:
Faster queries
Less data movement
Better user experience
7. Security
✅ Sensitive data (like salary, passwords, etc.) can be placed in separate fragments with restricted access.
8. Ease of Management
correctness rules –
1. Completeness
Rule: Every item of data in the original relation must appear in at least one fragment.
Goal: Ensure no data is lost during fragmentation.
✅ This guarantees that you don’t lose any tuples or attributes when breaking the table
apart.
Example:
If a CUSTOMER table has 1000 rows, all 1000 rows must be distributed across fragments—
none should be missing.
2. Reconstruction (Recomposability)
Rule: It must be possible to rebuild the original relation from its fragments using relational operations (UNION for horizontal fragments, JOIN for vertical fragments).
Goal: Ensure that we can get back the original table if needed.
Example:
Fragments A and B of EMPLOYEE can be joined (vertically) or unioned (horizontally) to
rebuild the full EMPLOYEE table.
3. Disjointness
Rule: In horizontal fragmentation, each tuple (row) should appear in only one
fragment.
For vertical fragmentation, overlapping is allowed only for the primary key (needed for
joining).
Note: This rule may be relaxed in some advanced DDBMS where replication is involved, but
for pure fragmentation, disjointness is essential.
Transparency –
Distribution transparency hides the details of data distribution from users. It includes:
Location transparency
Fragmentation transparency
Replication transparency
Access and naming transparency, etc.
Location Transparency -
Definition:
Location Transparency means that the user can access data without knowing its physical
location (site) in the network.
In other words, the data looks like it’s all in one place, even if it's stored across multiple
locations.
Example:
Site 1: CUSTOMER_Mumbai
Site 2: CUSTOMER_Delhi
sql
SELECT * FROM CUSTOMER WHERE City = 'Mumbai';
✅ With location transparency, the user does not need to specify that the "Mumbai" data is
at Site 1.
The DDBMS handles it internally.
sql
SELECT * FROM SITE1.CUSTOMER_Mumbai;
This is complex and tightly couples the query to the site structure.
How It Works:
The DDBMS uses middleware, data dictionary, and query processors to locate the
data.
It rewrites and optimizes the query based on data location and network efficiency.
Fragmentation –
Fragmentation Transparency means that users are not aware of how a database table has
been fragmented (broken into parts) and distributed across different sites.
The system hides the details of fragmentation (horizontal, vertical, or hybrid) from users.
Why It Matters?
In a fragmented database:
Example:
A user runs:
sql
SELECT * FROM EMPLOYEE WHERE Country = 'India';
With fragmentation transparency, the DDBMS knows it should query EMPLOYEE_India only,
even though the user wrote just EMPLOYEE.
Types of Fragmentation:
1. Horizontal Fragmentation
→ Rows (tuples) are divided across fragments.
2. Vertical Fragmentation
→ Columns (attributes) are divided across fragments.
3. Hybrid Fragmentation
→ Combination of both horizontal and vertical.
No Transparency: The user must know the fragment structure and write queries accordingly (full knowledge of fragmentation is required).
How It Works:
The Query Processor figures out where each piece of data is located.
It rewrites the user's query to target the correct fragments.
It merges the results and gives a unified output.
Replication –
Replication Transparency means the user does not need to know that data is copied and
stored at multiple sites.
The DDBMS manages all copies of data behind the scenes, ensuring consistency and
availability.
What is Replication?
Replication means storing copies of the same data at multiple sites, so that the data stays available and can be accessed quickly from nearby locations.
Example:
Suppose the PRODUCT table is replicated at three sites:
Mumbai
Delhi
Bangalore
User query:
sql
SELECT * FROM PRODUCT WHERE Category = 'Electronics';
The user doesn't know (or care) where the query is executed.
The system automatically uses the nearest or least loaded replica.
The user sees a single, consistent result.
Replication Types -
Full Replication: The entire database is copied at every site.
Partial Replication: Only some tables/fragments are copied.
No Replication: Only one copy of the data exists (not ideal in a DDBMS).
How DDBMS Ensures Transparency:
The DDBMS relies on a Global Data Dictionary (GDD) that records how data is distributed across sites.
Without a GDD, the system (and the user) lacks centralized knowledge about where and how the data is distributed.
This creates major challenges when executing queries.
Example: Suppose the EMPLOYEE table is horizontally fragmented by country and stored as:
EMP_INDIA at Site A
EMP_USA at Site B
✅ With GDD:
User query:
sql
SELECT * FROM EMPLOYEE WHERE Country = 'India';
Without GDD:
sql
SELECT * FROM SITE_A.EMP_INDIA WHERE Country = 'India';
➡️This creates tight coupling between application and data location — difficult to maintain
or scale.
When a table in a distributed system is fragmented or replicated, the GDD keeps track of:
how the table is fragmented (the fragmentation conditions),
where each fragment or replica is stored (the site locations), and
how the original table can be reconstructed.
This means the GDD holds complete knowledge of data distribution across the entire system.
Suppose you have a table CUSTOMER that is horizontally fragmented by Region across several sites. A user issues:
sql
SELECT * FROM CUSTOMER WHERE Region = 'Asia';
All this happens automatically — the user doesn’t need to know the internal structure or
location.
Location Transparency: Users write normal queries without knowing where data is
stored.
Efficient Query Routing: System can send queries to the right site, avoiding
unnecessary network traffic.
Example on fragmentation –
Sample Table: EMPLOYEE
EmpID  EmpName  Dept   Salary  City
101    Aditi    HR     30000   Mumbai
102    Ramesh   IT     50000   Pune
103    Neha     Sales  35000   Delhi
104    Karan    IT     52000   Mumbai
105    Riya     HR     31000   Chennai
1. Horizontal Fragmentation -
Example:
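One possible fragmentation of the sample EMPLOYEE table above, split on City (the two-way split is illustrative):
sql
-- Fragment 1 (Mumbai site): rows 101 and 104
SELECT * FROM EMPLOYEE WHERE City = 'Mumbai';

-- Fragment 2 (other sites): rows 102, 103 and 105
SELECT * FROM EMPLOYEE WHERE City != 'Mumbai';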
Use:
Each site handles local employee data. Queries related to Mumbai employees are routed
only to Fragment 1.
2. Vertical Fragmentation -
In vertical fragmentation, columns (attributes) are split, usually keeping the primary key in
all fragments.
Example:
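One possible vertical split of the sample table, keeping the primary key EmpID in both fragments so the original can be rebuilt with a join (the column grouping is illustrative):
sql
-- Fragment A: general employee details
SELECT EmpID, EmpName, Dept, City FROM EMPLOYEE;

-- Fragment B: sensitive payroll data
SELECT EmpID, Salary FROM EMPLOYEE;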
3. Hybrid Fragmentation -
Example:
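A possible hybrid scheme for the sample table: fragment rows by City first, then split the Mumbai fragment vertically (all conditions and groupings are illustrative):
sql
-- Horizontal step: keep only Mumbai rows; vertical step: split their columns
SELECT EmpID, EmpName, City FROM EMPLOYEE WHERE City = 'Mumbai';
SELECT EmpID, Dept, Salary FROM EMPLOYEE WHERE City = 'Mumbai';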
UNIT III
Semantic Data Control:
Semantic Data Control refers to the techniques used in a Distributed Database
Management System (DDBMS) to manage how users interact with data meaningfully and
securely. It ensures that users see the data they’re allowed to see, in the way that makes
the most sense to them — without exposing the complexities of data distribution.
View Management -
What is a View? - A view is a virtual table — it doesn’t store data physically, but is
created by a query that pulls data from one or more base tables.
Example:
sql
CREATE VIEW IT_Employees AS
SELECT EmpID, EmpName, Salary
FROM EMPLOYEE
WHERE Dept = 'IT';
Here, users querying IT_Employees will only see IT department employees, even though
the data may be stored in various fragments or locations.
In a distributed system, a view can also hide fragmentation from users.
Assume the EMPLOYEE table is horizontally fragmented into EMP_Pune, EMP_Mumbai, and EMP_Delhi, stored at three different sites:
sql
CREATE VIEW All_Employees AS
SELECT * FROM EMP_Pune
UNION
SELECT * FROM EMP_Mumbai
UNION
SELECT * FROM EMP_Delhi;
To users, it appears as one single All_Employees table — they don’t need to know about
fragmentation.
What is Authentication?
Authentication is the process of verifying the identity of a user or system trying to access a
database.
It ensures that only authorized users are allowed to connect and perform actions in the
database system.
a. Internal Authentication
Username and password are stored within the DBMS (e.g., Oracle, MySQL,
PostgreSQL).
Passwords are often stored in hashed form.
Suitable for smaller environments.
b. External Authentication
c. Third-Party Authentication
7. Example (MySQL)
Creating a User:
sql
CREATE USER 'abhi'@'localhost' IDENTIFIED BY 'StrongPass@123';
Granting Privileges:
sql
GRANT SELECT, INSERT ON CompanyDB.* TO 'abhi'@'localhost';
Viewing Users:
sql
SELECT User, Host FROM mysql.user;
8. Authentication in Distributed Databases
In Distributed DBMS:
OS authentication –
Operating System (OS) Authentication allows users to access a Database Management
System (DBMS) without entering a separate database username and password.
Instead, the DBMS uses the user's OS login credentials to verify their identity.
✅ Centralized User Management – You manage users at the OS level, not separately in the
DBMS.
✅ Simplified Login – Users don’t need to remember multiple passwords.
✅ Enhanced Security – Uses secure OS-level protocols and tools.
✅ Integration – Often used in enterprise environments where OS authentication is already in
place.
How It Works -
In Oracle, users are created with the prefix OPS$ (default) to enable OS authentication.
sql
CREATE USER OPS$ABHI IDENTIFIED EXTERNALLY;
GRANT CONNECT TO OPS$ABHI;
Now, if OS user ABHI logs in from the OS shell, Oracle recognizes and allows them in.
Example: If your Linux user is abhi, and there’s a PostgreSQL user abhi, you can log in with:
bash
psql -U abhi
Security Considerations -
Advantages -
Disadvantages -
Access Rights –
What are Access Rights?
Access rights (also known as privileges or permissions) define what actions a user or role is
allowed to perform on database objects like tables, views, or procedures.
They are essential for controlling user activities and securing data within a database.
Example (SQL):
sql
GRANT SELECT, INSERT ON Employees TO Abhi;
REVOKE INSERT ON Employees FROM Abhi;
Instead of assigning permissions to individual users, permissions are assigned to roles, and
users are assigned roles.
Example:
sql
CREATE ROLE HR_Manager;
GRANT SELECT, UPDATE ON Employees TO HR_Manager;
GRANT HR_Manager TO Abhi;
Best Practices -
Semantic Integrity Control –
It enforces data validity and consistency based on the meaning of the data, not just its format or type.
In a Centralized DBMS -
In centralized systems, all data resides in a single database, so enforcing semantic integrity
is straightforward.
Enforcement:
In a Distributed DBMS -
In distributed systems, the database is spread across multiple sites, which introduces
additional challenges in maintaining semantic integrity.
Challenges:
Enforcement Techniques:
a. Global Constraints Management
Sites enforce constraints locally, but sync with others for global constraints.
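For reference, semantic integrity rules are typically declared as constraints on the tables themselves; a minimal sketch (table and column names are illustrative):
sql
CREATE TABLE DEPARTMENT (
  DeptID   INT PRIMARY KEY,
  DeptName VARCHAR(50) UNIQUE
);

CREATE TABLE EMPLOYEE (
  EmpID  INT PRIMARY KEY,
  DeptID INT,
  Salary DECIMAL(10,2),
  CONSTRAINT chk_salary CHECK (Salary > 0),                              -- domain rule
  CONSTRAINT fk_dept FOREIGN KEY (DeptID) REFERENCES DEPARTMENT(DeptID)  -- referential integrity
);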
Costs of Enforcement in a Centralized DBMS:
a. Performance Overhead
Constraint checking (e.g., CHECK, FOREIGN KEY, UNIQUE) adds extra operations
during INSERT, UPDATE, DELETE.
Query execution can be slower, especially if semantic checks involve joins or lookups
in related tables.
b. Storage Cost
c. Complexity in Implementation
Costs of Enforcement in a Distributed DBMS:
a. Network Latency
Cross-site checks (e.g., verifying a foreign key on another server) increase response
time.
Two-phase commit (2PC) protocol for atomic transactions across sites is expensive.
Replication needs synchronization to maintain consistency.
Enforcing constraints may require transferring remote data over the network.
d. Conflict Resolution
Replicated or fragmented data may lead to conflicts (e.g., duplicate primary keys),
requiring resolution strategies, which are costly to implement and run.
Trade-offs -
Due to the high cost, especially in distributed systems, database designers must balance:
Integrity vs Performance
Global vs Local enforcement
Immediate vs Deferred constraint checking
Optimization Strategies -
Query Processing Problem –
The Query Processing Problem involves the challenge of transforming a high-level query
(SQL) into an efficient execution plan that optimizes performance while ensuring
correctness. The problem arises due to factors like:
The DBMS parses the SQL query into an internal format, usually a query tree or
abstract syntax tree (AST).
Syntax and semantic errors are checked during this stage.
The query is optimized to minimize the use of resources like CPU, memory, and disk
I/O.
Transformation rules (such as converting subqueries to joins or simplifying
expressions) may be applied.
Cost-based optimization: The DBMS considers different strategies (index scan, full
table scan, etc.) and estimates the cost of each strategy.
Heuristic optimization: The system applies common rules (like pushing down
selections or projections) to improve performance.
The execution engine runs the plan, performing the necessary operations to retrieve
or modify the data.
It processes the data and returns the result to the user.
a. Optimization Complexity
Finding the best execution plan for a query can be a NP-hard problem, especially with
multiple joins or complex predicates.
The optimizer needs to balance between time and space complexity.
b. Join Ordering
Join ordering is a critical decision, as the order of joins can drastically affect the
execution time.
For example, performing a nested loop join on large tables can be inefficient, while a
hash join might be faster.
a. Heuristic Optimization
b. Cost-Based Optimization
The query planner generates multiple query plans and estimates their cost (based on
factors like I/O, CPU usage, etc.) before selecting the cheapest plan.
o Cost models are used to estimate the execution time of different query plans.
c. Join Algorithms
Query:
sql
SELECT name
FROM employees e, departments d
WHERE e.dept_id = d.dept_id AND d.dept_name = 'Sales';
Parsing and translation is the first layer in the query processing pipeline. This stage
involves syntax checking and semantic analysis of the query.
a. Parsing- The parsing stage involves taking the SQL query submitted by the user and
converting it into a format that the database system can understand and process, such as
an internal representation like a query tree or abstract syntax tree (AST).
Steps in Parsing:
1. Lexical Analysis: The query is broken down into individual tokens (e.g., keywords like
SELECT, FROM, WHERE, column names, and operators like =).
2. Syntax Analysis: The tokens are analyzed according to the syntax rules of the SQL
language. This ensures the query follows the correct syntax, checking for things like
proper use of commas, parentheses, and SQL keywords.
3. Semantic Analysis: The DBMS checks that the query makes sense in terms of the
schema of the database. For example, it checks that the table and column names
exist, the data types are compatible, and joins are correctly defined.
If any syntax or semantic errors are found, the parser will return an error message. If the
query is correct, it is then converted into an internal format for further processing.
b. Translation- Once parsing is done, the query is translated into an internal representation
(usually in the form of a relational algebra expression or a query tree).
Steps in Translation:
Logical Plan Generation: The query is converted into a logical query plan. This plan
outlines the steps needed to retrieve the result based on relational algebra operations
like selection, projection, join, etc.
Query Tree Construction: The logical query plan is represented in the form of a query
tree or abstract syntax tree (AST). Each node in the tree represents an operation (e.g.,
selection, projection), and the edges represent data flows between operations.
sql
SELECT name FROM employees WHERE age > 30;
This intermediate representation helps the DBMS understand the required operations
before optimization.
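For the query above, the internal representation is a small tree: a selection on age > 30 applied to the employees table, with a projection of name on top. In relational algebra (using the operators described below): π name (σ age > 30 (employees)).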
Once the parsing and translation stages are complete, the query will move on to further
stages such as:
Execution: The optimized query plan is executed, and the database engine retrieves
the result.
a. Query Tree
The query tree is a hierarchical structure representing the logical operations required
to execute the query. It breaks down the query into smaller operations like select, join,
project, etc.
b. Relational Algebra
The query is often translated into relational algebra expressions. These expressions
define operations like:
o Selection (σ): Filtering rows based on a condition.
o Projection (π): Selecting specific columns from a table.
o Join (⨝): Combining rows from two or more tables based on a related column.
c. Abstract Syntax Tree (AST)
The abstract syntax tree is another form of internal representation. It’s used to
represent the hierarchical structure of the query’s components without the detailed
syntax information.
SQL Query:
sql
SELECT name, salary FROM employees WHERE age > 30;
The DBMS checks that name, salary, and age are valid column names in the
employees table.
It checks that the condition age > 30 is valid (i.e., the column age exists and is of a
numeric type).
Error Detection: Helps identify syntax and semantic errors early in the process.
Simplification: The translation process breaks down complex queries into simpler
operations, making optimization easier.
Foundation for Optimization: The query tree or relational algebra expressions are used
to explore different execution strategies and choose the most efficient one.
Optimization –
Query optimization is the process of finding the most efficient execution plan for a given
SQL query. The goal is to minimize the cost of executing a query in terms of resources like
CPU time, disk I/O, and memory usage, while still ensuring the correctness of the result.
The query optimization process takes place after the parsing and translation of the query
(into a query tree or relational algebra) and before the execution of the query. Optimization
aims to improve performance by reducing the computational cost.
Once the query has been parsed and translated, the DBMS creates multiple candidate
execution plans. These plans differ in how operations like joins, projections, and selections
are performed.
sql
SELECT name, salary FROM employees WHERE age > 30;
A full table scan on the employees table, applying the selection (age > 30) later.
Using an index on the age column and applying the selection before the projection.
Cost Estimation -
Each candidate execution plan is associated with a cost. The DBMS uses a cost model to
estimate the resource usage of each plan. The cost typically includes:
The optimizer compares the costs of different plans and selects the one with the minimum
cost.
Heuristic Optimization -
Heuristic optimization involves applying a set of rules or heuristics to improve the query
without considering the actual cost. These rules are generally used to simplify and reorder
operations in the query tree.
Push selection: Move a selection operation as close as possible to the data (i.e.,
applying filters earlier in the process to reduce the data size).
Push projection: Move a projection operation to eliminate unneeded columns as early
as possible.
Commutativity of join: Change the order of joins to reduce the size of intermediate
results.
Join elimination: Eliminate unnecessary joins that don’t affect the result (e.g., if one of
the tables is not actually used in the output).
Example Heuristic:
sql
SELECT name FROM employees, departments WHERE employees.dept_id =
departments.dept_id AND departments.name = 'HR';
A common heuristic would push the WHERE clause filtering departments.name = 'HR' as
close as possible to the departments table, reducing the number of rows involved in the
join operation.
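One way to write the pushed-down form explicitly, as an equivalent rewrite of the query above (the derived-table style is just for illustration):
sql
SELECT e.name
FROM employees e
JOIN (SELECT dept_id FROM departments WHERE name = 'HR') d
  ON e.dept_id = d.dept_id;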
Cost-Based Optimization -
Cost-based optimization involves using statistics about the data (such as table size, index
usage, and data distribution) to estimate the execution cost of different query plans and
choose the most efficient one.
Cost Model -
The cost model calculates the expected cost of each candidate plan, based on factors such
as:
Example:
Full Table Scan: The cost will depend on the number of pages in the table.
Index Lookup: The cost will depend on whether an index exists on the age column and
the selectivity (i.e., how many records satisfy age > 30).
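A minimal sketch of the index alternative mentioned above (the index name is illustrative; whether the optimizer actually uses it depends on statistics and selectivity):
sql
CREATE INDEX idx_employees_age ON employees (age);

-- With the index in place, the optimizer may choose an index range scan
-- instead of a full table scan for:
SELECT name, salary FROM employees WHERE age > 30;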
Code generation –
Steps in Code Generation:
After query optimization, the DBMS generates a query execution plan. This plan is a series
of low-level operations that represent the most efficient way to access, filter, join, and
return data based on the SQL query.
The query execution plan is typically represented in the form of operators (e.g., select,
project, join) and their input/output relations.
Instead of directly converting the optimized query plan into machine code, modern DBMS
often translate it into intermediate code. This code acts as a bridge between the high-level
query plan and the machine-specific execution code.
Intermediate Code: The intermediate code is usually in the form of relational operators
(e.g., SELECT, JOIN) expressed in a low-level language that can be interpreted by the DBMS.
sql
SELECT name, salary FROM employees WHERE age > 30;
For this query, the intermediate code might look like:
1. Scan employees table.
2. Apply selection condition (age > 30).
3. Apply projection (name, salary).
4. Return results.
Once the intermediate code is ready, the DBMS translates it into the machine code or a set
of low-level instructions that can be directly executed by the system. This is the final code
that interacts with the operating system and hardware to fetch data from the database.
In some DBMS systems, this machine code might be translated into a query execution
engine or virtual machine code that can be interpreted by the DBMS's internal runtime.
For a query involving a table scan, the machine code might involve low-level instructions
for accessing disk pages, reading data blocks, and applying filters on the fly.
Execution of Code -
Once the machine code (or intermediate code) is generated, the DBMS executes the
instructions step by step to retrieve the result of the query. The execution involves:
sql
SELECT name, salary FROM employees WHERE age > 30;
Step 1: Parsing and Translation
The query is parsed and translated into a query tree or logical plan.
Step 2: Optimization
This produces an intermediate plan:
1. Scan employees table.
2. Apply selection (age > 30).
3. Apply projection (name, salary).
4. Return result.
This intermediate code is translated into low-level machine code that involves reading
blocks of the employees table, applying the condition age > 30, and fetching only the
name and salary columns.
Step 5: Execution
The DBMS executes the generated machine code and returns the result, which is the
name and salary of employees aged over 30.
Mapping a Global Query to Local Queries –
To process a global query in a distributed system, the DBMS needs to translate the global
query into local queries that can be executed at each of the different sites where the data
is located. This process involves several steps, which include query decomposition, local
query generation, and global query optimization.
Global Query:
Let's assume we have a distributed database system with two sites: Site A and Site B. The
tables stored at these sites are as follows:
sql
SELECT employees.name, departments.department_name
FROM employees, departments
WHERE employees.department_id = departments.department_id
This query requires joining the employees table at Site A with the departments table at Site
B. The challenge is to map this global query to local queries that can be executed on each
site.
Query Decomposition -
The first step is to decompose the global query into smaller components. These
components consist of subqueries that can be executed at each site. Query decomposition
identifies how data from multiple sites can be accessed and used to satisfy the global
query.
Join Operation: The global query performs a join between the employees and
departments tables on the department_id column.
Selection: The join is performed using the employees.department_id =
departments.department_id condition.
Decomposition Plan:
Now, based on the decomposition, the system generates local queries to execute at each
site.
At Site A, we need to retrieve the employee names along with their department_id. The
local query is:
sql
SELECT employee_id, name, department_id FROM employees
This query can be executed locally at Site A to fetch the relevant data.
At Site B, we need to retrieve the department names along with the department_id. The
local query is:
sql
SELECT department_id, department_name FROM departments
Once the local queries are generated, the system must decide how to execute them:
In this approach, the local data retrieved from both sites can be sent over the network to
the query processor site, where the final join operation is performed.
1. Execute Local Query on Site A: The SELECT query for employees is executed on Site A.
2. Execute Local Query on Site B: The SELECT query for departments is executed on Site
B.
3. Send Results to a Central Site (Query Processor): The results from both sites are sent
to a central site (or the site that issued the query).
4. Join at the Central Site: The results are joined on the department_id at the query
processor site.
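Conceptually, once both intermediate results reach the query processor site, the last step is an ordinary local join (the temporary relation names below are illustrative):
sql
SELECT e.name, d.department_name
FROM employees_from_siteA e
JOIN departments_from_siteB d
  ON e.department_id = d.department_id;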
Alternatively, if the join condition is highly selective (i.e., there are few employees in each
department), it may be more efficient to ship the join condition directly to the sites. This
means that:
1. Site A Sends Employees Data: Site A sends employee data that matches a certain
department_id to Site B.
2. Site B Performs Join Locally: Site B performs the join with its departments table locally
using the data received from Site A.
3. Return Joined Results: The result is sent back to the user.
Optimization –
Distributed query processing involves optimizing the way data is transferred and processed
between sites. Several factors can impact optimization:
Data Transfer Cost: The cost of transferring data between sites (network latency,
bandwidth).
Join Algorithm: The algorithm used for the join, such as nested-loop join, hash join, or
merge join.
Local Query Execution: How efficiently the local queries can be executed at each site.
Query Plan: Whether it's better to perform the join at one site or distribute the join
operation across sites.
For example, if the employees table at Site A is large and the departments table at Site B is
small, the system might decide to send the departments table to Site A rather than sending
the entire employees table to Site B. This minimizes the amount of data transferred.
UNIT IV
Optimization of Distributed Queries:
Query Optimization –
In Distributed Database Management Systems (DDBMS), query optimization plays a crucial
role in improving the efficiency of query execution. Since data is distributed across multiple
sites, optimization techniques help minimize the total cost of executing queries by reducing
factors such as data transfer time, join costs, and execution time at each site.
The goal of distributed query optimization is to find the most efficient plan to execute a
given query by considering various possible ways the query can be executed across the
distributed system.
1. Join Optimization
Join operations, especially when tables are distributed across different sites, can become a
bottleneck. Efficiently performing joins across sites requires minimizing data transfer costs
and performing joins in an optimized way.
Broadcast Join: If one table is small (e.g., a departments table), it can be sent to all
other sites, and local joins can be performed on the data.
Replicated Join: One of the tables can be replicated across all sites. This minimizes
data transfer but requires careful consideration of memory and storage usage.
Partitioned Join: If the tables are large but have a common partitioning key (e.g.,
department_id), data can be sent to the site with matching keys for local joins.
Cost of Joins:
Cost of transferring data: The more data that is transferred, the higher the cost.
Join algorithms: Choosing between nested-loop join, hash join, or merge join
depending on data distribution.
In a scenario where employees and departments are located on different sites, the query
planner might choose between:
Sending the departments table (if it’s small) to the site containing the employees
table.
Or, alternatively, performing the join on each site where the data is located and then
combining the results.
2. Data Localization
The location of data is critical for distributed query optimization. Accessing remote data
incurs a cost due to network latency and bandwidth.
Optimization Considerations:
Data Locality: If both tables involved in a query are located at the same site, no data
transfer is needed. Hence, minimizing remote data accesses is key to reducing costs.
Partitioning Awareness: If data is partitioned based on certain attributes (e.g.,
department_id), queries that access data from multiple partitions may need to be
restructured to minimize the amount of data transferred.
3. Query Decomposition
Decomposition refers to breaking down a global query into local subqueries that are
executed at the respective sites where the data is stored. The goal is to decompose the
query in such a way that it minimizes the communication overhead and optimizes data
access.
Centralized Query Optimization –
The primary goal of centralized query optimization is to determine the most efficient
execution plan for a given query, reducing factors like query execution time, disk I/O
operations, CPU usage, and memory consumption.
The query optimization process involves a cost-based approach, where different strategies
for executing a query are evaluated based on their estimated cost, and the most efficient
one is selected.
1. Parsing:
o The query is parsed and transformed into a logical query tree or query plan. This
tree represents the structure of the query in terms of operations (e.g., SELECT,
JOIN, GROUP BY, etc.).
2. Logical Query Plan:
o The logical query plan is a high-level representation of the query without
considering physical details like indexes or the specific join algorithms to be
used.
o The logical query is then optimized by considering logical equivalences, such as:
Reordering joins: The order in which tables are joined can significantly affect
query performance.
Pushdown of selections: Selections (WHERE clauses) are pushed as close as
possible to the base tables to reduce intermediate result sizes.
Pushdown of projections: Projections (SELECT clauses) are pushed to avoid
unnecessary columns in intermediate steps.
3. Physical Query Plan:
o Once the logical query is optimized, the query plan is translated into a physical
query plan that specifies the actual operations to be executed, including:
Join algorithms (e.g., nested-loop join, hash join, merge join).
Access paths (e.g., table scans, index scans).
Join order (i.e., the order in which tables are joined).
Sorting and grouping methods.
4. Cost Estimation:
o For each physical plan, a cost estimator calculates the cost of executing that
plan. The cost is usually measured in terms of I/O operations, CPU usage, and
memory usage.
o The DBMS estimates the cost based on statistics about the data, such as:
Size of tables.
Index availability.
Cardinality of intermediate results.
Data distribution.
5. Query Plan Selection:
o The system evaluates the costs of different physical query plans and selects the
one with the minimum estimated cost.
o The selected plan is then executed to produce the result of the query.
Common Optimization Techniques -
1. Selection Pushdown:
o Selection operations (WHERE conditions) are moved as close as possible to the base tables, so rows are filtered as early as possible and intermediate results stay small.
Example:
sql
SELECT name FROM employees WHERE age > 30;
Here, the selection is applied directly while reading data from the employees table, rather than after all rows have been retrieved.
2. Projection Pushdown:
o This optimization involves moving projection operations (column selections) as
early as possible in the query processing. By selecting only the required columns
early in the query execution, we reduce the amount of data processed at later
stages.
Example:
sql
SELECT name, department FROM employees WHERE salary > 50000;
Here, only the necessary columns (name, department, and salary for the filter) are carried through the plan, instead of retrieving all columns of the employees table.
3. Join Reordering:
o The order in which joins are performed can have a significant impact on the
performance of the query. Reordering the joins based on the size of the tables or
the selectivity of join conditions can minimize the intermediate results and
reduce overall computation.
sql
SELECT e.name, d.department_name
FROM employees e, departments d, projects p
WHERE e.department_id = d.department_id
AND e.project_id = p.project_id;
In this case, if the employees table is much smaller than projects or departments, the
optimizer may choose to join employees with departments first and then join the
result with projects.
4. Join Algorithm Selection:
o The optimizer chooses the best join algorithm based on factors like the size of the tables and the availability of indexes.
5. Index Utilization:
o If the query involves conditions on columns with indexes, the optimizer may
choose to use the index to speed up data retrieval. Indexes can help avoid full
table scans and reduce I/O costs.
o The optimizer also considers multi-column indexes and composite indexes when
available.
6. Query Rewriting:
o Query rewriting involves transforming the original query into an equivalent query
that may be more efficient. For instance, the optimizer might transform
subqueries into joins or vice versa, depending on which approach will be more
efficient.
Here are a few key distributed query optimization algorithms for join ordering:
1. Exhaustive Search
Description: This approach evaluates all possible orders in which joins can be executed.
Advantages: Guarantees finding the optimal solution.
Disadvantages: Computationally expensive and inefficient for large queries with many
joins.
2. Dynamic Programming
Description: Uses a bottom-up approach, where subproblems (smaller join queries) are solved first, and their results are reused to solve larger subqueries. It builds the optimal join order progressively.
Advantages: More efficient than exhaustive search for larger queries.
Disadvantages: Still computationally expensive for very large queries.
3. Greedy Algorithms
Description: This approach chooses the next join based on the local optimal choice
(e.g., choosing the join that minimizes intermediate results or minimizes data
transfer).
Advantages: Faster than exhaustive search and dynamic programming.
Disadvantages: May not find the globally optimal join order.
4. Heuristic-Based Algorithms
Description: Uses heuristics or rules of thumb to determine a good join order, like:
o Small-to-Large: Join the smaller tables first to minimize intermediate result sizes.
o Join Commute: Reorder joins based on the cost of the underlying join operations.
Advantages: Faster and less resource-intensive.
Disadvantages: Does not always guarantee optimal performance.
5. Join Graph-Based Algorithms
Description: The query is represented as a join graph where tables are nodes, and join conditions are edges. The graph is traversed to determine the best join order.
Advantages: Visualizes the relationships between tables and can be optimized
effectively.
Disadvantages: May not always lead to the optimal solution.
Transaction Management –
A transaction is a logical unit of work that must satisfy the following (ACID) properties:
1. Atomicity:
o A transaction is atomic, meaning it is all-or-nothing. If any part of the transaction
fails, the entire transaction is rolled back to its initial state.
2. Consistency:
o A transaction ensures that the database starts in a consistent state and ends in a
consistent state, preserving data integrity.
3. Isolation:
o Transactions should be isolated from one another, ensuring that the operations
of one transaction do not interfere with those of another. The effect of a
transaction should be as if it is the only transaction running in the system.
4. Durability:
o Once a transaction has been committed, its changes are permanent, even in the
case of a system failure.
In distributed database systems, transactions can involve multiple databases or sites that
are connected via a network. These systems have to ensure that transactions maintain the
ACID properties across multiple, potentially geographically distributed, locations.
1. Distributed Transactions:
o A distributed transaction involves operations across multiple sites or databases,
which can be heterogeneous (i.e., different DBMSs on different sites).
2. Two-Phase Commit Protocol (2PC):
o This is a common protocol for ensuring atomicity in distributed transactions. It works in two phases (a SQL-level sketch follows this list):
Phase 1: The coordinator sends a "prepare" message to all participants.
Each participant responds with a "yes" (ready to commit) or "no" (abort).
Phase 2: If all participants respond with "yes", the coordinator sends a
"commit" message to all, making the transaction permanent. If any
participant responds with "no", the coordinator sends an "abort" message,
and all participants roll back their changes.
3. Three-Phase Commit Protocol (3PC):
o This is an extension of the 2PC, designed to make the protocol more fault-
tolerant by adding an additional phase to avoid blocking in case of failures.
4. Distributed Locking:
o Distributed systems require distributed concurrency control mechanisms to
manage simultaneous access to shared data. These mechanisms ensure that
transactions are isolated and consistent, even when data is stored across
multiple locations.
o Common methods include:
Two-Phase Locking (2PL): A lock acquisition and release protocol that
prevents other transactions from accessing the data being used in the
current transaction.
Timestamp Ordering: Each transaction is assigned a timestamp, and
operations are performed in timestamp order to ensure serializability.
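As a SQL-level sketch of the two phases in 2PC: PostgreSQL, for example, exposes the participant side through PREPARE TRANSACTION (the table and transaction identifier are illustrative, and max_prepared_transactions must be enabled):
sql
-- On a participant site: do the local work, then vote "yes" by preparing
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE acct_id = 1;
PREPARE TRANSACTION 'global_txn_42';

-- Later, once the coordinator has collected all votes:
COMMIT PREPARED 'global_txn_42';
-- or, if any participant voted "no":
-- ROLLBACK PREPARED 'global_txn_42';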
Concurrency Control -
ACID property –
The ACID properties are a set of principles that guarantee the reliable processing of
database transactions. These properties ensure that transactions are executed in a way
that maintains data integrity, even in the presence of errors or system failures.
1. Atomicity
2. Consistency
Definition: A transaction takes the database from one consistent state to another
consistent state. It must preserve the database's integrity constraints (e.g., data
validation rules like primary keys, foreign keys, etc.).
Example: Consider a transaction that transfers money from one bank account to
another. The consistency property ensures that the total amount of money in the
system remains the same before and after the transaction (no money is created or
lost).
Guarantee: If the database was consistent before the transaction, it will remain
consistent after the transaction is completed. All integrity constraints (e.g.,
uniqueness, referential integrity) are maintained.
3. Isolation
Definition: Isolation ensures that the operations of one transaction are not visible to
other transactions until the transaction is completed (i.e., committed). This prevents
data anomalies and ensures that transactions do not interfere with each other.
Example: If two people simultaneously try to withdraw money from the same bank
account, isolation ensures that each transaction is executed as if it were the only one
happening, even if they are running concurrently.
Levels of Isolation: There are different levels of isolation that balance performance
with correctness.
Guarantee: Each transaction runs independently of others, ensuring that intermediate
states are not visible to other transactions and preventing issues like dirty reads, non-
repeatable reads, and phantom reads.
4. Durability
Atomicity: Ensures that a transaction is an indivisible unit of work. Either all operations
within the transaction are completed successfully, or none are applied to the
database.
Consistency: Ensures that a transaction brings the database from one valid state to
another, preserving data integrity and adhering to predefined constraints.
Isolation: Guarantees that the operations of one transaction are isolated from the
operations of other concurrent transactions, preventing conflicts and anomalies.
Durability: Ensures that once a transaction has been committed, its effects are
permanent and survive system crashes, power failures, or other types of hardware or
software failures.
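The money-transfer example used above can be written as one transaction; a minimal sketch (table and column names are illustrative):
sql
BEGIN;
UPDATE accounts SET balance = balance - 500 WHERE acct_no = 'A';
UPDATE accounts SET balance = balance + 500 WHERE acct_no = 'B';
COMMIT;
-- Atomicity: either both updates persist, or (after ROLLBACK or a failure) neither does.
-- Consistency: the total balance across accounts A and B is unchanged.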
2. Concurrency Control
Objective: To allow multiple transactions to run concurrently while ensuring that the
database remains in a consistent state and that the integrity of the data is not
compromised.
Methods: Includes the use of locking mechanisms (e.g., Two-Phase Locking) or
timestamp-based protocols to prevent issues like dirty reads, non-repeatable reads,
and phantom reads.
Goal: Maximize throughput and minimize latency while ensuring that transactions do
not interfere with each other inappropriately.
3. Deadlock Management
Objective: To handle situations where transactions are waiting for each other indefinitely, leading to deadlocks.
Methods:
o Prevention: Through techniques like Wait-Die or Wound-Wait where transactions
are either forced to wait or aborted based on priority.
o Detection and Recovery: Identifying deadlocks after they occur and taking
corrective actions like rolling back one or more transactions to break the cycle.
4. Recovery and Reliability
Objective: To ensure that the database remains operational and consistent even in the face of system crashes, power outages, or other unforeseen failures.
Methods:
o Write-Ahead Logging (WAL): Ensures that all changes are logged before being
applied to the database to facilitate recovery.
o Replication and Backup: Maintain copies of the data across different systems to
ensure availability and durability.
Goal: Prevent data loss and ensure that once a transaction is committed, its changes
are permanent.
Types of transactions –
1. Flat Transactions
Definition: A basic type of transaction that follows the traditional ACID properties from
start to finish.
Example: Updating a customer's address in a single database operation.
Characteristics:
o Single start and commit/rollback point.
o No sub-transactions.
o Simple and easy to manage.
2. Nested Transactions
3. Distributed Transactions
4. Compensating Transactions
5. Long-Lived Transactions
Definition: Transactions that run for a longer period and may require intermediate
states to be saved.
Example: Online shopping cart where items are added over time before checkout.
Characteristics:
o Can involve partial commits.
o Usually broken into smaller sub-transactions.
o May relax strict ACID properties for flexibility.
6. Real-Time Transactions
7. Interactive Transactions
8. Batch Transactions
9. Temporal Transactions
Objectives of Distributed Concurrency Control –
1. Data Consistency
Ensure that all copies of data across distributed sites remain consistent.
Any concurrent execution of transactions should leave the system in a consistent
state.
Prevent issues like lost updates, dirty reads, or inconsistent reads.
2. Isolation of Transactions
Each transaction should be executed as if it were the only one in the system.
Ensure that interleaved execution of multiple transactions doesn't affect their
correctness.
Guarantees the isolation property of ACID.
3. Serializability
4. Deadlock Handling
Prevent or detect and resolve deadlocks that can occur when multiple transactions
wait indefinitely for resources locked by each other.
Implement timeout, wait-die, wound-wait, or deadlock detection and recovery
mechanisms.
5. Fault Tolerance
Ensure that transactions proceed smoothly even if some sites or communication links fail.
Must allow system to operate in partitioned environments without data corruption.
6. Fairness
8. Recovery Support
9. Transparency
Make concurrency control transparent to users so they don’t have to worry about
distributed execution details.
Maintain distribution transparency while enforcing correctness.
10. Scalability
Allow the concurrency control mechanism to scale with the number of users, sites, and
transactions without significant degradation in performance.
Concurrency Problems (Anomalies) –
1. Lost Update
Occurs when: Two transactions read the same data and then both write updates
without knowing about each other.
Result: The second update overwrites the first one, causing it to be lost.
Example:
o T1 reads X (value 10)
o T2 reads X (value 10)
o T1 writes X = 15
o T2 writes X = 20 → T1's update is lost
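The same interleaving can be replayed in a few lines of Python (illustrative only; the +5 and +10 increments are assumed here just to make the effect visible):

# Replaying the lost-update interleaving above with a plain dictionary
# (no concurrency control), to show why T1's write disappears.

db = {"X": 10}

# Both transactions read the same initial value.
t1_read = db["X"]          # T1 reads 10
t2_read = db["X"]          # T2 reads 10

# Each computes its update from its own stale read.
db["X"] = t1_read + 5      # T1 writes 15
db["X"] = t2_read + 10     # T2 writes 20, so T1's +5 is silently lost

print(db["X"])             # 20, not 25: the two updates did not compose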
2. Dirty Read
Occurs when: A transaction reads data that has been modified by another transaction
but not yet committed.
Result: If the writing transaction later rolls back, the reading transaction has used
invalid data.
Example:
o T1 updates X = 50 (not committed)
o T2 reads X = 50
o T1 rolls back → T2 has used incorrect data
3. Non-Repeatable Read
Occurs when: A transaction reads the same data multiple times, but gets different
values because another transaction updated the data in between.
Result: Inconsistent results within the same transaction.
Example:
o T1 reads X = 10
o T2 updates X = 20 and commits
o T1 reads X again → gets 20
4. Phantom Read
Occurs when: A transaction executes a query to retrieve a set of rows, and another
transaction inserts or deletes rows that affect the result of the query during the first
transaction.
Result: Re-executing the same query gives different rows (phantoms).
Example:
o T1 reads all employees with salary > 50000
o T2 inserts a new employee with salary 60000 and commits
o T1 re-executes the same query → an extra (phantom) row appears
5. Write Skew
Occurs when: Two concurrent transactions read overlapping data, then each
transaction writes to non-overlapping parts based on the data read.
Result: The final state violates a constraint.
Example:
o Constraint: At least one doctor on duty
o T1 reads: Doctor A is on duty
o T2 reads: Doctor B is on duty
o T1 sets A off-duty
o T2 sets B off-duty → both commit → no doctor on duty (constraint violated)
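A small Python sketch of this scenario, assuming each transaction checks the constraint only against its own snapshot of the duty table:

# Replaying the write-skew scenario: both transactions check the on-duty
# constraint against the same snapshot, write to different rows, and the
# combined result breaks the "at least one doctor on duty" rule.

duty = {"A": True, "B": True}     # shared database state

snapshot_t1 = dict(duty)          # T1 reads: A and B are both on duty
snapshot_t2 = dict(duty)          # T2 reads the same snapshot

# T1: safe to take A off duty? Its snapshot still shows two doctors.
if sum(snapshot_t1.values()) > 1:
    duty["A"] = False

# T2: safe to take B off duty? Its snapshot also still shows two doctors.
if sum(snapshot_t2.values()) > 1:
    duty["B"] = False

print(duty)                 # {'A': False, 'B': False}
print(any(duty.values()))   # False, the constraint is violated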
a) Binary Locks
2. Timestamp-Based Protocols
a) Distributed 2PL
c) Quorum-Based Protocols
A transaction must obtain read/write quorums from replicas before accessing data.
Balances availability and consistency in replicated environments.
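As a rough sketch, the usual quorum conditions for N replicas are R + W > N (read and write quorums overlap) and 2W > N (two conflicting writes cannot both succeed). The helper below is an illustrative check, not part of any real system:

# Illustrative quorum rule for N replicas.

def quorums_are_valid(n_replicas: int, read_quorum: int, write_quorum: int) -> bool:
    overlap_rule = read_quorum + write_quorum > n_replicas   # reads see latest write
    write_rule = 2 * write_quorum > n_replicas               # writes cannot diverge
    return overlap_rule and write_rule

print(quorums_are_valid(5, 3, 3))   # True: every read quorum overlaps every write quorum
print(quorums_are_valid(5, 2, 2))   # False: a read could miss the latest write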
Serializability –
Why it is needed: To guarantee that the interleaved execution of concurrent transactions
produces the same result as some serial execution, thereby preserving consistency.
Types of Serializability:
1. Conflict Serializability:
o Two operations conflict if they access the same data item and at least one of
them is a write.
o A schedule is conflict-serializable if it can be transformed into a serial schedule
by swapping non-conflicting operations (a short sketch follows this list).
2. View Serializability:
o Weaker than conflict serializability.
o A schedule is view-serializable if it is view-equivalent to a serial schedule.
o View equivalence checks three conditions: same read operations, same final
writes, and each read returns the same value as in the serial schedule.
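A small Python sketch of the conflict-serializability test: build a precedence (serialization) graph from the conflicting operations of a schedule and check it for cycles; an acyclic graph means the schedule is conflict-serializable. The sample schedule is illustrative.

from collections import defaultdict

# Each step: (transaction, operation, data item); 'r' = read, 'w' = write.
schedule = [("T1", "r", "X"), ("T2", "w", "X"), ("T1", "w", "X")]

def precedence_edges(schedule):
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Two operations conflict if different transactions touch the same
            # item and at least one of the operations is a write.
            if ti != tj and item_i == item_j and "w" in (op_i, op_j):
                edges.add((ti, tj))     # ti must precede tj in any serial order
    return edges

def has_cycle(edges):
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
    visited, on_path = set(), set()

    def dfs(node):
        visited.add(node)
        on_path.add(node)
        for nxt in graph[node]:
            if nxt in on_path or (nxt not in visited and dfs(nxt)):
                return True
        on_path.discard(node)
        return False

    return any(dfs(n) for n in list(graph) if n not in visited)

edges = precedence_edges(schedule)
print(edges)                                   # edges in both directions between T1 and T2
print("serializable:", not has_cycle(edges))   # False, the graph contains a cycle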
Enforcement Methods: Serializable schedules are typically enforced using Two-Phase
Locking, timestamp ordering, or validation-based (optimistic) protocols.
Recoverability:
Definition: Recoverability ensures that the effects of a committed transaction are not lost
due to the failure or rollback of other transactions. In other words, a schedule is
recoverable if a transaction commits only after all transactions whose changes it has read
have also committed.
Why it is needed: If a transaction were allowed to commit after reading data written by a
transaction that later aborts, its committed results would be based on invalid data and
could not be undone; recoverability rules prevent this.
Types of Recoverability:
1. Recoverable Schedule:
o A transaction Tj reads data written by transaction Ti, and Tj commits only after Ti
commits.
o This guarantees that a committed transaction never depends on data from an
aborted transaction, although cascading rollbacks of uncommitted transactions are
still possible.
2. Cascadeless Schedule:
o A stricter form where transactions are only allowed to read data after it has been
committed.
o Avoids cascading rollbacks.
3. Strict Schedule:
o The strictest form where transactions cannot read or write data modified by
uncommitted transactions.
o Guarantees ease of recovery and avoids both dirty reads and cascading
rollbacks.
Distributed Serializability –
Definition: Distributed serializability is the extension of serializability to distributed
database systems, where data and transactions are spread across multiple sites or nodes.
It ensures that concurrent execution of transactions across the distributed system is
equivalent to some serial execution of the same transactions as if they were executed on a
single system.
1. Global Schedule:
o A global schedule is the union of local schedules from all sites.
o The global schedule must be serializable in the same way as a centralized
schedule.
2. Local vs Global Serializability:
o Local serializability ensures correctness at individual sites.
o However, local serializability does not guarantee global serializability.
o Example: Two sites may execute transactions locally in a serializable order, but
the combined effect might violate global consistency.
3. Global Serialization Graph (GSG):
o A graph representing all transactions and conflicts across all sites.
o If the GSG is acyclic, then the global schedule is serializable.
Challenges:
Enhanced Lock-Based and Timestamp-Based Protocols –
The basic lock-based protocols use locks (shared or exclusive) to manage concurrent
access to data items. Enhanced versions of these protocols aim to increase concurrency
while maintaining serializability and resolving issues like deadlocks and starvation.
1. Multi-Granularity Locking
What it is: This enhancement allows locks at different granularity levels (e.g., data
items, pages, tables, databases).
Purpose: To improve concurrency by allowing transactions to lock broader or narrower
parts of the data, depending on the situation.
Example: A transaction might lock an entire table when it doesn't need to lock
individual rows, thereby reducing the number of locks and overhead.
How it works:
o S (Shared) Locks: For reading data.
o X (Exclusive) Locks: For writing data.
o Intent Locks: Indicate that a transaction intends to acquire locks on a lower level
(e.g., row-level).
Benefits:
o Reduced contention by using more coarse-grained locks where appropriate.
o Improves performance and reduces deadlock risk compared to locking at the
finest granularity (e.g., row-level).
Drawbacks:
o Complexity in managing locks at multiple levels.
o Can lead to lock escalation (e.g., a row lock might escalate to a table lock).
2. Enhanced Deadlock Detection
What it is: Enhanced deadlock detection techniques are implemented to identify and
break deadlocks that occur in distributed or highly concurrent environments.
Purpose: Deadlocks happen when two or more transactions are waiting on each other
indefinitely. Detecting deadlocks and resolving them efficiently is key to maintaining
system performance.
How it works:
o Wait-for Graphs: A directed graph is created where nodes represent transactions,
and edges indicate waiting relationships.
o If a cycle is detected in the graph, it indicates a deadlock. The system will then
choose one transaction to abort (rollback) to break the cycle.
Benefits:
o Ensures that the system remains deadlock-free without manual intervention.
o Can operate in distributed systems with high concurrency.
Drawbacks:
o Overhead in detecting deadlocks, especially in large systems.
o Frequent rollbacks of transactions can impact performance.
3. Deadlock Prevention Protocols (Wait-Die and Wound-Wait)
What it is: These techniques are enhancements to standard Two-Phase Locking (2PL)
that help prevent deadlocks by controlling the order in which transactions wait for
locks. A short sketch of both rules follows this list.
Wait-Die Protocol: If an older transaction requests a lock held by a younger
transaction, the older transaction waits. If a younger transaction requests a lock held
by an older transaction, the younger transaction dies (is rolled back and later
restarted with its original timestamp).
Wound-Wait Protocol: If an older transaction requests a lock held by a younger
transaction, the older transaction wounds (preempts) the younger one, forcing it to
roll back. If a younger transaction requests a lock held by an older transaction, the
younger transaction must wait.
Benefits:
o Helps prevent deadlocks by enforcing strict priority rules.
o Reduces the need for frequent deadlock detection and rollback.
Drawbacks:
o Risk of starvation (some transactions may never get executed if they are
repeatedly preempted).
o Requires careful priority management.
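A minimal sketch of the two rules, assuming that a smaller timestamp means an older (higher-priority) transaction:

# Timestamp-based deadlock-prevention rules: the decision depends only on
# whether the requesting transaction is older or younger than the lock holder.

def wait_die(requester_ts: int, holder_ts: int) -> str:
    # Older requester may wait; younger requester is rolled back ("dies").
    return "wait" if requester_ts < holder_ts else "abort requester (die)"

def wound_wait(requester_ts: int, holder_ts: int) -> str:
    # Older requester preempts ("wounds") the younger holder;
    # a younger requester simply waits.
    return "abort holder (wound)" if requester_ts < holder_ts else "wait"

# T5 (older, ts=5) requests a lock held by T9 (younger, ts=9), and vice versa.
print(wait_die(5, 9), "|", wait_die(9, 5))       # wait | abort requester (die)
print(wound_wait(5, 9), "|", wound_wait(9, 5))   # abort holder (wound) | wait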
4. Thomas's Write Rule (Enhanced Timestamp Ordering)
What it is: An enhancement to basic timestamp ordering protocols that allows certain
writes to be ignored.
Purpose: To improve the efficiency of the system by reducing unnecessary writes and
allowing greater concurrency.
How it works:
o If a transaction wants to write a data item that has already been updated by a
transaction with a later timestamp, the write is ignored (the update is obsolete,
since it would have been overwritten anyway).
Benefits:
o Increases system throughput and reduces unnecessary work by ignoring certain
operations.
o Allows better handling of transactions with conflicting writes.
Drawbacks:
o Introduces complexity in managing transactions.
o May not work well in environments with heavy writes or high contention.
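A small sketch of the rule, using an illustrative Item class that tracks the largest read and write timestamps seen so far:

# Sketch of the "ignore obsolete write" rule on top of basic timestamp ordering.

class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0     # largest timestamp that has read this item
        self.write_ts = 0    # timestamp of the most recent write

def timestamped_write(item: Item, txn_ts: int, new_value):
    if txn_ts < item.read_ts:
        return "abort"                     # a younger transaction already read the item
    if txn_ts < item.write_ts:
        return "ignored (obsolete write)"  # a younger transaction already overwrote it
    item.value = new_value
    item.write_ts = txn_ts
    return "applied"

x = Item(10)
print(timestamped_write(x, 20, 100))   # applied
print(timestamped_write(x, 15, 50))    # ignored (obsolete write)
print(x.value)                         # 100, the older write was skipped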
5. Multi-Version Schemes
What it is: This method maintains multiple versions of a data item, which allows
transactions to read old versions while writing new versions of the data.
Purpose: To provide greater concurrency by allowing non-blocking reads and reducing
conflicts between transactions.
How it works:
o Each update creates a new version of the data item.
o Read operations return the latest committed version of the data item that is
consistent with the transaction's timestamp.
o Write operations create a new version of the data item, which is then
timestamped.
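A minimal sketch of such a version store (the class name and timestamps are illustrative): each write appends a timestamped version, and a read returns the newest version whose timestamp is not later than the reader's timestamp.

# Multi-version store sketch: writes never overwrite, reads pick a version
# consistent with the reading transaction's timestamp.

class VersionedItem:
    def __init__(self, initial_value):
        self.versions = [(0, initial_value)]   # list of (write_ts, value)

    def write(self, txn_ts, value):
        self.versions.append((txn_ts, value))
        self.versions.sort()                   # keep versions ordered by timestamp

    def read(self, txn_ts):
        # Newest version with write_ts <= txn_ts.
        visible = [v for ts, v in self.versions if ts <= txn_ts]
        return visible[-1]

x = VersionedItem(10)
x.write(txn_ts=20, value=100)
print(x.read(txn_ts=15))   # 10, an older reader still sees the old version
print(x.read(txn_ts=25))   # 100, a newer reader sees the new version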
Benefits:
o Readers do not block writers (and writers do not block readers), which increases
concurrency and throughput.
o Read-only transactions can see a consistent snapshot without acquiring locks.
Multiple granularity –
Multiple Granularity Locking (MGL) is an enhancement to basic locking protocols that allows
transactions to lock data at varying levels of granularity (i.e., different sizes or scopes of
data). This approach increases concurrency and reduces the overhead of lock management
by allowing coarser-grained locks when appropriate, while still maintaining serializability
and data consistency.
1. Granularity Levels:
o Data Item Level: The finest level of granularity where individual data items (like
rows or fields) are locked.
o Page Level: Locks on database pages (group of records).
o Table Level: Locks on entire tables (group of rows or pages).
o Database Level: Locks on entire databases (all tables and data).
2. Lock Types:
o S (Shared) Lock: Allows transactions to read data, but not modify it.
o X (Exclusive) Lock: Allows transactions to read and write data. No other
transaction can access the data.
o IS (Intent Shared) Lock: Indicates that a transaction intends to lock a subordinate
data item (e.g., a specific row or page) in a shared mode.
o IX (Intent Exclusive) Lock: Indicates that a transaction intends to lock a
subordinate data item in an exclusive mode.
3. Hierarchical Locking:
o Multiple granularity allows a hierarchical lock structure, where each transaction
may lock higher-level entities (like tables or databases) and lower-level entities
(like rows or columns) as needed.
o For example, a transaction might acquire an IX lock on a table, which means it
intends to acquire X locks on individual rows within the table later.
o A transaction must release all locks at a level before it can release locks at a
higher level. For example, a transaction cannot release a table-level lock until all
row-level locks within that table are released.
4. Lock Compatibility:
o Locks at different granularity levels must be compatible for multiple transactions
to access data concurrently. For example:
An S (Shared) lock on a table is compatible with IS requests, so other
transactions can still read individual rows, but it blocks IX and X requests,
so no other transaction can modify rows of that table.
An IX (Intent Exclusive) lock on a table lets the holder acquire X locks on
individual rows, but it prevents other transactions from acquiring S or X
locks at the table level.
1. Improved Concurrency:
o By allowing coarser-grained locks (such as locking entire tables or pages),
multiple transactions can access different parts of the database concurrently
without interfering with each other, improving throughput.
2. Reduced Locking Overhead:
o Fewer locks are needed for large-scale operations, reducing the complexity and
management overhead of locks compared to fine-grained locking on each
individual data item.
3. Increased Efficiency:
o A transaction can hold a higher-level lock (like a table lock) while locking
individual rows or pages within the table as necessary, allowing it to work on
large chunks of data without constantly acquiring and releasing locks.
4. Maintains Serializability:
o Since MGL is based on the Two-Phase Locking protocol, it guarantees
serializability, the highest level of isolation for transactions.
1. Complexity:
o Implementing multiple levels of locking can be complex to manage, especially in
distributed systems.
o It requires careful coordination between locks at different levels to avoid
conflicts.
2. Deadlock Potential:
o Even though MGL tries to reduce contention, the use of hierarchical locking still
leaves the system vulnerable to deadlocks, especially when transactions lock
different levels of granularity in a circular manner.
3. Lock Escalation:
o There can be cases where a transaction holding many fine-grained locks might
escalate to a coarser-grained lock (e.g., from row-level to table-level lock). This
might reduce concurrency and lead to bottlenecks.
4. Lock Contention:
o High contention for the higher-level locks can still cause transactions to block or
delay each other, especially when trying to lock tables or pages.
Consider a database with three tables: Customers, Orders, and Products. Let’s look at how
a transaction can acquire locks at multiple levels:
1. Transaction 1 (T1):
o T1 acquires an IX lock on the Customers table (indicating it intends to modify
rows in this table).
o T1 then acquires an X lock on a specific row (Customer ID 1001).
o T1 performs its operations on this customer’s data.
2. Transaction 2 (T2):
o T2 wants to read from the same Customers table. Because T1 holds an IX lock on
the table, T2 cannot take an S lock on the whole table; instead it acquires an IS
lock on Customers and S locks on the individual rows it reads.
o T2 can still read rows other than Customer ID 1001 (which T1 holds an X lock
on), improving concurrency.
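A small sketch that replays this example against the standard IS/IX/S/X compatibility matrix (the resource names and the request helper are illustrative):

# Intent-lock compatibility matrix and a toy lock manager that grants a
# request only if it is compatible with every lock already held.

COMPATIBLE = {
    ("IS", "IS"): True,  ("IS", "IX"): True,  ("IS", "S"): True,  ("IS", "X"): False,
    ("IX", "IS"): True,  ("IX", "IX"): True,  ("IX", "S"): False, ("IX", "X"): False,
    ("S",  "IS"): True,  ("S",  "IX"): False, ("S",  "S"): True,  ("S",  "X"): False,
    ("X",  "IS"): False, ("X",  "IX"): False, ("X",  "S"): False, ("X",  "X"): False,
}

held_locks = {}   # resource -> list of (transaction, mode)

def request(txn, resource, mode):
    for _, held_mode in held_locks.get(resource, []):
        if not COMPATIBLE[(held_mode, mode)]:
            return f"{txn}: {mode} on {resource} must wait"
    held_locks.setdefault(resource, []).append((txn, mode))
    return f"{txn}: {mode} on {resource} granted"

print(request("T1", "Customers", "IX"))          # granted
print(request("T1", "Customers.row1001", "X"))   # granted
print(request("T2", "Customers", "S"))           # must wait: S conflicts with T1's IX
print(request("T2", "Customers", "IS"))          # granted
print(request("T2", "Customers.row1002", "S"))   # granted: a different row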
Multi-Version Concurrency Control (MVCC) –
In MVCC, rather than locking data items, the system maintains multiple versions of data
and ensures that transactions interact with these versions according to their timestamps or
commit order. This reduces contention between read and write operations and provides a
way to maintain high concurrency while ensuring serializability and data consistency.
There are various multi-version concurrency control schemes, but the two most common
are Optimistic MVCC and Pessimistic MVCC.
1. Optimistic MVCC
In Optimistic MVCC, transactions execute under the assumption that conflicts will be rare.
The system provides mechanisms to handle conflicts only at the commit time.
How it works:
1. Reading: A transaction reads the data from the latest committed version, or if it
needs to access an item modified by another transaction, it gets the version that
was current when the transaction began.
2. Writing: When the transaction wants to write to the database, it creates a new
version of the data.
3. Validation Phase: At commit time, the system checks if the transaction has
encountered conflicts with other transactions that modified the same data. If
conflicts exist, the transaction is rolled back; if not, the transaction commits, and
the new version of the data is made visible to others.
Advantages:
o High concurrency: Transactions are free to execute without waiting for locks,
increasing throughput.
o Reduced contention: Since conflicts are handled later, most transactions can
commit without being delayed.
Disadvantages:
o Higher rollback rates: If conflicts are detected during validation, transactions may
have to be rolled back and retried.
o Potential for wasted work: A transaction could perform a lot of work only to be
invalidated at commit time.
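A minimal sketch of the commit-time check described above ("first committer wins"), using an illustrative OptimisticStore class: a transaction is rolled back if any item it wrote was committed by another transaction after it began.

# Optimistic MVCC commit validation: writes are buffered privately and
# validated against the commit timestamps of the shared store.

class OptimisticStore:
    def __init__(self):
        self.commit_ts = {}     # item -> timestamp of its last committed write
        self.clock = 0

    def begin(self):
        self.clock += 1
        return {"start_ts": self.clock, "writes": {}}

    def write(self, txn, item, value):
        txn["writes"][item] = value      # buffered privately until commit

    def commit(self, txn):
        # Validation: did anyone else commit one of our items after we began?
        for item in txn["writes"]:
            if self.commit_ts.get(item, 0) > txn["start_ts"]:
                return "rollback"
        self.clock += 1
        for item in txn["writes"]:
            self.commit_ts[item] = self.clock
        return "commit"

store = OptimisticStore()
t1 = store.begin()
t2 = store.begin()
store.write(t1, "X", 15)
store.write(t2, "X", 20)
print(store.commit(t1))   # commit
print(store.commit(t2))   # rollback: X was committed after T2 started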
2. Pessimistic MVCC
In Pessimistic MVCC, transactions work with data versions in a way that avoids conflicts
during execution by making sure that a transaction is not interfered with by others. This is
similar to the behavior of traditional locking but in the context of multiple versions.
How it works:
1. Reading: A transaction reads the latest committed version of the data. If a
transaction intends to modify a data item, it locks that item, ensuring no other
transactions can modify it until the transaction is complete.
2. Writing: When a transaction wants to update a data item, it creates a new
version of the data, marking it with its timestamp. Any other transaction
attempting to write to that data will be blocked until the first transaction commits
or aborts.
3. Validation and Commit: After the transaction commits, the new version is made
visible to other transactions, and any older versions are logically obsolete.
Advantages:
o Reduced risk of rollback: Since conflicts are resolved by locking, the transaction
is more likely to commit successfully.
o Better for write-heavy workloads: Pessimistic MVCC ensures transactions are
serialized and prevents conflicts from arising during execution.
Disadvantages:
o Lower concurrency: Transactions are forced to wait for locks, leading to reduced
throughput and performance.
o Potential for deadlocks: If two transactions lock the same data items, a deadlock
could occur.
Optimistic Concurrency Control (OCC) –
OCC is based on the principle that transactions can often execute without interfering with
each other, and it defers conflict detection until the end of the transaction. This is
especially beneficial for systems with low contention or read-heavy workloads, where
transactions tend to operate on independent data items.
3. Validation-Based OCC:
o This technique focuses on validating a transaction's operations against other
transactions that have committed or are currently executing. It ensures that a
transaction’s operations do not conflict with other transactions that have already
committed, and if they do, it rolls back the transaction.
o Conflict Detection:
When a transaction enters the validation phase, it checks whether it has
read any data modified by another transaction that has already committed.
If there is such a conflict, the transaction is rolled back.
o Advantages:
Reduced contention: Because no locks are required during the read phase,
it allows higher concurrency.
Flexibility: Transactions are validated only when they are ready to commit,
reducing unnecessary locking.
o Disadvantages:
Validation overhead: The process of checking for conflicts during validation
can introduce additional complexity and overhead.
Risk of frequent rollbacks: If the database experiences a high degree of
contention, many transactions may be rolled back, which can decrease the
system's performance.
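A small sketch of this validation step, assuming the committing transaction's read set is compared with the write sets of transactions that committed while it was running (all names are illustrative):

# Backward validation for OCC: any overlap between what this transaction read
# and what a concurrently committed transaction wrote forces a rollback.

def validate(read_set, overlapping_committed_write_sets):
    for write_set in overlapping_committed_write_sets:
        if read_set & write_set:          # we read something that was overwritten
            return False
    return True

# T3 read X and Y; while it ran, one transaction committed a write to Y
# and another committed a write to Z.
t3_read_set = {"X", "Y"}
committed_writes = [{"Y"}, {"Z"}]

print(validate(t3_read_set, committed_writes))   # False -> roll back and retry
print(validate({"X"}, committed_writes))         # True  -> safe to commit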
4. Priority-Based OCC:
o In Priority-Based OCC, transactions are given priorities based on their timestamps
or other factors, and these priorities are used during the validation phase to
resolve conflicts.
o Conflict Resolution:
When two transactions conflict, the one with the lower priority (usually the
younger transaction, i.e., the one with the more recent timestamp) is rolled back
so that the higher-priority transaction can proceed.
o Advantages:
Helps resolve conflicts by prioritizing transactions based on timestamp or
other criteria, reducing rollback chances for high-priority transactions.
o Disadvantages:
Lower-priority (typically younger) transactions may be rolled back repeatedly
under heavy contention, which can lead to starvation.
Deadlock Detection and Recovery in Distributed Systems –
Types of Deadlocks:
1. Local Deadlocks:
o These are deadlocks that occur within a single node or process.
o In a distributed system, each node may handle its own local resources, and
deadlock detection can be handled by the individual nodes. However, local
deadlocks might still affect the overall system, especially if the resource
allocation is interdependent.
2. Global Deadlocks:
o These occur when there is a circular wait involving processes on different nodes
or machines.
o A global deadlock involves transactions that are waiting for resources on other
nodes, creating a cycle in the dependency graph of transactions.
Deadlock Detection Approaches:
1. Centralized Approach:
In this approach, a central node is responsible for deadlock detection. This central
node receives information from other nodes about the resources they are using and
waits for, and constructs a global wait-for graph.
The graph is periodically checked for cycles (which indicate deadlock). If a cycle is
found, the central node reports a deadlock.
Advantages:
Simple to implement because only one node is responsible for managing the detection
process.
Easier to monitor and maintain as there is a central point of control.
Disadvantages:
Scalability issues: As the system grows, the central node might become a bottleneck.
Single point of failure: If the central node fails, the entire deadlock detection
mechanism might collapse.
2. Distributed Approach:
In this approach, each node maintains information about the resources it holds and is
waiting for. Each node also sends messages to neighboring nodes about its status.
Nodes periodically exchange detection messages to build a global wait-for graph and
detect cycles. This method allows decentralized detection without the need for a
central controller.
Advantages:
More scalable since the detection mechanism is distributed across all nodes.
No single point of failure.
Disadvantages:
More complex to implement since there are multiple nodes involved in the detection
process.
Requires efficient communication between nodes to ensure that detection messages
reach the correct recipients.
A wait-for graph is used to represent the relationships between transactions and resources.
Each transaction or resource is represented by a node, and a directed edge is created from
one node to another if one transaction is waiting for a resource held by the other.
Cycle in the WFG: A cycle in the graph indicates a deadlock. If there is a path that
leads back to the original node, it implies that the involved transactions are mutually
waiting for resources, thus causing a deadlock.
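A minimal sketch of centralized global detection, assuming each site reports its local wait-for edges and the coordinator merges them and looks for a cycle (the site and transaction names are illustrative):

# Centralized global deadlock detection: merge local wait-for edges into one
# graph and report every transaction that can reach itself (i.e., is on a cycle).

def find_cycle(edges):
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, set()).add(holder)

    def reachable(start, target, seen):
        for nxt in graph.get(start, ()):
            if nxt == target or (nxt not in seen and reachable(nxt, target, seen | {nxt})):
                return True
        return False

    # A transaction that can reach itself is part of a deadlock cycle.
    return [t for t in graph if reachable(t, t, {t})]

# Each site only sees part of the picture; neither local graph has a cycle.
site1_edges = [("T1", "T2")]          # at site 1, T1 waits for T2
site2_edges = [("T2", "T1")]          # at site 2, T2 waits for T1

global_wfg = site1_edges + site2_edges
print(find_cycle(site1_edges))        # [] : no local deadlock
print(find_cycle(global_wfg))         # ['T1', 'T2'] : a global deadlock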
Once deadlock is detected in a distributed system, recovery actions need to be taken. The
goal of recovery is to break the deadlock by aborting one or more of the transactions
involved in the cycle.
1. Transaction Abortion:
Abort one transaction: One of the transactions involved in the deadlock is selected for
abortion, allowing the resources it holds to be freed so the other transactions can
proceed.
Rollback of the aborted transaction: The aborted transaction is rolled back, and any
changes it made are undone. This transaction can later be restarted if needed.
The decision on which transaction to abort can be made based on various criteria:
o Transaction priority: Higher priority transactions are not aborted, while lower
priority ones are.
o Transaction cost: The transaction that has consumed fewer resources or has
made fewer modifications can be aborted.
o Transaction wait time: The transaction that has been waiting the longest could be
aborted.
2. Rollback and Restart:
In this approach, a transaction that is detected as part of the deadlock cycle is rolled
back and then restarted from the beginning or from a checkpoint.
This method works best when transactions use checkpointing, allowing them to restart
from a previous stable state, which minimizes wasted work.
3. Killing Transactions:
One or more transactions in the deadlock cycle are terminated (killed) outright so that
the resources they hold are released immediately.
Trade-offs:
Killing transactions immediately can help resolve the deadlock quickly, but it may also
lead to significant waste in terms of computation and resources.
4. Timeout-Based Recovery:
In this approach, if a transaction has been waiting for a resource for too long (beyond
a pre-set timeout), it is considered to be part of a deadlock cycle, and the system
automatically rolls it back or aborts it.
This method does not require an explicit deadlock detection phase but relies on time
constraints to detect potentially stalled transactions.