Week 5 - Introduction to Distributed Databases
OUTLINE
• Distributed Transactions
• Distributed Deadlock
A distributed database stores data across multiple sites or nodes, often connected through a
network. Each node can independently process data and execute queries, while the entire
system appears to users as a single, unified database.
Explanation: Distributed databases are designed to address various challenges, such as data
distributed database system. This distribution can be based on factors like data access
network, enabling them to communicate and share data. These networks can be
local area networks (LANs), wide area networks (WANs), or even the internet.
Abdulrahman A. Mohamed Mobile: +254 713 500 814 Email: [email protected]
each node to manage its portion of the data independently. This means that changes
system as data volume or user load increases. This helps handle growing data and
user demands.
across multiple nodes. If one node fails, data can still be accessed from other nodes,
1. Global Retail Chain: A global retail chain operates in multiple countries, each with
its own store database. These store databases are part of a distributed database
system. When a customer places an order online, the system seamlessly retrieves
inventory data from the nearest store's database, processes the order, and updates
the inventory. The distributed database ensures data consistency and availability
2. Social Media Platform: A social media platform stores user profiles, posts, and media
When a user logs in, the system retrieves data from the nearest data center,
departments may have their databases, which are part of a distributed system. Cross-
• Cloud computing providers like Amazon Web Services (AWS), Microsoft Azure, and
Google Cloud Platform offer distributed database services that enable organizations
• IoT (Internet of Things) applications generate massive amounts of data that can be
decision-making.
2. Distributed Transactions:
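A distributed transaction is a set of operations in a distributed database system that are
treated as a single, indivisible unit of work. These operations can span multiple nodes, and
the transaction must maintain the ACID (Atomicity, Consistency, Isolation, Durability)
properties.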
operations across multiple nodes while ensuring data consistency and integrity. Here's a
detailed breakdown:
single unit. All operations within the transaction must either succeed or fail as a
whole.
systems often use isolation levels to control the visibility of changes made by one
even in the face of system failures. Distributed systems employ various techniques
banking system, a distributed transaction ensures that the amount is deducted from
one account and added to another, even if these accounts are managed on different
chain, when a new product is added to the inventory, the transaction ensures that
the product details are updated in all relevant databases across various store
locations.
availability, deducting the purchase amount, and updating order history records, all
• Distributed transactions are widely used in banking and financial systems to ensure
• They play a crucial role in e-commerce platforms, order processing systems, and
supply chain management, where data consistency and integrity are vital.
data centers.
distributed transaction, with nodes across the network reaching consensus on the
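The all-or-nothing behaviour described in this section can be sketched in code. The notes do not name a specific commit protocol; the sketch below uses two-phase commit, one common technique, applied to the bank transfer example. The `Participant` class, the `transfer` function, and the account names are hypothetical, and the nodes are in-memory stand-ins. A real system would also need durable logs, timeouts, and recovery after a coordinator failure, none of which are shown.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """One node holding some accounts (hypothetical in-memory stand-in)."""
    name: str
    balances: dict
    staged: dict = field(default_factory=dict)

    def prepare(self, account: str, delta: int) -> bool:
        # Phase 1: vote yes only if the change can be applied safely.
        new_balance = self.balances.get(account, 0) + delta
        if account not in self.balances or new_balance < 0:
            return False
        self.staged[account] = new_balance
        return True

    def commit(self) -> None:
        # Phase 2 (commit): make the staged changes visible.
        self.balances.update(self.staged)
        self.staged.clear()

    def abort(self) -> None:
        # Phase 2 (abort): discard the staged changes.
        self.staged.clear()


def transfer(src, src_acct, dst, dst_acct, amount) -> bool:
    """Coordinator: commit on both nodes only if both vote yes."""
    work = [(src, src_acct, -amount), (dst, dst_acct, amount)]
    votes = [node.prepare(acct, delta) for node, acct, delta in work]
    if all(votes):
        for node, _, _ in work:
            node.commit()
        return True
    for node, _, _ in work:
        node.abort()
    return False
```

For example, with `a = Participant("node_a", {"alice": 100})` and `b = Participant("node_b", {"bob": 20})`, calling `transfer(a, "alice", b, "bob", 30)` updates both balances, while a transfer of 500 is rejected and leaves both nodes unchanged.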
3. Distributed Deadlock:
• Locks: Distributed systems use locks on shared resources to ensure data consistency
and integrity. Locks prevent multiple processes from simultaneously modifying the
same resource.
• Resource Conflicts: Distributed deadlock occurs when multiple processes or
transactions acquire some locks and then attempt to request additional locks that
are currently held by other processes. This results in a circular wait condition where
each process is waiting for a resource held by another.
• Deadlock Detection and Resolution: Detecting and resolving distributed deadlocks
require specialized algorithms and coordination mechanisms. These mechanisms
may involve a central deadlock detection process or a distributed approach where
nodes communicate to resolve deadlocks.
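The central detection approach mentioned above can be sketched as follows. This is a minimal illustration, assuming each node reports a local wait-for graph (a mapping from a waiting transaction to the transactions it waits for) to one detector process. The function names and the victim policy are hypothetical and are not taken from any particular database.

```python
def merge_wait_for(local_graphs):
    """Union the per-node wait-for edges into one global graph."""
    merged = {}
    for graph in local_graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged


def find_cycle(graph):
    """Return one cycle as a list of transactions, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: circular wait found.
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for start in sorted(graph):
        if colour.get(start, WHITE) == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


def pick_victim(cycle):
    """Hypothetical policy: abort the transaction with the largest id."""
    return max(cycle)


# On node 1, T1 waits for T2; on node 2, T2 waits for T1.
node_1 = {"T1": {"T2"}}
node_2 = {"T2": {"T1"}}
global_graph = merge_wait_for([node_1, node_2])
```

Neither local graph contains a cycle on its own; the circular wait only becomes visible once the graphs are merged, which is why detection needs coordination between nodes.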
Examples and Scenarios:
1. Database Transactions: In a distributed database system, multiple transactions across
different nodes may access and update data items. If one transaction holds a lock
on a data item and needs to acquire a lock held by another transaction on a different
node, while that second transaction is in turn waiting for the lock held by the first,
a distributed deadlock occurs.
2. Resource Sharing in a Cloud Cluster: In a cloud computing environment, multiple
virtual machines (VMs) or containers may share common resources such as CPU,
memory, or storage. If VM A requests access to a resource currently held by VM B,
VM B simultaneously requests access to a resource held by VM C, and VM C in turn
requests a resource held by VM A, the resulting circular wait among the VMs is a
distributed deadlock.
3. Distributed File System: In a distributed file system, multiple clients may access
shared files. If two or more clients request exclusive access to files while holding
locks on other files, and these locks conflict with other clients' requests, a distributed
deadlock can occur.
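One way to avoid the circular waits in these scenarios is to make every client acquire locks in the same global order. The sketch below illustrates this for the file-locking case, under the assumption that local `threading.Lock` objects stand in for locks that a real distributed file system would obtain from a lock service. The `file_locks` table and the `with_files` helper are hypothetical names.

```python
import threading

# Hypothetical lock table: one lock per shared file.
file_locks = {name: threading.Lock() for name in ("a.txt", "b.txt")}


def with_files(names, action):
    """Acquire the locks for `names` in sorted order, run `action`, release."""
    ordered = sorted(set(names))  # same global order for every caller
    acquired = []
    try:
        for name in ordered:
            file_locks[name].acquire()
            acquired.append(name)
        return action()
    finally:
        # Release in reverse order of acquisition.
        for name in reversed(acquired):
            file_locks[name].release()
```

Two clients that call `with_files(["a.txt", "b.txt"], ...)` and `with_files(["b.txt", "a.txt"], ...)` both lock `a.txt` first, so neither can hold one file while waiting for the other in the opposite order.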
Practical Application in the Field:
• Distributed deadlock resolution and prevention mechanisms are essential in
distributed systems, especially in databases, cloud computing, and distributed file
systems.
• Distributed deadlock detection algorithms, such as the resource allocation graph
algorithm, help identify deadlocks and initiate corrective actions.
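As a rough illustration of the resource allocation graph idea, the sketch below reduces the graph step by step: any process whose requests can all be satisfied is assumed to finish and release what it holds, and whatever remains at the end is deadlocked. It assumes each resource has a single instance and that the full graph has already been collected from all nodes. The function name and the `holds`/`requests` dictionaries are hypothetical.

```python
def deadlocked_processes(holds, requests):
    """Reduce a single-instance resource allocation graph.

    holds:    process -> set of resources it currently holds
    requests: process -> set of resources it is waiting for
    Returns the processes that can never finish (deadlocked, or
    blocked behind a deadlocked process).
    """
    remaining = set(holds) | set(requests)
    while True:
        # Map each resource to the remaining process that holds it.
        in_use = {}
        for p in remaining:
            for r in holds.get(p, ()):
                in_use[r] = p
        # A process can run if every requested resource is free or its own.
        runnable = [
            p for p in remaining
            if all(in_use.get(r, p) == p for r in requests.get(p, ()))
        ]
        if not runnable:
            return remaining
        # Runnable processes finish and release their resources.
        remaining -= set(runnable)


holds = {"P1": {"R1"}, "P2": {"R2"}, "P3": {"R3"}}
requests = {"P1": {"R2"}, "P2": {"R1"}, "P3": set()}
stuck = deadlocked_processes(holds, requests)
```

Here `P3` requests nothing and finishes, while `P1` and `P2` each wait for a resource the other holds, so they remain in the result and one of them would need to be aborted or rolled back.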