Mod-I (Distributed DBMS)
A distributed database is a collection of databases that are spread across multiple physical locations but function as a
unified system. Each site in the system is managed independently and communicates with other sites to maintain
consistency. The primary goal of distributed databases is to enhance scalability, availability, and fault tolerance.
1. Data Distribution: Data is stored across multiple sites, often close to where it is used, to improve performance.
2. Transparency: Users interact as if it's a single database, despite the physical distribution.
3. Fault Tolerance: Redundancy ensures that the system remains operational even if some nodes fail.
4. Consistency Issues: Achieving data consistency across nodes in real time can be difficult (CAP theorem: a
distributed system cannot guarantee Consistency, Availability, and Partition Tolerance all at once).
Example Scenario (the Byzantine Generals Problem): Several generals must decide whether to attack or retreat, but some
may be traitors providing conflicting orders. The problem is to ensure that loyal generals reach a consensus regardless
of the actions of traitors.
Relevance to Blockchain: Blockchain technology addresses this problem using consensus mechanisms like Proof of
Work (PoW) or Proof of Stake (PoS), ensuring the network can tolerate faults and malicious actors.
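The scenario above can be illustrated with a small sketch. It assumes a deliberately naive protocol in which each loyal general takes a simple majority over the orders it receives; the general names, the vote values, and the `naive_decision` helper are all hypothetical and only serve to show how one traitor sending conflicting orders can split the loyal generals.

```python
from collections import Counter

def naive_decision(orders):
    """Pick whichever order was received most often (illustrative rule only)."""
    return Counter(orders).most_common(1)[0][0]

# Four loyal generals broadcast their honest preference to everyone.
loyal_votes = {"A": "attack", "B": "attack", "C": "retreat", "D": "retreat"}

# One traitor tells different generals different things.
traitor_messages = {"A": "attack", "B": "retreat", "C": "attack", "D": "retreat"}

decisions = {}
for general in loyal_votes:
    # Each loyal general sees all four loyal votes plus the traitor's message to it.
    seen = list(loyal_votes.values()) + [traitor_messages[general]]
    decisions[general] = naive_decision(seen)

print(decisions)
```

Because the traitor's message breaks the 2-2 split differently for each recipient, the loyal generals end up with different decisions. Byzantine fault tolerant protocols add extra rounds of communication or economic cost precisely to rule this out.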
Transparency: Users perceive files as local, hiding the complexities of distribution.
In the context of blockchain technology, the Byzantine Generals Problem becomes central because blockchains operate in a
decentralized environment with no central authority, requiring all nodes to agree on the validity of transactions and blocks.
Nodes (like generals) need to reach a consensus on the next block to add to the blockchain.
Some nodes might be faulty or malicious (like traitors), attempting to disrupt consensus by spreading false or
conflicting information.
The goal is for the honest nodes to agree on a single truth (the valid blockchain state), regardless of the
behavior of malicious nodes.
1. Consensus Mechanisms: Blockchains use robust consensus algorithms to ensure all honest nodes agree on the
next block. Examples include:
o Proof of Work (PoW): Nodes (miners) solve complex computational puzzles to propose blocks. The
first valid solution is accepted. Security rests on the assumption that it is computationally infeasible
for malicious actors to control most of the network's computing power.
o Proof of Stake (PoS): Validators stake cryptocurrency to participate in consensus. Honest behavior is
incentivized, and malicious actions result in penalties (loss of stake).
o Practical Byzantine Fault Tolerance (PBFT): Used in some private blockchains, where nodes directly
communicate to reach consensus. PBFT tolerates fewer than one-third faulty nodes and is efficient in
small networks, but its all-to-all messaging makes it less scalable.
2. Decentralization: Distributed nodes prevent any single point of failure. Even if some nodes are compromised,
the network continues to operate securely.
3. Redundancy: Blockchains replicate data across all nodes. Malicious nodes cannot alter the blockchain without
controlling a majority of the network, as other nodes retain the correct state.
4. Incentives: Blockchain systems reward honest behavior (e.g., mining rewards) and penalize dishonest actions,
aligning the economic interests of participants with the network's integrity.
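The Proof of Work idea in the list above can be sketched as a toy puzzle. This is a simplified illustration, not how any particular blockchain encodes its blocks: the `mine` and `verify` helpers, the block string, and the "leading zero hex digits" difficulty rule are assumptions chosen to keep the example short.

```python
import hashlib

def block_hash(block_data: str, nonce: int) -> str:
    """SHA-256 of the block contents combined with a candidate nonce."""
    return hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()

def mine(block_data: str, difficulty: int = 3):
    """Try nonces in order until the hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while not block_hash(block_data, nonce).startswith(prefix):
        nonce += 1
    return nonce, block_hash(block_data, nonce)

def verify(block_data: str, nonce: int, difficulty: int = 3) -> bool:
    """Checking a solution takes a single hash, unlike finding one."""
    return block_hash(block_data, nonce).startswith("0" * difficulty)

data = "block 1: Alice pays Bob 5"  # hypothetical block contents
nonce, digest = mine(data)
print(nonce, digest)
print(verify(data, nonce))
```

The asymmetry is the point: finding the nonce takes thousands of attempts at this difficulty, while any other node can check it with one hash. Raising `difficulty` by one multiplies the expected work by 16.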
Real-World Relevance:
In Bitcoin (PoW), solving the Byzantine Generals Problem ensures that all miners agree on the valid chain, even
if some attempt a double-spend attack.
In Ethereum 2.0 (PoS), validators use their stake to propose and attest to blocks, ensuring consensus despite
potential adversarial behavior.
Challenges:
Scalability: Increasing the number of nodes can slow consensus mechanisms like PBFT.
51% Attacks: PoW-based blockchains are vulnerable if a single entity controls more than 50% of the network's
computational power.
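To see why the 50% threshold matters, the following sketch uses the simple gambler's-ruin estimate given in the Bitcoin whitepaper: an attacker with a fraction q of the hash power who is z blocks behind eventually catches up with probability (q/p)^z when q < p, and with certainty otherwise, where p = 1 - q. The function name `catch_up_probability` is illustrative, and the model ignores refinements such as the Poisson weighting the whitepaper applies afterwards.

```python
def catch_up_probability(q: float, z: int) -> float:
    """Probability that an attacker with hash-power share q ever erases a z-block deficit."""
    p = 1.0 - q  # honest share of the hash power
    if q >= p:
        return 1.0  # at 50% or more, the attacker always catches up eventually
    return (q / p) ** z

for share in (0.10, 0.30, 0.45, 0.51):
    print(share, catch_up_probability(share, z=6))
```

Below 50% the probability falls off exponentially with the number of confirmations z, which is why waiting for more blocks makes a double-spend less likely to succeed.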
Hash Function: A mathematical function that maps data of arbitrary size to fixed-size values (hashes). In DHTs,
the hash function determines the placement of keys and values across nodes.
Key-Value Pair: Data is stored as key-value pairs, where the key is hashed to determine which node in the
network is responsible for storing the associated value.
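Both definitions can be tried out in a few lines. The sketch below uses SHA-256 from Python's standard library; the `responsible_node` helper and its modulo placement rule are assumptions made for illustration, since real DHTs typically use consistent hashing or an XOR distance metric rather than a plain modulo.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hash input of any size to a fixed-size (64 hex character) digest."""
    return hashlib.sha256(data).hexdigest()

def responsible_node(key: str, num_nodes: int) -> int:
    """Hypothetical placement rule: hash the key, then reduce modulo the node count."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest, "big") % num_nodes

short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 10_000)
print(len(short_digest), len(long_digest))  # 64 64

# The same key always maps to the same node index.
print(responsible_node("tx:1234", num_nodes=8))
```

The drawback of the modulo rule is that changing `num_nodes` reassigns most keys, which is the problem consistent hashing is designed to avoid.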
How DHT Works in Blockchain:
1. Hashing Keys:
o A cryptographic hash function (e.g., SHA-256) is used to generate a practically unique
(collision-resistant) hash for a key.
o This hash determines the position of the key-value pair within the distributed system.
2. Distributed Nodes:
o Nodes in a DHT collaboratively store and manage data. Each node is assigned a range of hash values.
o When a key-value pair is added, the hash of the key identifies the responsible node.
3. Efficient Lookup:
o Nodes use schemes like consistent hashing to locate the correct node for a specific
key, ensuring efficient data retrieval.
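The three steps above can be sketched with a small consistent-hashing ring. This is a minimal single-process model, assuming node names are hashed onto the same ring as keys and that each key belongs to the first node at or after its position; the `HashRing` class and the node and key names are hypothetical.

```python
import bisect
import hashlib

def ring_position(name: str) -> int:
    """Map a node name or key to an integer position on the hash ring."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big")

class HashRing:
    def __init__(self, nodes):
        # Sort nodes by ring position so lookups can use binary search.
        self._ring = sorted((ring_position(n), n) for n in nodes)
        self._positions = [pos for pos, _ in self._ring]

    def lookup(self, key: str) -> str:
        """Return the first node clockwise from the key's position."""
        idx = bisect.bisect_left(self._positions, ring_position(key))
        return self._ring[idx % len(self._ring)][1]  # wrap around past the last node

keys = [f"key-{i}" for i in range(100)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.lookup(k) for k in keys}

# Adding a node should only move the keys that the new node takes over.
bigger = HashRing(["node-a", "node-b", "node-c", "node-d"])
after = {k: bigger.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

Every key that changes owner moves to the new node, and all other assignments stay put. That property is what lets nodes join and leave a DHT without reshuffling the whole key space.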
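Applications:
o DHTs provide a decentralized way to store and retrieve data in blockchain systems. For example,
Filecoin and IPFS (InterPlanetary File System) use DHTs for decentralized file storage.
o Some blockchain implementations use DHTs to manage off-chain data while keeping on-chain storage
minimal.
o Example: DHT-style lookup has been proposed for routing and finding payment channels in networks such as
the Lightning Network, which itself distributes channel information by gossip.
o Name systems such as ENS (Ethereum Name Service) resolve human-readable names to blockchain
addresses. ENS performs the resolution in on-chain smart contracts, and a name's record can point to
content held in a DHT-backed network such as IPFS.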
Features of DHT:
1. Decentralization:
o No central authority manages the data; nodes collectively manage storage and retrieval.
2. Scalability:
3. Fault Tolerance:
o Data redundancy across nodes ensures availability, even if some nodes fail.
4. Efficiency:
o DHT algorithms like Chord or Kademlia ensure that lookups occur in O(log N) time, where N is the
number of nodes.
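The O(log N) bound can be made concrete with an idealized Chord-style ring. The sketch assumes a ring of exactly 2^m nodes with every identifier occupied, so each node's "fingers" point at the nodes 1, 2, 4, ... positions ahead; real deployments have sparse identifiers and node churn, and the `chord_hops` function is an illustrative name, not part of any library.

```python
def chord_hops(start: int, target: int, m: int) -> int:
    """Count greedy finger-table hops from start to target on a full ring of 2**m nodes."""
    size = 2 ** m
    current, hops = start, 0
    while current != target:
        distance = (target - current) % size
        # Follow the longest finger that does not overshoot the target.
        step = 1 << (distance.bit_length() - 1)
        current = (current + step) % size
        hops += 1
    return hops

m = 10  # 2**10 = 1024 nodes
worst_case = max(chord_hops(0, target, m) for target in range(2 ** m))
print(worst_case)  # 10
```

Each hop removes the highest set bit of the remaining distance, so no lookup needs more than m = log2(N) hops, even though each node only knows about m other nodes.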
Challenges of DHT in Blockchain:
1. Data Integrity:
o Ensuring stored data is authen c and unaltered requires robust cryptographic methods.
2. Sybil Attacks:
o Malicious actors may introduce numerous fake nodes to disrupt the network.
3. Latency:
o Data lookup may take longer in large-scale networks, especially if nodes are geographically dispersed.
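For the data integrity challenge, one common cryptographic approach is content addressing: the key under which a value is stored is the hash of the value itself, so any node can check what it retrieves. The sketch below is a simplified model using a plain dictionary as the store; `content_key` and `verify_value` are hypothetical helpers, and real systems use richer identifier formats than a bare SHA-256 hex string.

```python
import hashlib

def content_key(value: bytes) -> str:
    """Derive the storage key from the content itself."""
    return hashlib.sha256(value).hexdigest()

def verify_value(key: str, value: bytes) -> bool:
    """A retrieved value is authentic only if it hashes back to the key it was requested under."""
    return content_key(value) == key

store = {}  # stands in for the distributed key-value store
original = b"hello blockchain"
key = content_key(original)
store[key] = original

tampered = b"hello blockchaim"  # what a malicious node might return instead

print(verify_value(key, store[key]))  # True
print(verify_value(key, tampered))    # False
```

This check detects altered data but does not stop Sybil nodes from withholding it, which is why redundancy and node admission rules are treated as separate concerns.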