Distributed Databases
By Sudarshan
– Allocation
• Each fragment is stored at the site with the “optimal” distribution.
– Replication
• A copy of a fragment may be maintained at several sites.
Data Allocation
• Rule of thumb:
– If reads of a fragment far outnumber updates to it, replication is advantageous; otherwise the cost of keeping the copies consistent outweighs the benefit.
• Example: consider the query below over a Sailors relation distributed across Mumbai and Delhi.
SELECT AVG(S.age)
FROM Sailors S
WHERE S.rating > 3 AND S.rating < 7
• Horizontally Fragmented: Tuples with rating < 5 at
Mumbai, >= 5 at Delhi.
– Must compute SUM(age) and COUNT(age) at both sites and combine them at the query site (see the sketch after this list).
– If the WHERE clause contained just S.rating > 6, only one site would be involved.
• Vertically Fragmented: sid and rating at Mumbai,
sname and age at Delhi, tid at both.
– Must reconstruct relation by join on tid, then evaluate the
query.
• Replicated: Sailors copies at both sites.
– Choice of site based on local costs, shipping costs.
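Why the sites return SUM and COUNT rather than AVG itself: AVG is not decomposable across fragments, but SUM and COUNT are, so each site ships two numbers and the query site finishes the division. A minimal sketch in Python, with illustrative fragment contents and the Mumbai/Delhi split from the example:

# Hypothetical horizontal fragments of Sailors: rating < 5 at Mumbai, >= 5 at Delhi.
MUMBAI = [{"sid": 1, "rating": 4, "age": 25.0}, {"sid": 2, "rating": 2, "age": 39.5}]
DELHI  = [{"sid": 3, "rating": 6, "age": 33.0}, {"sid": 4, "rating": 9, "age": 51.0}]

def local_partial(fragment):
    # Each site returns only (SUM(age), COUNT(age)) for tuples with 3 < rating < 7.
    ages = [t["age"] for t in fragment if 3 < t["rating"] < 7]
    return sum(ages), len(ages)

def distributed_avg(fragments):
    # The query site adds the partial sums and counts; AVG = total sum / total count.
    total = count = 0
    for s, c in (local_partial(f) for f in fragments):
        total += s
        count += c
    return total / count if count else None

print(distributed_avg([MUMBAI, DELHI]))  # (25.0 + 33.0) / 2 = 29.0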
Distributed Query Optimization
• Setting: EMPLOYEE (1,000,000 bytes) resides at node 1, DEPARTMENT (3,500 bytes) at node 2, the result is needed at node 3, and the join result is 400,000 bytes.
• Three alternatives:
– Copy all EMPLOYEE and DEPARTMENT records to node 3; perform the join and display the results there.
Total cost = 1,000,000 + 3,500 = 1,003,500 bytes
– Copy all EMPLOYEE records (1,000,000 bytes) from node 1 to node 2; perform the join, then ship the results (400,000 bytes) to node 3.
Total cost = 1,000,000 + 400,000 = 1,400,000 bytes
– Copy all DEPARTMENT records (3,500 bytes) from node 2 to node 1; perform the join; ship the results (400,000 bytes) from node 1 to node 3.
Total cost = 3,500 + 400,000 = 403,500 bytes
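The choice among the plans is a straight cost comparison. A minimal sketch using the byte counts from the example (the plan labels are paraphrases of the three alternatives):

# Transfer sizes from the example, in bytes; the result must end up at node 3.
EMPLOYEE_SIZE   = 1_000_000  # EMPLOYEE is stored at node 1
DEPARTMENT_SIZE = 3_500      # DEPARTMENT is stored at node 2
RESULT_SIZE     = 400_000    # size of the join result

alternatives = {
    "ship both relations to node 3": EMPLOYEE_SIZE + DEPARTMENT_SIZE,
    "ship EMPLOYEE to node 2, result to node 3": EMPLOYEE_SIZE + RESULT_SIZE,
    "ship DEPARTMENT to node 1, result to node 3": DEPARTMENT_SIZE + RESULT_SIZE,
}

for plan, cost in alternatives.items():
    print(f"{plan}: {cost:,} bytes")
print("cheapest:", min(alternatives, key=alternatives.get))  # the third plan wins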
Distributed Query Processing – Example
[Figure omitted: the example query plan, showing its local and global processing steps.]
Distributed Deadlock – Solution
• Three solutions:
– Centralized (send all local graphs to one
site);
– Hierarchical (organize sites into a hierarchy
and send local graphs to parent in the
hierarchy);
– Timeout (abort transaction if it waits too
long).
Centralized Approach
• A global wait-for graph is constructed and maintained at a single site by the deadlock-detection coordinator, which works with two versions of the graph (a sketch of the cycle check follows below):
– Real graph: the real, but unknown, state of the system.
– Constructed graph: the approximation generated by the coordinator during the execution of its algorithm.
• The global wait-for graph can be updated:
– whenever a new edge is inserted in or removed from one of the local wait-for graphs; or
– periodically, after a number of changes have occurred in a local wait-for graph.
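What the coordinator does with the reported edges can be sketched briefly: merge the local edge lists into the constructed global graph and run an ordinary cycle check. A minimal sketch, assuming each site ships its wait-for edges as (waiter, holder) pairs; site and transaction names are illustrative:

def detect_cycle(edges):
    # DFS-based cycle detection on a directed graph given as (waiter, holder) pairs.
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}
    def visit(u):
        color[u] = GRAY
        for v in graph.get(u, []):
            c = color.get(v, WHITE)
            if c == GRAY:  # back edge: v is on the current DFS path => deadlock
                return True
            if c == WHITE and visit(v):
                return True
        color[u] = BLACK
        return False
    return any(color.get(u, WHITE) == WHITE and visit(u) for u in graph)

site_S1 = [("T1", "T2")]                 # at S1, T1 waits for T2
site_S2 = [("T2", "T3"), ("T3", "T1")]   # at S2, T2 waits for T3, T3 waits for T1
print(detect_cycle(site_S1 + site_S2))   # True: T1 -> T2 -> T3 -> T1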
Example Wait-For Graph for False Cycles
Initial state: [figure omitted: the initial local wait-for graphs and the coordinator's constructed global graph]
False Cycles (Cont.)
• Suppose that, starting from the state shown in the figure:
1. T2 releases resources at S1
• resulting in a remove T1 → T2 message from the transaction manager at site S1 to the coordinator; and
2. T2 requests a resource held by T3 at site S2
• resulting in an insert T2 → T3 message from S2 to the coordinator.
• If the insert message reaches the coordinator before the remove message, the constructed graph briefly contains the false cycle T1 → T2 → T3 → T1, and an unnecessary abort may be initiated even though no deadlock ever existed.
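The race can be replayed in a few lines. A toy sketch, assuming the initial constructed graph contains T1 → T2 (from S1) and T3 → T1 (from S2), as in the figure; the reachability-based cycle check is illustrative, not the textbook algorithm:

def has_cycle(edges):
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    def reaches(a, b, seen):
        # Is b reachable from a by following wait-for edges?
        if a == b:
            return True
        seen.add(a)
        return any(reaches(n, b, seen) for n in succ.get(a, ()) if n not in seen)
    # An edge (u, v) closes a cycle iff u is reachable back from v.
    return any(reaches(v, u, set()) for u, v in edges)

constructed = {("T1", "T2"), ("T3", "T1")}  # coordinator's view before the race
constructed.add(("T2", "T3"))               # the insert from S2 arrives first...
print(has_cycle(constructed))               # True: a false deadlock is reported
constructed.discard(("T1", "T2"))           # ...the remove from S1 arrives too late
print(has_cycle(constructed))               # False: there never was a real deadlock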
• Backup coordinators
– a site that maintains enough information locally to assume the role of coordinator if the actual coordinator fails
– executes the same algorithms and maintains the same internal state information as the actual coordinator
– allows fast recovery from coordinator failure but involves overhead during normal processing.
• Election algorithms
– used to elect a new coordinator in case of failures
– Example: Bully Algorithm – applicable to systems where every site can send a message to every other site (sketched below).
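A minimal sketch of the bully idea, under heavy simplification: site ids, the alive predicate, and direct function calls standing in for the ELECTION/OK/COORDINATOR messages (and their timeouts) are all assumptions.

def bully_election(site_ids, alive, initiator):
    # The initiator challenges every higher-numbered site; if none of them
    # is alive, it wins. Otherwise the highest live site eventually takes over.
    higher = [s for s in site_ids if s > initiator and alive(s)]
    return initiator if not higher else max(higher)

sites = [1, 2, 3, 4, 5]
down = {5}  # suppose the old coordinator, site 5, has failed
print(bully_election(sites, lambda s: s not in down, initiator=2))  # -> 4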
Majority-Based Approach
• Based on voting
– To lock a data item:
• Send a message to all nodes that maintain a
replica of this item.
• If a node can safely lock the item, it votes "Yes"; otherwise it votes "No".
• If a majority of participating nodes vote "Yes"
then the lock is granted.
• Send the result of the vote back to all participating sites (sketched below).
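A minimal sketch of the voting protocol just described; replica names and the can_lock predicate are illustrative assumptions.

def request_lock(item, replicas, can_lock):
    # Each replica site votes Yes iff it can safely lock the item locally.
    votes = {site: can_lock(site, item) for site in replicas}
    granted = sum(votes.values()) > len(replicas) // 2  # strict majority
    for site in replicas:  # send the outcome back to every participating site
        print(f"{site}: lock on {item} {'granted' if granted else 'denied'}")
    return granted

replicas = ["S1", "S2", "S3"]
# Suppose S1 and S3 can lock item x locally but S2 cannot: 2 of 3 is a majority.
print(request_lock("x", replicas, lambda site, item: site != "S2"))  # -> True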
Normal Execution and Commit Protocols