Distributed Databases 3
• A Distributed Database (DDB) is a collection of multiple logically related databases distributed over a computer network, and a distributed database management system is a software system that manages a distributed database while making the distribution transparent to the user.
• Recently, distributed and database technologies have been developed for dealing with vast amounts of data; these are known as big data technologies.
• Data mining and machine learning algorithms are then used to extract the needed knowledge.
Data Fragmentation, Replication and Allocation
• Distributed databases encounter a number of concurrency control and recovery problems that are not present in centralized databases.
• Dealing with multiple copies of data items:
  • The concurrency control must maintain global consistency. Likewise, the recovery mechanism must recover all copies and maintain consistency after recovery.
• Failure of individual sites:
  • Database availability must not be affected by the failure of one or two sites, and the recovery scheme must recover them before they are available for use.
• Communication link failure:
  • This failure may create a network partition, which would affect database availability even though all database sites may be running.
• Distributed commit:
  • A transaction may be fragmented, and the fragments may be executed at a number of sites. This requires a two-phase or three-phase commit approach for transaction commit.
• Distributed deadlock:
  • Since transactions are processed at multiple sites, two or more sites may get involved in a deadlock. This must be resolved in a distributed manner.
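To illustrate why a deadlock can span sites, here is a minimal sketch of one common detection approach: merge each site's local wait-for edges into a global graph and look for a cycle. The slides do not prescribe this method; the function name `find_deadlock` and the edge format `(waiting_txn, holding_txn)` are assumptions made for illustration.

```python
from collections import defaultdict


def find_deadlock(local_wait_for_graphs):
    """Merge per-site wait-for edges and return one cycle of transactions, or None.

    local_wait_for_graphs maps a site name to a list of (waiter, holder) pairs
    (a hypothetical format chosen for this sketch).
    """
    graph = defaultdict(set)
    for edges in local_wait_for_graphs.values():
        for waiter, holder in edges:
            graph[waiter].add(holder)

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = defaultdict(int)

    def visit(node, path):
        color[node] = GREY
        path.append(node)
        for nxt in sorted(graph[node]):
            if color[nxt] == GREY:            # back edge: a cycle closes here
                return path[path.index(nxt):]
            if color[nxt] == WHITE:
                cycle = visit(nxt, path)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in sorted(graph):
        if color[node] == WHITE:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None
```

With `{"site1": [("T1", "T2")], "site2": [("T2", "T1")]}` neither site sees a cycle locally, but the merged graph contains one, which is why the resolution has to involve more than one site.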
Concurrency Control and Recovery
• Distributed concurrency control based on a distinguished copy of a data item
  • Primary site technique: A single site is designated as the primary site, which serves as a coordinator for transaction management.
• Transaction management:
  • Concurrency control and commit are managed by this site.
  • In two-phase locking, this site manages locking and releasing data items. If all transactions follow the two-phase policy at all sites, then serializability is guaranteed.
[Figure: a primary site shown together with Site 1, Site 2, Site 3, and Site 5.]
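The primary site technique can be sketched as a single lock table that every transaction must go through. The class below is a simplified illustration, not the scheme's full definition: it assumes exclusive locks only, and the class name `PrimarySiteLockManager` and its methods are hypothetical. It also rejects any lock request made after a transaction has released a lock, which is the two-phase rule.

```python
class PrimarySiteLockManager:
    """Hypothetical central lock table held at the primary site (exclusive locks only)."""

    def __init__(self):
        self.owner = {}          # data item -> transaction holding its lock
        self.shrinking = set()   # transactions that have already released a lock

    def lock(self, txn, item):
        """Return True if the lock is granted, False if the caller must wait."""
        if txn in self.shrinking:
            # Two-phase rule: no new locks once the shrinking phase has begun.
            raise RuntimeError(f"{txn} violates two-phase locking: lock after unlock")
        holder = self.owner.get(item)
        if holder is not None and holder != txn:
            return False
        self.owner[item] = txn
        return True

    def unlock(self, txn, item):
        if self.owner.get(item) != txn:
            raise RuntimeError(f"{txn} does not hold the lock on {item}")
        del self.owner[item]
        self.shrinking.add(txn)  # the growing phase is over for this transaction
```

Because every request goes to one site, the lock table is a single point of coordination; the same property makes that site a bottleneck and a single point of failure.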
• Transaction that updates data on two or more systems
• Challenge
  • Handle machine, software, & network failures while preserving transaction integrity
• Each computer runs a transaction manager
  • Responsible for subtransactions on that system
  • Transaction managers communicate with other transaction managers
  • Performs prepare, commit, and abort calls for subtransactions
• Every subtransaction must agree to commit changes before the transaction can complete
Transactional Commits are Consensus
• Remember consensus?
  • Agree on a value proposed by at least one process
• The coordinator proposes to commit a transaction
  • Participants will agree ⇒ all participants then commit
  • Participants will not agree ⇒ all participants then abort
• Here, we need unanimous agreement to commit
Two-Phase Commit Protocol
• Goal:
  • Reliably agree to commit or abort a collection of sub-transactions
• All processes in the transaction will agree to commit or abort
• One transaction manager is elected as a coordinator; the rest are participants
• Assume:
  • Write-ahead log in stable storage
  • No system dies forever
  • Systems can always communicate with each other
A single “no” response means that we will have to abort the transaction.
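The unanimous-vote rule can be sketched in a few lines. This is a failure-free illustration only: the names `Participant`, `prepare`, `finish`, and `two_phase_commit` are hypothetical, and the in-memory `log` lists merely stand in for the write-ahead log on stable storage assumed above.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Participant:
    """Hypothetical participant; `log` stands in for a write-ahead log."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []

    def prepare(self):
        # Phase 1: record the vote before replying to the coordinator.
        vote = Vote.YES if self.can_commit else Vote.NO
        self.log.append(("vote", vote.value))
        return vote

    def finish(self, decision):
        # Phase 2: apply the coordinator's decision.
        self.log.append(("decision", decision))


def two_phase_commit(participants):
    coordinator_log = []
    # Phase 1: ask every participant to vote.
    votes = [p.prepare() for p in participants]
    # A single "no" is enough to abort; commit requires unanimity.
    decision = "commit" if all(v is Vote.YES for v in votes) else "abort"
    coordinator_log.append(("decision", decision))  # logged before anyone is told
    # Phase 2: tell every participant the outcome.
    for p in participants:
        p.finish(decision)
    return decision
```

For example, `two_phase_commit([Participant("A"), Participant("B", can_commit=False)])` returns `"abort"`, and both participants log the abort decision.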
• Failure during Phase 2 (commit/abort)
  • Coordinator dies
    • Some participants may have been given commit/abort instructions
    • Coordinator restarts; checks log; informs all participants of chosen action
  • Participant dies
    • The participant may have died before or after getting the commit/abort request
    • Coordinator keeps trying to contact the participant with the request
    • Participant recovers; checks log; gets request from coordinator
      • If it committed/aborted, acknowledge the request
      • Otherwise, process the commit/abort request and send back the acknowledgement
• Another system can take over for the coordinator
  • Could be a participant that detected a timeout to the coordinator
• Recovery node needs to find the state of the protocol
  • Contact ALL participants to see how they voted
  • If we get voting results from all participants
    • We know that Phase 1 has completed
    • If all participants voted to commit ⇒ send commit request
    • Otherwise send abort request
  • If ANY participant has not voted
    • We know that Phase 1 has not completed
    • Restart the protocol
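The recovery node's decision rule above can be written as a small function. This is a sketch under the assumption that the recovery node has already collected each participant's vote; the function name `recovery_decision` and the `"yes"` / `"no"` / `None` encoding are hypothetical.

```python
def recovery_decision(votes):
    """Decide what a recovery coordinator should do next.

    votes maps a participant name to "yes", "no", or None (has not voted yet).
    """
    if any(v is None for v in votes.values()):
        # Phase 1 never completed, so no decision can have been made.
        return "restart"
    if all(v == "yes" for v in votes.values()):
        return "commit"
    return "abort"
```

For instance, `recovery_decision({"A": "yes", "B": None})` returns `"restart"`, while `recovery_decision({"A": "yes", "B": "no"})` returns `"abort"`.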
• Biggest problem: it’s a blocking protocol
  • If the coordinator crashes, participants have no idea whether to commit or abort
    • A recovery coordinator helps in some cases
  • A non-responding participant will also result in blocking
• When a participant gets a commit/abort message, it does not know if every other participant was informed of the result
Three-Phase Commit Protocol
• Same setup as the two-phase commit protocol:
  • Coordinator & Participants
• Add timeouts to each phase that result in an abort
• Propagate the result of the commit/abort vote to each participant before telling them to act on it
  • This will allow us to recover the state if any participant dies
• Split the second phase of 2PC into two parts:
  • 2a. “Precommit” (prepare to commit) phase
    • The coordinator sends a Prepare message to all participants once it has received a yes from all participants in phase 1
    • Participants can prepare to commit but cannot do anything that cannot be undone
    • Participants reply with an acknowledgement
    • Purpose: let every participant know the result of the vote so that the state can be recovered if anyone dies
  • 2b. “Commit” phase (same as in 2PC)
    • If the coordinator gets ACKs for all “precommit” messages
      • It will send a commit message
    • Else it will abort: send an abort message to all participants
• Phase 1: voting phase
  • Coordinator sends canCommit? queries to participants & gets responses
  • Purpose: find out if everyone agrees to commit
  • If the coordinator gets a timeout from any participant, or any NO replies are received
    • Send an abort to all participants
  • If a participant times out waiting for a request from the coordinator
    • It aborts itself (assume coordinator crashed)
  • Else continue to phase 2
• Phase 2: prepare to commit phase
  • Send a Prepare message to all participants. Get OK messages from all participants
  • Purpose: let all participants know the decision to commit
  • If coordinator times out: assume participant crashed, send abort to all participants
    • The coordinator cannot count on every participant having received the Prepare message
• Phase 3: finalize
  • Send commit messages to participants and get responses from all
  • If participant times out: contact any other participant and move to that state (commit or abort)
  • If coordinator times out: that’s ok
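The three phases and their timeout rules can be sketched from the coordinator's point of view. This is a simplified, single-process illustration: the class `Participant`, its `fail_in` parameter (the phase in which it stops responding), and the helper `recover_from_peer` are hypothetical names, and a `TimeoutError` stands in for a real network timeout.

```python
class Participant:
    """Hypothetical participant that can be told to stop responding in one phase."""

    def __init__(self, name, vote_yes=True, fail_in=None):
        self.name = name
        self.vote_yes = vote_yes
        self.fail_in = fail_in      # 1, 2, or 3: phase in which it times out
        self.state = "init"

    def _maybe_timeout(self, phase):
        if self.fail_in == phase:
            raise TimeoutError(f"{self.name} did not respond in phase {phase}")

    def can_commit(self):
        self._maybe_timeout(1)
        self.state = "ready" if self.vote_yes else "aborted"
        return self.vote_yes

    def prepare(self):
        self._maybe_timeout(2)
        self.state = "prepared"

    def commit(self):
        self._maybe_timeout(3)
        self.state = "committed"

    def abort(self):
        # Simplification: an unresponsive participant is assumed to apply
        # the abort when it recovers.
        self.state = "aborted"


def three_phase_commit(participants):
    def abort_all():
        for p in participants:
            p.abort()
        return "abort"

    # Phase 1: voting. Any NO reply or any timeout aborts the transaction.
    try:
        if not all([p.can_commit() for p in participants]):
            return abort_all()
    except TimeoutError:
        return abort_all()

    # Phase 2: prepare to commit. A timeout here still leads to abort.
    try:
        for p in participants:
            p.prepare()
    except TimeoutError:
        return abort_all()

    # Phase 3: finalize. A coordinator-side timeout is tolerated.
    for p in participants:
        try:
            p.commit()
        except TimeoutError:
            pass
    return "commit"


def recover_from_peer(participant, peer):
    """A participant that timed out in phase 3 adopts a peer's final state."""
    if peer.state in ("committed", "aborted"):
        participant.state = peer.state
```

The design point is that by phase 3 every participant already knows the vote succeeded, so a straggler can safely learn the outcome from any peer instead of blocking on the coordinator.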
• Result
  • The result of this query will have 10,000 tuples, assuming that every employee is related to a department.
  • Suppose each result tuple is 40 bytes long. The query is submitted at site 3 and the result is sent to this site.
  • Problem: Employee and Department relations are not present at site 3.
• Strategies:
  1. Transfer Employee and Department to site 3.
     • Total bytes transferred = 1,000,000 + 3,500 = 1,003,500 bytes.
  2. Transfer Employee to site 2, execute the join at site 2, and send the result to site 3.
     • Query result size = 40 * 10,000 = 400,000 bytes. Total bytes transferred = 400,000 + 1,000,000 = 1,400,000 bytes.
  3. Transfer Department relation to site 1, execute the join at site 1, and send the result to site 3.
     • Total bytes transferred = 400,000 + 3,500 = 403,500 bytes.
• Optimization criterion: minimizing data transfer.
• Preferred approach: strategy 3.
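The arithmetic behind the three strategies can be checked with a short script. The relation sizes are taken from the totals above (1,000,000 bytes for Employee and 3,500 bytes for Department); the constant and function names are chosen here for illustration only.

```python
EMPLOYEE_BYTES = 1_000_000    # size of Employee, as used in the totals above
DEPARTMENT_BYTES = 3_500      # size of Department, as used in the totals above
RESULT_TUPLE_BYTES = 40       # each result tuple is 40 bytes long


def transfer_costs(result_tuples):
    """Return bytes transferred for strategies 1, 2, and 3."""
    result_bytes = result_tuples * RESULT_TUPLE_BYTES
    return {
        1: EMPLOYEE_BYTES + DEPARTMENT_BYTES,   # ship both relations to site 3
        2: EMPLOYEE_BYTES + result_bytes,       # ship Employee, then the result
        3: DEPARTMENT_BYTES + result_bytes,     # ship Department, then the result
    }


costs = transfer_costs(10_000)
best = min(costs, key=costs.get)
print(costs)  # {1: 1003500, 2: 1400000, 3: 403500}
print(best)   # 3
```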
Query Processing in Distributed Databases
• Consider the query
  • Q’: For each department, retrieve the department name and the name of the department manager
• Relational algebra expression:
  • $\pi_{\text{Fname},\,\text{Lname},\,\text{Dname}}(\text{Employee} \bowtie_{\text{Mgrssn} = \text{SSN}} \text{Department})$
• The result of this query will have 100 tuples, assuming that every department has a manager. The execution strategies are:
  1. Transfer Employee and Department to the result site and perform the join at site 3.
     • Total bytes transferred = 1,000,000 + 3,500 = 1,003,500 bytes.
  2. Transfer Employee to site 2, execute the join at site 2, and send the result to site 3. Query result size = 40 * 100 = 4,000 bytes.
     • Total bytes transferred = 4,000 + 1,000,000 = 1,004,000 bytes.
  3. Transfer Department relation to site 1, execute the join at site 1, and send the result to site 3.
     • Total bytes transferred = 4,000 + 3,500 = 7,500 bytes.
• Preferred strategy: choose strategy 3.
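The three strategies differ only in where the join runs, so the comparison can be sketched as a search over candidate join sites. The placement of Employee at site 1 and Department at site 2 is inferred from the strategy descriptions above, and the function name `cheapest_join_site` is hypothetical.

```python
def cheapest_join_site(relations, result_bytes, result_site, sites):
    """Pick the join site that minimizes bytes transferred.

    relations maps a relation name to (home_site, size_in_bytes).
    """
    costs = {}
    for join_site in sites:
        # Ship every relation that is not already stored at the join site.
        shipped = sum(size for home, size in relations.values() if home != join_site)
        # Ship the result if the join did not run at the result site.
        if join_site != result_site:
            shipped += result_bytes
        costs[join_site] = shipped
    best = min(costs, key=costs.get)
    return best, costs


relations = {"Employee": (1, 1_000_000), "Department": (2, 3_500)}
best, costs = cheapest_join_site(
    relations, result_bytes=100 * 40, result_site=3, sites=[1, 2, 3]
)
print(best)   # 1, i.e. join at site 1 (strategy 3)
print(costs)  # {1: 7500, 2: 1004000, 3: 1003500}
```

Joining at site 1 corresponds to strategy 3, joining at site 2 to strategy 2, and joining at site 3 to strategy 1.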