Unit-5 + DDBMS
UNIT-5
DDBS
Outline
Vijay Katta 1
Vijay Katta 12/11/2024
Introduction
• Concurrency control is the activity of coordinating
concurrent accesses to a database in a multi-user
database management system (DBMS)
• Several problems
1. The lost update problem.
2. The temporary update problem
3. The incorrect summary problem
• Serializability Theory.
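The lost update problem listed above can be illustrated with a small deterministic sketch. The account balance, the amounts, and the function names are illustrative assumptions, not part of the original slides; the interleaving is written out by hand rather than produced by real concurrent threads.

```python
def interleaved_run(balance):
    """T1 withdraws 30 and T2 deposits 50, with reads and writes interleaved."""
    t1_local = balance        # T1 reads the balance
    t2_local = balance        # T2 reads the same value before T1 writes
    balance = t1_local - 30   # T1 writes its result
    balance = t2_local + 50   # T2 overwrites it, so T1's update is lost
    return balance


def serial_run(balance):
    """The same two transactions executed one after the other."""
    balance = balance - 30    # T1 runs to completion
    balance = balance + 50    # then T2 runs
    return balance


print(interleaved_run(100))  # 150
print(serial_run(100))       # 120
```

The interleaved run ends at 150 instead of 120 because T2 computed its result from a value that T1 had already replaced.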
Vijay Katta, Hindustan College of Sci & Tech, Mathura 3
Architecture of DDBS
Architecture:
• Autonomy(A) : Controller
• 1 – Tight Integration
• 2 – Semi-autonomous System
• 3 - Isolation
• Heterogeneity(H):
• 1 – Homogeneous
• 2 - Heterogeneous
• Distribution(D): Data Management
• 1 – No Distribution
• 2 – Client – server Architecture
• 3 – Peer-to-peer Architecture
Autonomy(A) : Controller
• Distribution of control
• Degree of independence
Autonomy(A) :
Tight integration:
Autonomy(A) :
• Semiautonomous systems:
• Total isolation:
the individual systems are stand-alone DBMSs, which
know neither of the existence of other DBMSs nor how to
communicate with them; there is no global control.
Autonomy(A) :
Distribution:
• Distribution: Refers to the physical distribution of data over multiple
sites.
1. – No distribution: No distribution of data at all
2. – Client/Server distribution:
Data are concentrated on the server, while clients provide the
application environment/user interface. First attempt at distribution
3. – Peer-to-peer distribution (also called full distribution):
∗ No distinction between client and server machine
∗ Each machine has full DBMS functionality
Heterogeneity:
Heterogeneous DDBMS
Homogeneous DDBMS
• Homogeneous DDBMS
– All sites use same DBMS product
– It is much easier to design and manage
– The approach provides incremental growth and
allows increased performance
Issues in DDBMS
• Data Planning
• Query Optimization and Decomposition
• Distributed Transaction Management
• Fault Tolerance and Reliability
• Networking
Transactions
Remote Transaction
• A remote transaction contains one or more
remote statements, all of which reference a single
remote node.
• e.g.
UPDATE [email protected]_auto.com SET loc = 'NEW YORK'
WHERE deptno = 10;
Distributed Transaction
ACID property
ACID Properties
(Atomicity: )
ACID Properties
(Consistency: )
ACID Properties
(Isolation: )
ACID Properties
(Durability: )
• Durability: Once a transaction commits, all of its
updates to the database persist, even if the system
fails and restarts.
• If a transaction writes or updates some data in the
database and commits, that data will always be there
in the database.
• If the transaction commits but its data has not yet been
written to disk when the system fails, the updates are
applied once the system comes back up.
Serializability
• When more than one transaction is executed by the operating system in a
multiprogramming environment, the instructions of one transaction may be
interleaved with those of another transaction.
• Schedule:
• A chronological execution sequence of transactions is called a schedule.
• A schedule can contain many transactions, each comprising a number of
instructions/tasks.
• Serial Schedule:
• A schedule in which the transactions are arranged so that one transaction
is executed first.
• When the first transaction completes its cycle, the next transaction is executed.
Transactions are ordered one after another. This type of schedule is called a serial
schedule because the transactions are executed in a serial manner.
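The definition of a serial schedule can be checked mechanically. The sketch below assumes a schedule is represented as a list of (transaction, operation) pairs; that representation and the name is_serial are illustrative choices, not something the slides prescribe.

```python
def is_serial(schedule):
    """Return True if every transaction's operations form one contiguous block."""
    finished = set()   # transactions whose block has already ended
    current = None     # transaction currently executing
    for txn, _operation in schedule:
        if txn != current:
            if txn in finished:
                return False          # txn resumes after another one ran: interleaved
            if current is not None:
                finished.add(current)
            current = txn
    return True


serial = [("T1", "read(X)"), ("T1", "write(X)"), ("T2", "read(X)"), ("T2", "write(X)")]
interleaved = [("T1", "read(X)"), ("T2", "read(X)"), ("T1", "write(X)"), ("T2", "write(X)")]

print(is_serial(serial))       # True
print(is_serial(interleaved))  # False
```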
• Transaction structures:
Flat:
Begin_transaction
  T1();
  T2();
End_transaction
Nested:
Begin_transaction
  Begin_transaction T1
    Begin_transaction T2
      …… T3(); ……
    End_transaction T2
  End_transaction T1
End_transaction
Nested transactions
Transaction Processing
Distributed Transactions
Transaction System Architecture
Centralized Transaction Execution
DDBS Architecture
Processing Operation
Scheduling Algorithms
A lock
• A lock is a system object associated with a shared
resource such as a data item of an elementary type, a
row in a database, or a page of memory.
• In a database, a lock on a database object (a data-access lock)
may need to be acquired by a transaction before accessing the
object.
• Correct use of locks prevents undesired, incorrect or
inconsistent operations on shared resources by other
concurrent transactions.
• When a database object with an existing lock acquired by one
transaction needs to be accessed by another transaction, the
existing lock for the object and the type of the intended
access are checked by the system.
Single-Lock-Manager Approach
• System maintains a single lock manager that resides in a
single chosen site, say Si
• When a transaction needs to lock a data item, it sends a lock
request to Si and lock manager determines whether the lock can be
granted immediately
• If yes, lock manager sends a message to the site which initiated the
request
• If no, request is delayed until it can be granted, at which time a message
is sent to the initiating site
Single-Lock-Manager Approach (Cont.)
• The transaction can read the data item from any one of the sites at which
a replica of the data item resides.
• Writes must be performed on all replicas of a data item
• Advantages of scheme:
• Simple implementation
• Simple deadlock handling
• Disadvantages of scheme are:
• Bottleneck: lock manager site becomes a bottleneck
• Vulnerability: system is vulnerable to lock manager site failure.
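The grant-or-delay behaviour of the single lock manager can be sketched as follows. This is a minimal in-memory illustration that assumes exclusive locks only and replaces network messages with return values; the class and method names are hypothetical.

```python
from collections import deque


class SingleLockManager:
    """Lock manager residing at one chosen site (exclusive locks only, an assumption)."""

    def __init__(self):
        self.holder = {}    # data item -> transaction currently holding the lock
        self.waiting = {}   # data item -> queue of delayed transactions

    def request(self, txn, item):
        """Return True if the lock is granted immediately, False if the request is delayed."""
        if item not in self.holder:
            self.holder[item] = txn
            return True
        self.waiting.setdefault(item, deque()).append(txn)
        return False

    def release(self, txn, item):
        """Release the lock and return the delayed transaction granted next, or None."""
        if self.holder.get(item) != txn:
            raise ValueError(f"{txn} does not hold the lock on {item}")
        queue = self.waiting.get(item)
        if queue:
            next_txn = queue.popleft()
            self.holder[item] = next_txn   # the delayed request is granted now
            return next_txn
        del self.holder[item]
        return None


manager = SingleLockManager()
print(manager.request("T1", "Q"))   # True: granted immediately
print(manager.request("T2", "Q"))   # False: delayed until T1 releases
print(manager.release("T1", "Q"))   # T2: the lock passes to the waiting transaction
```

Because every request goes through this one object, it also shows why the lock manager site becomes both a bottleneck and a single point of failure.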
• Quorum consensus
Primary Copy
• Choose one replica of data item to be the primary copy.
• Site containing the replica is called the primary site for that data
item
• Different data items can have different primary sites
• When a transaction needs to lock a data item Q, it requests a
lock at the primary site of Q.
• Implicitly gets lock on all replicas of the data item
• Benefit
• Concurrency control for replicated data handled similarly to unreplicated
data - simple implementation.
• Drawback
• If the primary site of Q fails, Q is inaccessible even though other sites
containing a replica may be accessible.
Locking Protocols
• Majority Protocol
➢ Local lock manager at each site administers lock and
unlock requests for data items stored at that site.
❑ In case of unreplicated data:
When a transaction wishes to lock an unreplicated
data item Q residing at site Si, a message is sent to Si's
lock manager.
• If Q is locked in an incompatible mode, then the request is delayed until
it can be granted.
• When the lock request can be granted, the lock manager sends a
message back to the initiator indicating that the lock request has been
granted.
• Drawback
• Potential for deadlock even with a single item - e.g., each of 3 transactions
may hold locks on 1/3rd of the replicas of a data item.
Biased Protocol
• Local lock manager at each site as in majority protocol, however,
requests for shared locks are handled differently than requests for
exclusive locks.
• Shared locks (read locks): When a transaction needs to lock data
item Q, it simply requests a lock on Q from the lock manager at one
site containing a replica of Q.
• Exclusive locks (write locks): When a transaction needs to lock data
item Q, it requests a lock on Q from the lock manager at all sites
containing a replica of Q.
• Advantage - imposes less overhead on read operations.
• Disadvantage - additional overhead on writes
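The asymmetry between shared and exclusive requests in the biased protocol can be sketched as a function that returns the sites to contact. The function name, the mode strings, and the choice of the first replica for reads are assumptions made for illustration.

```python
def sites_to_lock(mode, replica_sites):
    """Return the sites whose lock managers must grant a lock under the biased protocol."""
    if not replica_sites:
        raise ValueError("the data item has no replicas")
    if mode == "shared":
        # A read needs a lock at only one replica; picking the first is an arbitrary choice.
        return list(replica_sites[:1])
    if mode == "exclusive":
        # A write needs a lock at every site holding a replica.
        return list(replica_sites)
    raise ValueError(f"unknown lock mode: {mode}")


replicas_of_q = ["S1", "S2", "S3"]
print(sites_to_lock("shared", replicas_of_q))     # ['S1']
print(sites_to_lock("exclusive", replicas_of_q))  # ['S1', 'S2', 'S3']
```

The lengths of the two results show the trade-off directly: one message for a read, one message per replica for a write.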
Centralized 2PL
Distributed 2PL
Timestamping
• Timestamp based concurrency-control protocols can be used
in distributed systems
• Each transaction must be given a unique timestamp
• Main problem: how to generate a timestamp in a distributed
fashion
• Each site generates a unique local timestamp using either a logical
counter or the local clock.
• A globally unique timestamp is obtained by concatenating the unique
local timestamp with the unique site identifier.
Timestamping (Cont.)
• A site with a slow clock will assign smaller
timestamps
• Still logically correct: serializability not affected
• But: transactions from that site are "disadvantaged"
• To fix this problem
• Define within each site Si a logical clock (LCi), which generates the unique
local timestamp
• Require that Si advance its logical clock whenever a request is received from a
transaction Ti with timestamp <x,y> and x is greater than the current value of
LCi.
• In this case, site Si advances its logical clock to the value x + 1.
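The timestamp generation and clock-advance rule can be sketched as below. The sketch assumes the logical clock is a simple counter and models the concatenated timestamp <x,y> as a Python tuple (local counter, site id), so that ordinary tuple comparison orders timestamps; the class and method names are hypothetical.

```python
class Site:
    """A site Si with a logical clock LCi that issues globally unique timestamps."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.clock = 0                       # logical clock LCi

    def new_timestamp(self):
        """Issue <x, y>: local counter x first, site identifier y as tie-breaker."""
        self.clock += 1
        return (self.clock, self.site_id)

    def receive(self, timestamp):
        """Advance LCi to x + 1 when a request arrives with x greater than LCi."""
        x, _site = timestamp
        if x > self.clock:
            self.clock = x + 1


fast, slow = Site(1), Site(2)
for _ in range(5):
    ts = fast.new_timestamp()    # the fast site ends up at (5, 1)

slow.receive(ts)                 # the slow site jumps from 0 to 6
print(slow.new_timestamp())      # (7, 2)
```

Without the receive step, the slow site would have issued (1, 2), which is smaller than every timestamp the fast site has already handed out.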
Timestamp Ordering
Timestamp Ordering
• Given a transaction T
• If T wants to read(X)
• If TS(T) < WTS(X) then read is rejected, T has to
abort
• Else, read is accepted and RTS(X) updated.
Timestamp Ordering
• If T wants to write(X)
• If TS(T) < RTS(X) then write is rejected, T has to
abort
• If TS(T) < WTS(X) then write is rejected, T has to
abort
• Else, allow the write, and update WTS(X)
accordingly
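The read and write rules above translate almost directly into code. In this sketch, timestamps are plain integers, RTS(X) and WTS(X) start at 0, and a rejected operation is signalled by returning False instead of actually aborting a transaction; all names are illustrative assumptions.

```python
class Item:
    """A data item X carrying its read and write timestamps."""

    def __init__(self):
        self.rts = 0   # RTS(X): largest timestamp of a transaction that read X
        self.wts = 0   # WTS(X): timestamp of the transaction that last wrote X


def read(ts, item):
    """Apply the read rule; return False if T must abort."""
    if ts < item.wts:
        return False                     # X was already written by a younger transaction
    item.rts = max(item.rts, ts)
    return True


def write(ts, item):
    """Apply the write rule; return False if T must abort."""
    if ts < item.rts or ts < item.wts:
        return False                     # a younger transaction already read or wrote X
    item.wts = ts
    return True


x = Item()
print(write(5, x))   # True:  WTS(X) becomes 5
print(read(3, x))    # False: TS(T) = 3 < WTS(X) = 5
print(read(7, x))    # True:  RTS(X) becomes 7
print(write(6, x))   # False: TS(T) = 6 < RTS(X) = 7
```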
Hybrid
Atomic Commitment
• Transaction commit
- consistent termination of a transaction is an issue
Properties of ACP
• AC1: consistent termination
- all sites that reach a decision reach the same one
• AC2: irrevocable decision
- a site cannot reverse its decision after it is made
• AC3: unanimous consent
- everyone must agree before anyone can commit
• AC4: exclusion of trivial protocols
- if no failure and all sites vote yes, decision must be commit
- avoids useless protocol where everyone always decides to abort
• AC5: finite time in decision making
- if failures can be tolerated by ACP, all sites eventually reach a
decision in a finite time
• Phase 2: If all the votes are YES, the coordinator sends COMMIT to all
participants and commits the transaction; otherwise it sends ABORT
and aborts the transaction.
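The Phase 2 decision rule can be sketched as follows. Phase 1 does not survive in the text above, so the vote-collection step here is an assumption based on the usual two-phase commit description; participants are modelled as hypothetical callables that return "YES" or "NO", and failures and timeouts are not modelled.

```python
def two_phase_commit(participants):
    """Return the coordinator's decision and the message sent to each participant."""
    # Phase 1 (assumed): the coordinator asks every participant for its vote.
    votes = {name: vote() for name, vote in participants.items()}

    # Phase 2: COMMIT only if every vote is YES, otherwise ABORT.
    decision = "COMMIT" if all(v == "YES" for v in votes.values()) else "ABORT"
    messages = {name: decision for name in participants}
    return decision, messages


all_yes = {"S1": lambda: "YES", "S2": lambda: "YES"}
one_no = {"S1": lambda: "YES", "S2": lambda: "NO"}

print(two_phase_commit(all_yes)[0])  # COMMIT
print(two_phase_commit(one_no)[0])   # ABORT
```

Every participant receives the same decision, which corresponds to property AC1, and a single NO vote forces ABORT, which corresponds to AC3.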
Coordinator Selection
• Backup coordinators
• Site which maintains enough information locally to assume the role of
coordinator if the actual coordinator fails
• Executes the same algorithms and maintains the same internal state
information as the actual coordinator
• Allows fast recovery from coordinator failure but involves overhead
during normal processing.
• Election algorithms
• Used to elect a new coordinator in case of failures
• Example: Bully Algorithm - applicable to systems where every site
can send a message to every other site.
Bully Algorithm
• If site Si sends a request that is not answered by the coordinator
within a time interval T, it assumes that the coordinator has failed, and Si
tries to elect itself as the new coordinator.
• Si sends an election message to every site with a higher
identification number, then waits for any of these sites to
answer within T.
• If there is no response within T, Si assumes that all sites with a number
greater than i have failed and elects itself the new coordinator.
• If an answer is received, Si begins a time interval T', waiting to receive
a message that a site with a higher identification number has
been elected.
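The election steps can be simulated in a few lines. This sketch replaces messages and the timeouts T and T' with a membership test on a set of live site numbers, and it assumes that a higher-numbered site that answers goes on to run its own election, which is the usual continuation of the bully algorithm but is not spelled out in the steps above. The function name is hypothetical.

```python
def bully_election(initiator, alive):
    """Return the number of the site elected coordinator when `initiator` starts an election."""
    if initiator not in alive:
        raise ValueError("a failed site cannot start an election")

    # Election messages go to every site with a higher identification number;
    # only live sites answer within T.
    responders = [site for site in alive if site > initiator]
    if not responders:
        return initiator             # no answer within T: the initiator elects itself

    # Assumption: a site that answers takes over and runs its own election.
    return bully_election(min(responders), alive)


# Sites 1..5 exist; coordinator 5 has failed and site 2 notices first.
print(bully_election(2, {1, 2, 3, 4}))  # 4
```

The highest-numbered live site always wins, regardless of which site starts the election.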
DBMS software
• #1) SolarWinds Database Performance Analyzer
• #2) DbVisualizer
• #3) ManageEngine Applications Manager
• #4) Oracle RDBMS
• #5) IBM DB2
• #6) Microsoft SQL
• #10) MySQL
• #11) FileMaker
• #12) Microsoft Access
• #13) Informix
• #14) SQLite
• #15) PostgreSQL
• #16) Amazon RDS
• #17) MongoDB
• #18) Redis
• #19) CouchDB
• #22) Couchbase
• #23) Toad
• #24) phpMyAdmin
• #25) SQL Developer
• #26) Sequel PRO
• #27) Robomongo
• #28) Hadoop HDFS
• #29) Cloudera
• #30) MariaDB
Thank You…
Any Questions…???