Unit-5 + DDBMS

Distributed system


Vijay Katta 12/11/2024

UNIT-5
DDBS

Vijay Katta, Hindustan College of Sci & Tech, Mathura 1

Outline

• Distributed Database Management System (DDBMS)
• Concurrency Control Models (CC)
• Concurrency Control Protocols
• Deadlock Management in DDBMS


Introduction
• Concurrency control is the activity of coordinating
concurrent accesses to a database in a multi-user
database management system (DBMS)

• Without concurrency control, several problems can arise:
1. The lost update problem
2. The temporary update (dirty read) problem
3. The incorrect summary problem

• Serializability Theory.
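
The lost update problem listed above can be illustrated with a small simulation. This is only a sketch under assumed values: the transaction names T1 and T2, the starting balance of 100 and the `run` helper are hypothetical and not part of the original slides.

```python
# Hypothetical example: T1 and T2 both deposit into the same account.
def run(schedule, balance=100):
    """Replay a schedule of (txn, op, amount) steps against one shared balance."""
    local = {}  # each transaction's private copy of the value it read
    for txn, op, amount in schedule:
        if op == "read":
            local[txn] = balance
        elif op == "add":
            local[txn] += amount
        elif op == "write":
            balance = local[txn]
    return balance

serial = [
    ("T1", "read", 0), ("T1", "add", 50), ("T1", "write", 0),
    ("T2", "read", 0), ("T2", "add", 30), ("T2", "write", 0),
]
interleaved = [
    ("T1", "read", 0), ("T2", "read", 0),
    ("T1", "add", 50), ("T2", "add", 30),
    ("T1", "write", 0), ("T2", "write", 0),  # T2 overwrites T1's update
]

print(run(serial))       # 180
print(run(interleaved))  # 130: T1's deposit of 50 is lost
```

In the interleaved schedule both transactions read the old balance, so the later write silently discards the earlier one.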

Distributed Database Management System (DDBMS)
• A distributed database is a collection of multiple, logically interrelated
databases distributed over a computer network.

• A distributed database management system (DDBMS) is the software system
that permits the management of the distributed database and makes the
distribution transparent to the users.

Architecture of
DDBS


Architecture:

The architecture of a system defines its structure:

• the components of the system are identified;
• the function of each component is specified;
• the interrelationships and interactions among the components are defined.


Architectural Models for DDBMS

• Autonomy (A): distribution of control
  1. Tight integration
  2. Semi-autonomous system
  3. Isolation
• Heterogeneity (H):
  1. Homogeneous
  2. Heterogeneous
• Distribution (D): data management
  1. No distribution
  2. Client-server architecture
  3. Peer-to-peer architecture


Autonomy(A) : Controller

• Refers to the distribution of control (not of data) and indicates the
degree to which individual DBMSs can operate independently.

• Distribution of control

• Degree of independence


Autonomy(A) :

Tight integration:

• A single image of the entire database is available to any user who wants
to share the information (which may reside in multiple DBs).
• Realized such that one data manager is in control of the processing of
each user request.


Autonomy(A) :
• Semiautonomous systems:
individual DBMSs can operate independently, but have decided to participate
in a federation to make some of their local data sharable.

• Total isolation:
the individual systems are stand-alone DBMSs, which know neither of the
existence of other DBMSs nor how to communicate with them; there is no
global control.


Autonomy(A) :

• Design autonomy: each individual DBMS is free to use the data models and
transaction management techniques that it prefers.
• Communication autonomy: each individual DBMS is free to decide what
information to provide to the other DBMSs.
• Execution autonomy: each individual DBMS can execute the transactions
that are submitted to it in any way that it wants to.

Distribution:
• Distribution: refers to the physical distribution of data over multiple sites.
1. No distribution: the data are not distributed at all.
2. Client/server distribution: data are concentrated on the server, while
clients provide the application environment and user interface. This was the
first attempt at distribution.
3. Peer-to-peer distribution (also called full distribution):
∗ No distinction between client and server machines
∗ Each machine has full DBMS functionality

Heterogeneity:

• Heterogeneity: refers to heterogeneity of the components at various levels
1. hardware
2. communications
3. operating system
4. DB components (e.g., data model, query language, transaction management
algorithms)


Heterogeneous DDBMS

– Sites may run different DBMS products, with possibly different underlying
data models
– This occurs when sites have implemented their own databases first, and
integration is considered later
– Translations are required to allow for different hardware and/or different
DBMS products
– Typical solution is to use gateways


Homogeneous DDBMS

• Homogeneous DDBMS
– All sites use same DBMS product
– It is much easier to design and manage
– The approach provides incremental growth and
allows increased performance


Issues in DDBMS

• Data Planning
• Query Optimization and Decomposition
• Distributed Transaction Management
• Fault Tolerance and Reliability
• Networking



Transactions

• A transaction is a logical unit of work made up of one or more SQL
statements executed by a single user.

• A transaction begins with the user's first executable SQL statement and
ends when it is committed or rolled back by that user.

• A transaction defines a sequence of server operations that is guaranteed
to be atomic in the presence of multiple clients and server crashes.

Transaction state in DBMS


Remote Transaction
• A remote transaction contains one or more remote statements, all of which
reference a single remote node.
• e.g.
UPDATE [email protected]_auto.com SET loc = 'NEW YORK'
WHERE deptno = 10;

UPDATE [email protected]_auto.com SET deptno = 11
WHERE deptno = 10;
COMMIT;


Distributed Transaction

• A distributed transaction is a transaction that includes one or more
statements that, individually or as a group, update data on two or more
distinct nodes of a distributed database.


ACID property


ACID Properties (Atomicity)

• Atomicity: although a transaction involves several low-level operations,
it must be treated as an atomic unit: either all of its operations are
executed or none are. The database must never be left in a state where a
transaction is partially completed.

• The database state is defined only before the transaction executes or
after it has completed, aborted, or failed.


ACID Properties (Consistency)

• Consistency: after the transaction finishes, the database must remain in
a consistent state.

• There must be no possibility that some data is incorrectly affected by
the execution of a transaction.

• If the database was in a consistent state before the execution of the
transaction, it must remain in a consistent state after the execution of the
transaction.


ACID Properties (Isolation)

• Isolation: in a database system where more than one transaction is
executing simultaneously and in parallel, each transaction is carried out
and executed as if it were the only transaction in the system.

• No transaction affects the existence of any other transaction.

ACID Properties (Durability)
• Durability: all updates made to the database by a committed transaction
persist even if the system fails and restarts.
• If a transaction writes or updates some data in the database and commits,
that data will always be there in the database.
• If the transaction commits but the data has not yet been written to disk
when the system fails, the data is updated once the system comes back up.
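
Atomicity and consistency can be seen in a short experiment with Python's built-in `sqlite3` module. This is a minimal single-site sketch, not a distributed setup; the `account` table, the account names and the `transfer` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # consistency rule: no overdraft
)
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one atomic unit; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))

print(transfer(conn, "A", "B", 500), balances(conn))  # False {'A': 100, 'B': 50}
print(transfer(conn, "A", "B", 30), balances(conn))   # True {'A': 70, 'B': 80}
```

In the failed transfer the credit to B has already been applied when the debit violates the constraint, yet the rollback leaves no trace of it.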


Serializability
• When more than one transaction is executed by the operating system in a
multiprogramming environment, the instructions of one transaction may be
interleaved with those of another transaction.
• Schedule:
• A chronological execution sequence of transactions is called a schedule.
• A schedule can contain many transactions, each comprising a number of
instructions/tasks.
• Serial Schedule:
• A schedule in which the transactions are ordered so that one transaction
is executed first.
• When the first transaction completes its cycle, the next transaction is
executed. Transactions are ordered one after another. This type of schedule
is called a serial schedule because the transactions are executed in a
serial manner.


Transactions and Concurrency Control (cont.)
• All concurrency control protocols are based on serial equivalence and are
derived from rules of conflicting operations.
• Locks are used to order transactions that access the same object according
to request order.
• Optimistic concurrency control allows transactions to proceed until they
are ready to commit, whereupon a check is made for conflicting operations
on objects.
• Timestamp ordering uses timestamps to order transactions that access the
same object according to their starting time.

Read and write operation conflict rules


Transactions & Transaction Management

• The ACID properties must still be observed in a DDBMS.

• Transaction structures:

Flat:
Begin_transaction
    T1();
    T2(); ……
End_transaction

Nested:
Begin_transaction
    Begin_transaction T1
        Begin_transaction T2
            T3(); ……
        End_transaction T2
    End_transaction T1
End_transaction

Nested transactions

• Transactions may themselves contain subtransactions (nested transactions).
• A top-level transaction may fork off children that run in parallel with each
other. Any or all of these may execute subtransactions.
• The problem is that a subtransaction may commit but, later in time,
the parent may abort. The committed subtransactions then have to be undone.
• The level of nesting (and hence the level of undoing) may be arbitrarily deep. For
this to work, conceptually, each subtransaction must be given a private copy of
every object it may manipulate. On commit, the private copy displaces its
parent's universe (which may be a private copy of that parent's parent).
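
The private-copy idea can be sketched in a few lines. This is a toy model under simplifying assumptions: the `Transaction` class and its method names are hypothetical, data lives in plain dictionaries, and children are run one after another rather than in parallel.

```python
class Transaction:
    """Toy nested transaction: each level works on a private copy of its parent's view."""

    def __init__(self, parent=None, store=None):
        self.parent = parent
        # The top level copies the shared store; a child copies its parent's workspace.
        self.base = store if parent is None else parent.workspace
        self.workspace = dict(self.base)

    def begin_child(self):
        return Transaction(parent=self)

    def write(self, key, value):
        self.workspace[key] = value

    def commit(self):
        # The private copy displaces the parent's universe (or the store at top level).
        self.base.clear()
        self.base.update(self.workspace)

    def abort(self):
        # Nothing to undo explicitly: the private copy is simply discarded.
        self.workspace = dict(self.base)

store = {"x": 1}
top = Transaction(store=store)
child = top.begin_child()
child.write("x", 2)
child.commit()                        # visible to the parent only
print(top.workspace["x"], store["x"])  # 2 1
top.abort()                           # parent aborts: the child's commit vanishes
print(top.workspace["x"], store["x"])  # 1 1
```

Because a child's commit only changes its parent's private copy, aborting the parent undoes the committed child without any extra bookkeeping.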


Transaction Processing


Distributed Transactions

• Transaction may access data at several sites.


• Each site has a local transaction manager responsible
for:
• Maintaining a log for recovery purposes
• Participating in coordinating the concurrent execution of the
transactions executing at that site.
• Each site has a transaction coordinator (scheduler),
which is responsible for:
• Starting the execution of transactions that originate at the site.
• Distributing subtransactions at appropriate sites for execution.
• Coordinating the termination of each transaction that originates
at the site, which may result in the transaction being committed at
all sites or aborted at all sites.

Transaction System
Architecture


Centralized Transaction
Execution


Distributed Transaction Execution


➢ Transaction Manager
➢ Data Manager
➢ Scheduler

DDBS Architecture

Processing Operation

Anomaly in DDBMS in Absence of Concurrency Control



Scheduling Algorithms

• There are three basic methods for transaction concurrency control, plus
hybrid schemes that combine them.

➢ Locking (two-phase locking, 2PL)
➢ Timestamp ordering
➢ Optimistic concurrency control
➢ Hybrid

Conflicting Operations (Same concept)
• When we say a pair of operations conflicts, we mean that their combined
effect depends on the order in which they are executed, e.g. a read and a
write of the same object.

• Three ways to ensure serializability:
• Locking
• Timestamp ordering
• Optimistic concurrency control
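
One standard way to test serial equivalence is to build a precedence graph from the conflicting operations and check it for cycles. The sketch below assumes a schedule is a list of `(transaction, op, item)` tuples with `op` being `"r"` or `"w"`; the function names are illustrative, not taken from the slides.

```python
from itertools import combinations

def conflicts(a, b):
    """Two operations conflict if they come from different transactions,
    touch the same item, and at least one of them is a write."""
    (t1, op1, x1), (t2, op2, x2) = a, b
    return t1 != t2 and x1 == x2 and "w" in (op1, op2)

def precedence_edges(schedule):
    """Edge Ti -> Tj for every conflicting pair where Ti's operation comes first."""
    return {(a[0], b[0]) for a, b in combinations(schedule, 2) if conflicts(a, b)}

def is_conflict_serializable(schedule):
    """True if the precedence graph is acyclic (checked by repeatedly removing sources)."""
    edges = precedence_edges(schedule)
    nodes = {op[0] for op in schedule}
    while nodes:
        sources = {n for n in nodes
                   if not any(dst == n for src, dst in edges if src in nodes)}
        if not sources:
            return False  # every remaining node has an incoming edge: a cycle
        nodes -= sources
    return True

serial_like = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]
lost_update = [("T1", "r", "x"), ("T2", "r", "x"), ("T1", "w", "x"), ("T2", "w", "x")]
print(is_conflict_serializable(serial_like))  # True
print(is_conflict_serializable(lost_update))  # False
```

The second schedule yields both T1 -> T2 and T2 -> T1, so no serial order can reproduce its effect.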


A lock
• A lock is a system object associated with a shared resource such as a data
item of an elementary type, a row in a database, or a page of memory.
• In a database, a lock on a database object (a data-access lock) may need
to be acquired by a transaction before accessing the object.
• Correct use of locks prevents undesired, incorrect or inconsistent
operations on shared resources by other concurrent transactions.
• When a database object with an existing lock acquired by one transaction
needs to be accessed by another transaction, the system checks the existing
lock on the object and the type of the intended access.

Single-Lock-Manager
Approach
• System maintains a single lock manager that resides in a
single chosen site, say Si
• When a transaction needs to lock a data item, it sends a lock
request to Si and lock manager determines whether the lock can be
granted immediately
• If yes, lock manager sends a message to the site which initiated the
request
• If no, request is delayed until it can be granted, at which time a message
is sent to the initiating site
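
The grant-or-delay behaviour of the single lock manager can be sketched as follows. This is an in-process toy, assuming shared (S) and exclusive (X) lock modes and a FIFO wait queue per item; the `LockManager` class is hypothetical and messages between sites are replaced by return values.

```python
from collections import defaultdict, deque

class LockManager:
    """Toy central lock manager at site Si with shared (S) and exclusive (X) locks."""

    def __init__(self):
        self.holders = defaultdict(dict)   # item -> {txn: mode}
        self.waiting = defaultdict(deque)  # item -> queue of (txn, mode)

    def _compatible(self, item, txn, mode):
        others = [m for t, m in self.holders[item].items() if t != txn]
        return not others or (mode == "S" and all(m == "S" for m in others))

    def request(self, txn, item, mode):
        """Return 'granted' immediately, or 'delayed' if the request must wait."""
        if not self.waiting[item] and self._compatible(item, txn, mode):
            self.holders[item][txn] = mode
            return "granted"
        self.waiting[item].append((txn, mode))
        return "delayed"

    def release(self, txn, item):
        """Release txn's lock; return the transactions whose delayed requests are now granted."""
        self.holders[item].pop(txn, None)
        granted = []
        while self.waiting[item] and self._compatible(item, *self.waiting[item][0]):
            waiter, mode = self.waiting[item].popleft()
            self.holders[item][waiter] = mode
            granted.append(waiter)
        return granted

lm = LockManager()
print(lm.request("T1", "Q", "S"))  # granted
print(lm.request("T2", "Q", "S"))  # granted
print(lm.request("T3", "Q", "X"))  # delayed
print(lm.release("T1", "Q"))       # []
print(lm.release("T2", "Q"))       # ['T3']
```

The list returned by `release` stands in for the messages that would be sent to the initiating sites once a delayed request can be granted.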

Single-Lock-Manager
Approach (Cont.)
• The transaction can read the data item from any one of the sites at which
a replica of the data item resides.
• Writes must be performed on all replicas of a data item
• Advantages of scheme:
• Simple implementation
• Simple deadlock handling
• Disadvantages of scheme are:
• Bottleneck: lock manager site becomes a bottleneck
• Vulnerability: system is vulnerable to lock manager site failure.

Distributed Lock Manager


• In this approach, functionality of locking is implemented by lock managers at
each site
• Lock managers control access to local data items. But special protocols
may be used for replicas
• Advantage: work is distributed and can be made robust to failures
• Disadvantage: deadlock detection is more complicated
• Lock managers cooperate for deadlock detection
• Several variants of this approach
• Primary copy
• Majority protocol
• Biased protocol
• Quorum consensus

Primary Copy
• Choose one replica of data item to be the primary copy.
• Site containing the replica is called the primary site for that data
item
• Different data items can have different primary sites
• When a transaction needs to lock a data item Q, it requests a
lock at the primary site of Q.
• Implicitly gets lock on all replicas of the data item
• Benefit
• Concurrency control for replicated data handled similarly to unreplicated
data - simple implementation.
• Drawback
• If the primary site of Q fails, Q is inaccessible even though other sites
containing a replica may be accessible.

Locking Protocols
• Majority Protocol
➢ A local lock manager at each site administers lock and unlock requests
for data items stored at that site.
❑ In case of unreplicated data
• When a transaction wishes to lock an unreplicated data item Q residing at
site Si, a message is sent to Si's lock manager.
• If Q is locked in an incompatible mode, then the request is delayed until
it can be granted.
• When the lock request can be granted, the lock manager sends a
message back to the initiator indicating that the lock request has been
granted.

Majority Protocol (Cont.)


• In case of replicated data
• If Q is replicated at n sites, then a lock request message must be sent to
more than half of the n sites in which Q is stored.
• The transaction does not operate on Q until it has obtained a lock on a
majority of the replicas of Q.
• When writing the data item, transaction performs writes on all replicas.
• Benefit
• Can be used even when some sites are unavailable

• Drawback
• Potential for deadlock even with single item - e.g., each of 3 transactions
may have locks on 1/3rd of the replicas of a data.
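
The majority rule for replicated data can be sketched as below. The function name and the dictionary representation of per-site lock state are assumptions, only exclusive locks are modelled, and instead of waiting the sketch hands back partial locks when no majority is reached, which is a simplification of the protocol described above.

```python
def majority_lock(txn, replica_sites, site_locks):
    """Try to lock item Q for txn at a majority of its replica sites.

    replica_sites: ids of the sites holding a replica of Q.
    site_locks: dict site -> txn currently holding Q there (None if free);
                a site missing from the dict is treated as unavailable.
    Returns the list of sites locked, or None if no majority was obtained.
    """
    needed = len(replica_sites) // 2 + 1  # strictly more than half of the n sites
    acquired = []
    for site in replica_sites:
        if site in site_locks and site_locks[site] is None:
            site_locks[site] = txn
            acquired.append(site)
    if len(acquired) >= needed:
        return acquired
    for site in acquired:                 # no majority: give the locks back
        site_locks[site] = None
    return None

replicas = [1, 2, 3, 4, 5]
locks = {1: None, 2: None, 3: None}       # sites 4 and 5 are unavailable
print(majority_lock("T1", replicas, locks))  # [1, 2, 3]
print(majority_lock("T2", replicas, locks))  # None
```

With three of five replica sites reachable, T1 still obtains a majority, which illustrates the benefit listed above.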


Biased Protocol
• Local lock manager at each site as in majority protocol, however,
requests for shared locks are handled differently than requests for
exclusive locks.
• Shared locks (read locks): when a transaction needs to lock data item Q,
it simply requests a lock on Q from the lock manager at one site containing
a replica of Q.
• Exclusive locks (write locks): when a transaction needs to lock data item
Q, it requests a lock on Q from the lock manager at all sites containing a
replica of Q.
• Advantage - imposes less overhead on read operations.
• Disadvantage - additional overhead on writes
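
The trade-off between the majority and biased protocols can be made concrete by counting how many sites must grant a lock. The helper below is a hypothetical illustration for an item with n replicas; it ignores message retries and failures.

```python
def lock_requests(protocol, mode, n):
    """Number of sites that must grant a lock on an item with n replicas."""
    if protocol == "majority":
        return n // 2 + 1             # more than half, for reads and writes alike
    if protocol == "biased":
        return 1 if mode == "S" else n  # one site for shared, all sites for exclusive
    raise ValueError(f"unknown protocol: {protocol!r}")

for mode in ("S", "X"):
    print(mode, lock_requests("majority", mode, 5), lock_requests("biased", mode, 5))
# S 3 1
# X 3 5
```

For five replicas, a read needs one grant under the biased protocol instead of three, while a write needs five instead of three.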

2 Phase Locking (2PL)

• Under the 2PL protocol, locks are applied and removed in two phases:

- Expanding phase: locks are acquired and no locks are released.

- Shrinking phase: locks are released and no locks are acquired.
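
The two-phase rule can be enforced with a single flag per transaction. This is a minimal sketch, assuming a hypothetical `TwoPhaseTransaction` class that only tracks which items are locked; it does not model lock modes or blocking.

```python
class TwoPhaseTransaction:
    """Tracks one transaction's locks and rejects any acquire after the first release."""

    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False

    def acquire(self, item):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot acquire {item!r} in the shrinking phase")
        self.locks.add(item)

    def release(self, item):
        self.shrinking = True  # the first release ends the expanding phase
        self.locks.discard(item)

t1 = TwoPhaseTransaction("T1")
t1.acquire("x")
t1.acquire("y")      # expanding phase
t1.release("x")      # shrinking phase begins
try:
    t1.acquire("z")  # violates 2PL
except RuntimeError as err:
    print(err)
```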

2 Phase Locking (2PL)


➢Centralized 2PL.
➢Primary copy 2PL.
➢Distributed 2PL.
➢Voting 2PL.


2 Phase Locking (2PL)

• Two types of locks are utilized by the basic protocol: shared and
exclusive locks.

• Refinements of the basic protocol may utilize more lock types.
• Because it uses locks that block processes, 2PL may be subject to
deadlocks that result from the mutual blocking of two or more transactions.

Centralized 2PL


Distributed 2PL


Timestamping
• Timestamp based concurrency-control protocols can be used
in distributed systems
• Each transaction must be given a unique timestamp
• Main problem: how to generate a timestamp in a distributed
fashion
• Each site generates a unique local timestamp using either a logical
counter or the local clock.
• A globally unique timestamp is obtained by concatenating the unique
local timestamp with the unique site identifier.


Timestamping (Cont.)
• A site with a slow clock will assign smaller timestamps
• Still logically correct: serializability is not affected
• But: it puts that site's transactions at a disadvantage
• To fix this problem
• Define within each site Si a logical clock (LCi), which generates the unique
local timestamp
• Require that Si advance its logical clock whenever a request is received from a
transaction Ti with timestamp <x, y> and x is greater than the current value of
LCi.
• In this case, site Si advances its logical clock to the value x + 1.
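
The logical-clock rule can be sketched as follows. The `Site` class is hypothetical; timestamps are modelled as `(local counter, site id)` tuples so that Python's tuple comparison orders them by counter first and uses the site id only to break ties.

```python
class Site:
    """Site Si with a logical clock LCi; timestamps are (local counter, site id) pairs."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.clock = 0

    def new_timestamp(self):
        self.clock += 1
        return (self.clock, self.site_id)  # counter first, so it dominates the ordering

    def receive(self, timestamp):
        """Advance LCi to x + 1 when a request arrives with timestamp <x, y> and x > LCi."""
        x, _ = timestamp
        if x > self.clock:
            self.clock = x + 1

fast, slow = Site(1), Site(2)
for _ in range(5):
    ts = fast.new_timestamp()
print(ts)                    # (5, 1)
print(slow.new_timestamp())  # (1, 2): the slow site lags behind
slow.receive(ts)             # a request from the fast site's transaction arrives
print(slow.new_timestamp())  # (7, 2): the slow site has caught up
```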

Timestamp Ordering

• Timestamp (TS): a number associated with each transaction
• Not necessarily real time
• Can be assigned by a logical counter
• Unique for each transaction
• Should be assigned in an increasing order for each
new transaction


Timestamp Ordering

• Timestamps associated with each database item
• Read timestamp (RTS): the largest timestamp of the transactions
that have read the item so far
• Write timestamp (WTS): the largest timestamp of the transactions
that have written the item so far

• After each successful read/write of object O by transaction T, the
corresponding timestamp is updated
• On a read: RTS(O) = max(RTS(O), TS(T))
• On a write: WTS(O) = max(WTS(O), TS(T))

Timestamp Ordering

• Given a transaction T
• If T wants to read(X)
• If TS(T) < WTS(X) then read is rejected, T has to
abort
• Else, read is accepted and RTS(X) updated.


Timestamp Ordering

• If T wants to write(X)
• If TS(T) < RTS(X) then write is rejected, T has to
abort
• If TS(T) < WTS(X) then write is rejected, T has to
abort
• Else, allow the write, and update WTS(X)
accordingly
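
The read and write rules translate almost directly into code. In this sketch the `Item` class and the integer timestamps are assumptions; a return value of `False` means the operation is rejected and the transaction has to abort.

```python
class Item:
    def __init__(self):
        self.rts = 0  # largest timestamp of any transaction that read the item
        self.wts = 0  # largest timestamp of any transaction that wrote the item

def read(item, ts):
    """Return True if the read is accepted, False if the transaction must abort."""
    if ts < item.wts:
        return False
    item.rts = max(item.rts, ts)
    return True

def write(item, ts):
    """Return True if the write is accepted, False if the transaction must abort."""
    if ts < item.rts or ts < item.wts:
        return False
    item.wts = max(item.wts, ts)
    return True

x = Item()
print(read(x, 5))   # True,  RTS(X) becomes 5
print(write(x, 3))  # False, a younger transaction (TS 5) already read X
print(write(x, 7))  # True,  WTS(X) becomes 7
print(read(x, 6))   # False, X was already overwritten by TS 7
```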


Hybrid

• There are three basic techniques, and each can be used for rw scheduling,
ww scheduling, or both.
• Schedulers can be centralized or distributed.
• Replicated data can be handled in three ways (Do Nothing, Primary Copy,
Voting).
• System R*
Uses a 2PL scheduler for rw and ww synchronization. The schedulers are
distributed at the DMs. Replication is handled by the do-nothing approach.
• Distributed INGRES
INGRES uses primary copy for replication.

New Approaches to Concurrency


• Total Ordering Control
• Total ordering in networking terms describes the property of a network
guaranteeing that all messages are delivered in the same order across all
destinations.
• In combination with the concept of transactions, one can make use of
this property to ensure that transactions are received in the same order at
all sites — called the ORDER CC technique.
• Algorithm
➢ Each transaction is initiated by sending its reads and write predeclares to
the corresponding schedulers as a single atomic action in totally ordered
fashion.
➢ Each scheduler stores the received operation requests in a FIFO-type
queue.
➢ If a read is at the head of the queue, it is immediately executed.
➢ The transaction can now issue the write requests in accordance with the
previously given predeclares.
➢ Upon commit, the committed values are sent in non-ordered fashion to
the schedulers, which replace the corresponding predeclare statements in
the queue with the received committed writes.

Atomic Commitment

• Transaction commit
- consistent termination of transaction is an issue:

• atomic commitment protocols (ACP)


- an important goal of ACP is to minimize the effect of
failures on operational sites’ ability to continue.



Properties of ACP
• AC1: consistent termination
- all sites that reach a decision reach the same one
• AC2: irrevocable decision
- a site cannot reverse its decision after it is made
• AC3: unanimous consent
- everyone must agree before anyone can commit
• AC4: exclusion of trivial protocols
- if no failure and all sites vote yes, decision must be commit
- avoids useless protocol where everyone always decides to abort
• AC5: finite time in decision making
- if failures can be tolerated by ACP, all sites eventually reach a
decision in a finite time


Atomic Commit Protocols


• Irreversible decision
- both commit and abort are irreversible
- if failure occurs before the commit point, then upon recovery, it
must be aborted
- in distributed databases, the task of ACP is to enforce global
atomicity --- unanimity
• Why difficult?
- in the absence of failures, unanimous consensus protocol is rather
easy to achieve
- challenge is to find protocols ensuring atomicity in various types of
failure situations
• Types of failures
- site (node) failures
- communication link failures

Two-Phase Commit Protocol


• Protocol for the coordinator:

• Phase 1: send a "start transaction" message to all participants and
wait for a vote from each participant

• Phase 2: if all the votes are YES, then send COMMIT to all
participants and commit the transaction; else send ABORT
and abort the transaction


Two-Phase Commit Protocol

• Protocol for the participants:

• Phase 1: receive "start T" from the coordinator,
execute the subtransaction and send a vote to the coordinator:
YES to commit
NO to abort
Then wait for the final decision from the coordinator
• Phase 2: receive either COMMIT or ABORT and
process it
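
The coordinator and participant roles can be sketched in a failure-free, in-process form. The `Participant` class and `two_phase_commit` function are hypothetical, the `can_commit` flag stands in for the outcome of the subtransaction, and timeouts, logging and site failures are deliberately left out.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # stands in for the outcome of the subtransaction
        self.state = "initial"

    def vote(self):
        """Phase 1: execute the subtransaction and vote YES or NO."""
        self.state = "ready" if self.can_commit else "aborted"
        return "YES" if self.can_commit else "NO"

    def decide(self, decision):
        """Phase 2: apply the coordinator's final decision."""
        self.state = "committed" if decision == "COMMIT" else "aborted"

def two_phase_commit(participants):
    """Coordinator: collect all votes, then broadcast COMMIT or ABORT."""
    votes = [p.vote() for p in participants]                          # phase 1
    decision = "COMMIT" if all(v == "YES" for v in votes) else "ABORT"
    for p in participants:                                            # phase 2
        p.decide(decision)
    return decision

sites = [Participant("S1"), Participant("S2"), Participant("S3", can_commit=False)]
print(two_phase_commit(sites))    # ABORT
print([p.state for p in sites])   # ['aborted', 'aborted', 'aborted']
```

A single NO vote is enough to abort at every site, which is the unanimous-consent property of atomic commitment.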


Coordinator Selection
• Backup coordinators
• Site which maintains enough information locally to assume the role of
coordinator if the actual coordinator fails
• Executes the same algorithms and maintains the same internal state
information as the actual coordinator
• Allows fast recovery from coordinator failure but involves overhead
during normal processing.

• Election algorithms
• Used to elect a new coordinator in case of failures
• Example: Bully Algorithm - applicable to systems where every site
can send a message to every other site.


Bully Algorithm
• If site Si sends a request that is not answered by the coordinator
within a time interval T, it assumes that the coordinator has failed, and Si
tries to elect itself as the new coordinator.
• Si sends an election message to every site with a higher
identification number; Si then waits for any of these sites to
answer within T.
• If there is no response within T, Si assumes that all sites with a number
greater than i have failed, and Si elects itself the new coordinator.
• If an answer is received, Si begins a time interval T’, waiting to receive
a message that a site with a higher identification number has
been elected.


Bully Algorithm (Cont.)


• If no message is sent within T’, assume the site with a
higher number has failed; Si restarts the algorithm.
• After a failed site recovers, it immediately begins
execution of the same algorithm.
• If there are no active sites with higher numbers, the
recovered site forces all processes with lower numbers
to let it become the coordinator site, even if there is a
currently active coordinator with a lower number.
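
The outcome of the bully algorithm can be sketched without real messaging. In this simplified model the function name and the `alive` set are assumptions: a site "answers" exactly when its id is in `alive`, and the timeouts T and T’ are not simulated.

```python
def bully_election(initiator, alive):
    """Return the id of the site that ends up as coordinator.

    initiator: id of the site Si that noticed the coordinator is not answering.
    alive: set of ids of the sites that are currently up (must include initiator).
    """
    higher = sorted(s for s in alive if s > initiator)
    if not higher:
        return initiator  # nobody with a higher number answered within T
    # A higher-numbered site answered, so it takes over and runs the same algorithm.
    return bully_election(higher[0], alive)

print(bully_election(2, {1, 2, 3, 5}))  # 5
print(bully_election(3, {1, 3}))        # 3
```

Whichever site starts the election, the active site with the highest identification number becomes the coordinator.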



Data Models in DBMS


1. Hierarchical Model
2. Network Model
3. Entity-Relationship Model
4. Relational Model
5. Object-Oriented Data Model
6. Object-Relational Data Model
7. Flat Data Model
8. Semi-Structured Data Model
9. Associative Data Model
10. Context Data Model


DBMS software
• #1) SolarWinds Database Performance Analyzer
• #2) DbVisualizer
• #3) ManageEngine Applications Manager
• #4) Oracle RDBMS
• #5) IBM DB2
• #6) Microsoft SQL Server
• #7) SAP Sybase ASE
• #8) Teradata
• #9) ADABAS
• #10) MySQL
• #11) FileMaker
• #12) Microsoft Access
• #13) Informix
• #14) SQLite
• #15) PostgreSQL
• #16) Amazon RDS
• #17) MongoDB
• #18) Redis
• #19) CouchDB
• #20) Neo4j
• #21) OrientDB
• #22) Couchbase
• #23) Toad
• #24) phpMyAdmin
• #25) SQL Developer
• #26) Sequel PRO
• #27) Robomongo
• #28) Hadoop HDFS
• #29) Cloudera
• #30) MariaDB
• #31) Informix Dynamic Servers
• #32) 4D (4th Dimension)
• #33) Altibase


List of Query Languages


1. SQL (Structured Query Language):
1. Used for relational databases like MySQL, PostgreSQL, Oracle, SQL Server.
2. NoSQL Query Languages:
1. These vary depending on the specific NoSQL database, such as:
1. MongoDB Query Language: Used for MongoDB.
2. CQL (Cassandra Query Language): Used for Apache Cassandra.
3. N1QL (N1 Query Language): Used for Couchbase.
3. SPARQL (SPARQL Protocol and RDF Query Language):
1. Used for querying RDF (Resource Description Framework) data, commonly
associated with semantic web technologies.
4. XQuery:
1. Used for querying XML data.
5. MDX (Multidimensional Expressions):
1. Used for querying OLAP (Online Analytical Processing) databases.
6. DAX (Data Analysis Expressions):
1. Used for formulas and expressions in Microsoft Power BI, Power Pivot, and
SQL Server Analysis Services.
7. GraphQL:
1. A query language for APIs, designed to request the specific data needed.
8. Elasticsearch Query DSL:
1. Used for querying Elasticsearch, a distributed search and analytics engine.
9. LINQ (Language Integrated Query):
1. Used in .NET languages like C# and VB.NET for querying collections,
databases, XML, and more.
10. Cypher:
1. Used for querying graph databases, such as Neo4j.


Transaction Management Algorithms


1. ACID Properties:
1. While not a specific algorithm, the ACID properties (Atomicity, Consistency,
Isolation, Durability) represent a set of principles that guide transaction
management to ensure reliability.
2. Two-Phase Commit (2PC):
1. A protocol ensuring distributed transaction atomicity across multiple
databases or resources. It involves a coordinator and participants and
operates in two phases: prepare and commit.
3. Three-Phase Commit (3PC):
1. An extension of the two-phase commit protocol that adds a third phase
called "pre-commit." This phase reduces the chances of blocking in certain
failure scenarios.
4. Optimistic Concurrency Control:
1. Allows transactions to proceed without locking resources. Conflicts are
resolved at the end of the transaction, and if conflicts occur, the transaction
is rolled back and retried.
5. Pessimistic Concurrency Control:
1. Involves locking resources during a transaction to prevent other transactions
from accessing the same resources concurrently. It ensures consistency but
can lead to performance issues due to contention.

6. Timestamp-Based Concurrency Control:
1. Assigns a unique timestamp to each transaction and uses these timestamps
to determine the order of execution, preventing conflicts and ensuring
isolation.
7. Snapshot Isolation:
1. Each transaction sees a snapshot of the database at the start of its
execution, preventing it from seeing changes made by other transactions
during its execution.
8. Multi-Version Concurrency Control (MVCC):
1. Similar to snapshot isolation, MVCC allows multiple versions of data to
coexist, and each transaction sees a snapshot of the database as it existed at
the start of the transaction.
9. Read Committed Isolation Level:
1. A lower isolation level where a transaction can only read committed data. It
prevents dirty reads but allows non-repeatable reads and phantom reads.
10.Serializable Isolation Level:
1. The highest isolation level, ensuring that transactions appear to be executed
in a serial order, even in a concurrent environment. It prevents dirty reads,
non-repeatable reads, and phantom reads.

Thank You…

Any Questions…???
