Lecture Notes Module 5 BCS403 Database Management System


SCHEME - 2022

Module 5
BCS403-Database Management System

Concurrency Control in Databases

Two-Phase Locking Techniques for Concurrency Control


 Some of the main techniques used to control concurrent execution of transactions are
based on the concept of locking data items.
 A lock is a variable associated with a data item that describes the status of the item with
respect to possible operations that can be applied to it.
 Generally, there is one lock for each data item in the database. Locks are used as a
means of synchronizing the access by concurrent transactions to the database items.

Types of Locks and System Lock Tables


 Several types of locks are used in concurrency control. To introduce locking concepts
gradually, first we discuss binary locks, which are simple but are also too restrictive for
database concurrency control purposes and so are not used much.
 Then we discuss shared/exclusive locks—also known as read/write locks—which provide
more general locking capabilities and are used in database locking schemes.
 We also describe an additional type of lock, called a certify lock, and show how it can be used
to improve the performance of locking protocols.

Binary Locks

 A binary lock can have two states or values: locked and unlocked (or 1 and 0, for
simplicity). A distinct lock is associated with each database item X.
 Two operations, lock_item and unlock_item, are used with binary locking. A
transaction requests access to an item X by first issuing a lock_item(X) operation.
 If LOCK(X) = 1, the transaction is forced to wait. If LOCK(X) = 0, it is set to 1 (the
transaction locks the item) and the transaction is allowed to access item X.

If the simple binary locking scheme described here is used, every transaction must obey the
following rules:

1. A transaction T must issue the operation lock_item(X) before any read_item(X) or
write_item(X) operations are performed in T.
2. A transaction T must issue the operation unlock_item(X) after all read_item(X) and
write_item(X) operations are completed in T.
3. A transaction T will not issue a lock_item(X) operation if it already holds the lock on
item X.
4. A transaction T will not issue an unlock_item(X) operation unless it already holds the
lock on item X.
 These rules can be enforced by the lock manager module of the DBMS. Between the
lock_item(X) and unlock_item(X) operations in transaction T, T is said to hold the lock on
item X.
 At most one transaction can hold the lock on a particular item. Thus no two transactions
can access the same item concurrently.
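
To make the two operations concrete, here is a minimal sketch of a binary lock manager in Python (class and method names are illustrative, not from any particular DBMS): lock_item blocks while LOCK(X) = 1 and sets it to 1 on success; unlock_item resets it and wakes waiting transactions.

import threading

class BinaryLockManager:
    """Minimal sketch of binary locking: one lock per item, states 0/1."""
    def __init__(self):
        self._cond = threading.Condition()
        self._lock_table = {}                      # item -> 1 locked, 0 unlocked

    def lock_item(self, item):
        with self._cond:
            while self._lock_table.get(item, 0) == 1:
                self._cond.wait()                  # LOCK(X) = 1: forced to wait
            self._lock_table[item] = 1             # LOCK(X) = 0: set it to 1

    def unlock_item(self, item):
        with self._cond:
            self._lock_table[item] = 0
            self._cond.notify_all()                # wake transactions waiting on X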

Shared/Exclusive (or Read/Write) Locks

 The preceding binary locking scheme is too restrictive for database items because at
most one transaction can hold a lock on a given item; we should allow several
transactions to access the same item X if they all access X for reading purposes only.
 For this purpose, a different type of lock, called a multiple-mode lock, is used. In this
scheme—called shared/exclusive or read/write locks—there are three locking
operations: read_lock(X), write_lock(X), and unlock(X).
 A lock associated with an item X, LOCK(X), now has three possible states: read-locked,
write-locked, or unlocked.
 A read-locked item is also called share-locked because other transactions are allowed to
read the item, whereas a write-locked item is called exclusive-locked because a single
transaction exclusively holds the lock on the item.


When we use the shared/exclusive locking scheme, the system must enforce the following
rules:

1. A transaction T must issue the operation read_lock(X) or write_lock(X) before any
read_item(X) operation is performed in T.
2. A transaction T must issue the operation write_lock(X) before any write_item(X)
operation is performed in T.
3. A transaction T must issue the operation unlock(X) after all read_item(X) and
write_item(X) operations are completed in T.
4. A transaction T will not issue a read_lock(X) operation if it already holds a read
(shared) lock or a write (exclusive) lock on item X.
5. A transaction T will not issue a write_lock(X) operation if it already holds a read
(shared) lock or a write (exclusive) lock on item X.
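
Likewise, a minimal sketch of a shared/exclusive lock manager that enforces these rules (again with illustrative names): a read lock increments a per-item reader count, while a write lock must wait until the item is completely unlocked.

import threading

class RWLockManager:
    """Sketch of shared/exclusive locking: LOCK(X) is a reader count or a writer flag."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = {}        # item -> number of transactions holding read locks
        self._writer = {}         # item -> True if write-locked

    def read_lock(self, item):
        with self._cond:
            while self._writer.get(item):            # wait while X is write-locked
                self._cond.wait()
            self._readers[item] = self._readers.get(item, 0) + 1

    def write_lock(self, item):
        with self._cond:
            while self._writer.get(item) or self._readers.get(item, 0) > 0:
                self._cond.wait()                    # wait until X is fully unlocked
            self._writer[item] = True

    def unlock(self, item):
        with self._cond:
            if self._writer.get(item):
                self._writer[item] = False           # release exclusive lock
            elif self._readers.get(item, 0) > 0:
                self._readers[item] -= 1             # release one shared lock
            self._cond.notify_all()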

Conversion (Upgrading, Downgrading) of Locks.

 It is desirable to relax conditions 4 and 5 in the preceding list in order to allow lock
conversion; that is, a transaction that already holds a lock on item X is allowed under
certain conditions to convert the lock from one locked state to another.
 For example, a transaction T can issue a read_lock(X) and then later upgrade the lock
by issuing a write_lock(X) operation, provided T is the only transaction holding a read
lock on X.
 It is also possible for a transaction T to issue a write_lock(X) and then later to downgrade
the lock by issuing a read_lock(X) operation.
 Using binary locks or read/write locks in transactions, as described earlier, does not
guarantee serializability of schedules on its own.
 To guarantee serializability, we must follow an additional protocol concerning the
positioning of locking and unlocking operations in every transaction.
 The best-known protocol, two-phase locking, is described in the next section.

Guaranteeing Serializability by Two-Phase Locking


 A transaction is said to follow the two-phase locking protocol if all locking operations
(read_lock, write_lock) precede the first unlock operation in the transaction.
 Such a transaction can be divided into two phases: an expanding or growing (first)
phase, during which new locks on items can be acquired but none can be released; and a
shrinking (second) phase, during which existing locks can be released but no new locks
can be acquired.
 If lock conversion is allowed, then upgrading of locks (from read-locked to write-locked)
must be done during the expanding phase, and downgrading of locks (from write-locked
to read-locked) must be done in the shrinking phase.
 Although the two-phase locking protocol guarantees serializability (that is, every
schedule that is permitted is serializable), it does not permit all possible serializable
schedules (that is, some serializable schedules will be prohibited by the protocol).
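
As a small illustration, the following hypothetical checker tests whether a transaction's sequence of lock operations obeys the two-phase rule, that is, whether every lock operation precedes the first unlock:

def follows_two_phase_locking(operations):
    # operations is a list of strings such as "read_lock(X)" or "unlock(X)".
    shrinking = False
    for op in operations:
        if op.startswith("unlock"):
            shrinking = True                 # the shrinking phase has begun
        elif op.startswith(("read_lock", "write_lock")) and shrinking:
            return False                     # a lock after an unlock violates 2PL
    return True

# Growing phase, then shrinking phase: a valid 2PL transaction.
print(follows_two_phase_locking(
    ["read_lock(Y)", "write_lock(X)", "unlock(Y)", "unlock(X)"]))   # True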


Basic, Conservative, Strict, and Rigorous Two-Phase Locking

 There are a number of variations of two-phase locking (2PL). The technique just
described is known as basic 2PL.
 A variation known as conservative 2PL (or static 2PL) requires a transaction to lock all
the items it accesses before the transaction begins execution, by predeclaring its read-
set and write-set.
 In strict 2PL, a transaction T does not release any of its exclusive (write) locks until
after it commits or aborts; this guarantees strict schedules, which simplify recovery.
 In rigorous 2PL, a transaction T does not release any of its locks (exclusive or shared)
until after it commits or aborts; rigorous 2PL is easier to implement than strict 2PL.
 Usually, the concurrency control subsystem itself is responsible for generating
the read_lock and write_lock requests.

Dealing with Deadlock and Starvation

 Deadlock occurs when each transaction T in a set of two or more transactions is waiting
for some item that is locked by some other transaction T′ in the set.
 But because the other transaction is also waiting, it will never release the lock. A simple
example is shown in Figure 21.5(a), where the two transactions T1′ and T2′ are
deadlocked in a partial schedule; T1′ is in the waiting queue for X, which is locked by T2′,
whereas T2′ is in the waiting queue for Y, which is locked by T1′.

Deadlock Prevention Protocols

 One way to prevent deadlock is to use a deadlock prevention protocol. One deadlock
prevention protocol, which is used in conservative two-phase locking, requires that
every transaction lock all the items it needs in advance (which is generally not a practical
assumption)—if any of the items cannot be obtained, none of the items are locked.
 A number of other deadlock prevention schemes have been proposed that make a
decision about what to do with a transaction involved in a possible deadlock situation.
 Should it be blocked and made to wait or should it be aborted, or should the transaction
preempt and abort another transaction? Some of these techniques use the concept of
transaction timestamp TS(T′), which is a unique identifier assigned to each transaction.
 The timestamps are typically based on the order in which transactions are started;
hence, if transaction T1 starts before transaction T2, then TS(T1) < TS(T2).
 The rules followed by these schemes are:

1. Wait-die: if TS(Ti) < TS(Tj) (Ti is older than Tj), then Ti is allowed to wait; otherwise
(Ti is younger than Tj), abort Ti (Ti dies) and restart it later with the same timestamp.
2. Wound-wait: if TS(Ti) < TS(Tj) (Ti is older than Tj), then abort Tj (Ti wounds Tj) and
restart it later with the same timestamp; otherwise (Ti is younger than Tj), Ti is allowed
to wait.

In both rules, Ti is the transaction that requests an item currently held by Tj. Both schemes
abort the younger of the two transactions, and because an aborted transaction is restarted
with its original timestamp, it eventually becomes the oldest and cannot starve.
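
A sketch of the two rules as a decision function (names are illustrative); note that under both schemes it is always the younger transaction that is aborted:

def resolve_conflict(scheme, ts_requester, ts_holder):
    """Ti (requester) wants an item locked by Tj (holder).
    Returns 'wait', 'abort_requester', or 'abort_holder'.
    Smaller timestamp = older transaction."""
    if scheme == "wait-die":
        # An older requester may wait; a younger requester dies.
        return "wait" if ts_requester < ts_holder else "abort_requester"
    if scheme == "wound-wait":
        # An older requester wounds (aborts) the younger holder; a younger requester waits.
        return "abort_holder" if ts_requester < ts_holder else "wait"
    raise ValueError(scheme)

print(resolve_conflict("wait-die", 1, 2))    # 'wait'  (requester is older)
print(resolve_conflict("wound-wait", 2, 1))  # 'wait'  (requester is younger)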

Deadlock Detection

 An alternative approach to dealing with deadlock is deadlock detection, where the
system checks if a state of deadlock actually exists.
 This solution is attractive if we know there will be little interference among the
transactions—that is, if different transactions will rarely access the same items at the
same time.
 This can happen if the transactions are short and each transaction locks only a few
items, or if the transaction load is light.
 On the other hand, if transactions are long and each transaction uses many items, or if
the transaction load is heavy, it may be advantageous to use a deadlock prevention
scheme.
 A simple way to detect a state of deadlock is for the system to construct and maintain a
wait-for graph. One node is created in the wait-for graph for each transaction that is
currently executing. Whenever a transaction Ti is waiting to lock an item X that is
currently locked by a transaction Tj, a directed edge (Ti → Tj) is created; the edge is
dropped when Tj releases the lock and Ti can proceed. The system is in a state of
deadlock if and only if the wait-for graph has a cycle.
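
Because deadlock corresponds exactly to a cycle in the wait-for graph, detection reduces to cycle finding; here is a minimal depth-first-search sketch (illustrative, not a production detector):

def has_deadlock(wait_for):
    """wait_for maps each transaction Ti to the set of Tj it is waiting for.
    A cycle exists if and only if the system is deadlocked."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {t: WHITE for t in wait_for}

    def visit(t):
        color[t] = GRAY                        # t is on the current DFS path
        for u in wait_for.get(t, ()):
            if color.get(u, WHITE) == GRAY:    # back edge: cycle, hence deadlock
                return True
            if color.get(u, WHITE) == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in list(wait_for))

# T1' waits for X held by T2', and T2' waits for Y held by T1': deadlock.
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))   # True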

Timeouts. Another simple scheme to deal with deadlock is the use of timeouts. In this
method, if a transaction waits for a period longer than a system-defined timeout period, the
system assumes that the transaction may be deadlocked and aborts it, regardless of whether
a deadlock actually exists. This method is practical because of its low overhead and simplicity.

Starvation. Another problem that may occur when we use locking is starvation, which
occurs when a transaction cannot proceed for an indefinite period of time while other
transactions in the system continue normally. One solution is a fair waiting scheme, such as
a first-come-first-served queue, so that transactions lock an item in the order in which they
originally requested it.

Concurrency Control Based on Timestamp Ordering


 The use of locking, combined with the 2PL protocol, guarantees serializability of
schedules.
 The serializable schedules produced by 2PL have their equivalent serial schedules
based on the order in which executing transactions lock the items they acquire.
 If a transaction needs an item that is already locked, it may be forced to wait until the
item is released.
 A different approach to concurrency control involves using transaction timestamps to
order transaction execution for an equivalent serial schedule.
 We discuss how serializability is enforced by ordering conflicting operations in different
transactions based on the transaction timestamps.
Timestamps

 Recall that a timestamp is a unique identifier created by the DBMS to identify a
transaction.
 Timestamps can be generated in several ways. One possibility is to use a counter that is
incremented each time its value is assigned to a transaction.
 The transaction timestamps are numbered 1, 2, 3, … in this scheme.
 Another way to implement timestamps is to use the current date/time value of the
system clock and ensure that no two timestamp values are generated during the same
tick of the clock.
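
Both generation schemes are easy to sketch in Python (function names are illustrative):

import itertools
import time

# Counter-based timestamps: 1, 2, 3, ... assigned when each transaction starts.
_counter = itertools.count(1)

def next_timestamp_counter():
    return next(_counter)

# Clock-based timestamps: a monotonic nanosecond clock makes it very unlikely that
# two transactions share a tick; a real DBMS would still resolve duplicates.
def next_timestamp_clock():
    return time.monotonic_ns()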

The Timestamp Ordering Algorithm for Concurrency Control

 The idea for this scheme is to order the transactions based on their timestamps. A
schedule in which the transactions participate is then serializable, and the only
equivalent serial schedule permitted has the transactions in order of their timestamp
values. This is called timestamp ordering (TO).
 The algorithm allows interleaving of transaction operations, but it must ensure that for
each pair of conflicting operations in the schedule, the order in which the item is
accessed must follow the timestamp order.
 To do this, the algorithm associates with each database item X two timestamp (TS)
values:

1. read_TS(X): the read timestamp of item X is the largest timestamp among all the
timestamps of transactions that have successfully read item X.
2. write_TS(X): the write timestamp of item X is the largest of all the timestamps of
transactions that have successfully written item X.

Basic Timestamp Ordering (TO)

 Whenever some transaction T tries to issue a read_item(X) or a write_item(X) operation,
the basic TO algorithm compares the timestamp of T with read_TS(X) and write_TS(X) to
ensure that the timestamp order of transaction execution is not violated.
 If this order is violated, then transaction T is aborted and resubmitted to the system as a
new transaction with a new timestamp.
 The concurrency control algorithm must check whether conflicting operations violate
the timestamp ordering in the following two cases:

1. Whenever a transaction T issues a write_item(X) operation, the following is checked:
a. If read_TS(X) > TS(T) or if write_TS(X) > TS(T), then abort and roll back T and reject
the operation, because some younger transaction has already read or written the value
of X.
b. If the condition in part a does not occur, then execute the write_item(X) operation of
T and set write_TS(X) to TS(T).
2. Whenever a transaction T issues a read_item(X) operation, the following is checked:
a. If write_TS(X) > TS(T), then abort and roll back T and reject the operation, because
some younger transaction has already written the value of X.
b. If write_TS(X) ≤ TS(T), then execute the read_item(X) operation of T and set
read_TS(X) to the larger of TS(T) and the current read_TS(X).
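
A compact sketch of these two checks (the class layout is illustrative; returning False stands for aborting and restarting T with a new timestamp):

class BasicTO:
    """Basic timestamp-ordering checks; items maps X -> [read_TS(X), write_TS(X)]."""
    def __init__(self):
        self.items = {}

    def _ts(self, x):
        return self.items.setdefault(x, [0, 0])

    def read_item(self, ts_t, x):
        read_ts, write_ts = self._ts(x)
        if write_ts > ts_t:                      # a younger transaction wrote X
            return False                         # abort and roll back T
        self._ts(x)[0] = max(read_ts, ts_t)      # read_TS(X) := max(read_TS(X), TS(T))
        return True

    def write_item(self, ts_t, x):
        read_ts, write_ts = self._ts(x)
        if read_ts > ts_t or write_ts > ts_t:    # timestamp order violated
            return False                         # abort and roll back T
        self._ts(x)[1] = ts_t                    # write_TS(X) := TS(T)
        return True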

Strict Timestamp Ordering (TO)

 A variation of basic TO called strict TO ensures that the schedules are both strict (for
easy recoverability) and (conflict) serializable.
 In this variation, a transaction T that issues a read_item(X) or write_item(X) such that
TS(T) > write_TS(X) has its read or write operation delayed until the transaction T′ that
wrote the value of X (hence TS(T′) = write_TS(X)) has committed or aborted.

Thomas’s Write Rule

 A modification of the basic TO algorithm, known as Thomas’s write rule, does not
enforce conflict serializability, but it rejects fewer write operations by modifying the
checks for the write_item(X) operation as follows:

1. If read_TS(X) > TS(T), then abort and roll back T and reject the operation.
2. If write_TS(X) > TS(T), then do not execute the write operation but continue
processing; the write is outdated, since some younger transaction has already written a
more recent value of X, and it can safely be ignored.
3. If neither the condition in part 1 nor the condition in part 2 occurs, then execute the
write_item(X) operation of T and set write_TS(X) to TS(T).
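
The modified check is small enough to sketch in full (an illustrative function, not a particular DBMS's implementation):

def thomas_write(ts_t, read_ts_x, write_ts_x):
    """Thomas's write rule for write_item(X).
    Returns 'abort', 'ignore' (obsolete write skipped), or the new write_TS(X)."""
    if read_ts_x > ts_t:       # a younger transaction already read X
        return "abort"
    if write_ts_x > ts_t:      # a younger transaction already wrote X
        return "ignore"        # the outdated write is simply skipped
    return ts_t                # perform the write; write_TS(X) := TS(T)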

Multiversion Concurrency Control Techniques


 These protocols for concurrency control keep copies of the old values of a data item
when the item is updated (written); they are known as multiversion concurrency control
because several versions (values) of an item are kept by the system.
 An obvious drawback of multiversion techniques is that more storage is needed to
maintain multiple versions of the database items. In some cases, older versions can be
kept in a temporary store.
 Some database applications may require older versions to be kept to maintain a history
of the changes of data item values.
 In such cases, there is no additional storage penalty for multiversion techniques, since
older versions are already maintained.
 When a transaction writes an item, it writes a new version and the old version(s) of the
item are retained. Some multiversion concurrency control algorithms use the concept of
view serializability rather than conflict serializability.

Multiversion Technique Based on Timestamp Ordering


In this method, several versions X1, X2, … , Xk of each data item X are maintained. For each
version, the value of version Xi and the following two timestamps associated with version Xi
are kept:

1. read_TS(Xi): the read timestamp of Xi is the largest of all the timestamps of
transactions that have successfully read version Xi.
2. write_TS(Xi): the write timestamp of Xi is the timestamp of the transaction that wrote
the value of version Xi.

Whenever a transaction T is allowed to execute a write_item(X) operation, a new version
Xk+1 of item X is created, with both the write_TS(Xk+1) and the read_TS(Xk+1) set to TS(T).

Correspondingly, when a transaction T is allowed to read the value of version Xi, the value
of read_TS(Xi) is set to the larger of the current read_TS(Xi) and TS(T).

To ensure serializability, the following two rules are used:

1. If transaction T issues a write_item(X) operation, and version i of X has the highest
write_TS(Xi) of all versions of X that is also less than or equal to TS(T), and
read_TS(Xi) > TS(T), then abort and roll back transaction T; otherwise, create a new
version Xj of X with read_TS(Xj) = write_TS(Xj) = TS(T).
2. If transaction T issues a read_item(X) operation, find the version i of X that has the
highest write_TS(Xi) of all versions of X that is also less than or equal to TS(T); then
return the value of Xi to transaction T, and set the value of read_TS(Xi) to the larger of
TS(T) and the current read_TS(Xi). Under this rule, a read_item(X) is always successful.
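
A sketch of the two rules with an illustrative version-record layout (value, read_TS, write_TS per version):

class MVTO:
    """Multiversion timestamp ordering; versions maps X -> list of
    [value, read_TS, write_TS] records."""
    def __init__(self):
        self.versions = {}

    def _latest_before(self, x, ts_t):
        # Version with the highest write_TS(Xi) <= TS(T).
        cands = [v for v in self.versions.get(x, []) if v[2] <= ts_t]
        return max(cands, key=lambda v: v[2]) if cands else None

    def read_item(self, ts_t, x):
        v = self._latest_before(x, ts_t)
        if v is None:
            return None                  # no version visible to this transaction
        v[1] = max(v[1], ts_t)           # read_TS(Xi) := max(read_TS(Xi), TS(T))
        return v[0]                      # reads never cause an abort

    def write_item(self, ts_t, x, value):
        v = self._latest_before(x, ts_t)
        if v is not None and v[1] > ts_t:
            return False                 # a younger transaction read Xi: abort T
        self.versions.setdefault(x, []).append([value, ts_t, ts_t])
        return True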

Multiversion Two-Phase Locking Using Certify Locks


 In this multiple-mode locking scheme, there are three locking modes for an item— read,
write, and certify—instead of just the two modes (read, write) discussed previously.
 Hence, the state of LOCK(X) for an item X can be one of read-locked, write-locked,
certify-locked, or unlocked.
 In the standard locking scheme, with only read and write locks, a write lock is an
exclusive lock.
 We can describe the relationship between read and write locks in the standard scheme
by means of the lock compatibility table shown in Figure 21.6(a).
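
For reference, the standard compatibility tables (the content of Figure 21.6, reconstructed here) are shown below; 'yes' means a lock request in the column's mode can be granted while another transaction holds the row's mode. In the multiversion 2PL scheme, read and write locks are compatible because a write creates a new version while reads continue on the last committed version; before committing, the writer must upgrade its write locks to certify locks, which are incompatible with all other modes.

Read/write locks only (Figure 21.6(a)):

          Read   Write
Read      yes    no
Write     no     no

Read/write/certify locks (Figure 21.6(b)):

          Read   Write   Certify
Read      yes    yes     no
Write     yes    no      no
Certify   no     no      no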

Validation (Optimistic) Techniques and Snapshot Isolation Concurrency Control

 In all the concurrency control techniques we have discussed so far, a certain degree of
checking is done before a database operation can be executed.
 For example, in locking, a check is done to determine whether the item being accessed is
locked; in timestamp ordering, the transaction timestamp is checked against the read
and write timestamps of the item.
 In optimistic concurrency control techniques, also known as validation or certification
techniques, no checking is done while the transaction is executing. Several concurrency
control methods are based on the validation technique.
 The implementations of these concurrency control methods can utilize a combination of
the concepts from validation-based techniques and versioning techniques, as well as
utilizing timestamps.

 Some of these methods may suffer from anomalies that can violate serializability, but
because they generally have lower overhead than 2PL, they have been implemented in
several relational DBMSs.

Validation-Based (Optimistic) Concurrency Control


 In this scheme, updates in the transaction are not applied directly to the database
items on disk until the transaction reaches its end and is validated.
 During transaction execution, all updates are applied to local copies of the data items
that are kept for the transaction. At the end of transaction execution, a validation phase
checks whether any of the transaction’s updates violate serializability.
 There are three phases for this concurrency control protocol:

1. Read phase: the transaction can read values of committed data items from the
database; however, updates are applied only to local copies (versions) of the data items
kept in the transaction workspace.
2. Validation phase: checking is performed to ensure that serializability will not be
violated if the transaction updates are applied to the database.
3. Write phase: if the validation phase is successful, the transaction updates are applied
to the database; otherwise, the updates are discarded and the transaction is restarted.

 The idea behind optimistic concurrency control is to do all the checks at once; hence,
transaction execution proceeds with a minimum of overhead until the validation phase
is reached. If there is little interference among transactions, most will be validated
successfully.
 The optimistic protocol we describe uses transaction timestamps and also requires that
the write_sets and read_sets of the transactions be kept by the system.
 The validation phase for Ti checks that, for each transaction Tj that is either recently
committed or is in its validation phase, one of the following conditions holds:

1. Transaction Tj completes its write phase before Ti starts its read phase.
2. Ti starts its write phase after Tj completes its write phase, and the read_set of Ti has
no items in common with the write_set of Tj.
3. Both the read_set and write_set of Ti have no items in common with the write_set of
Tj, and Tj completes its read phase before Ti completes its read phase.

If none of the three conditions holds for some Tj, validation fails and Ti is aborted and
restarted.
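
A sketch of this validation test, assuming hypothetical bookkeeping fields (read_start, read_end, write_start, write_end, plus read_set and write_set) recorded for each transaction:

def validate(ti, others):
    """Return True if Ti passes validation against every recently committed
    (or currently validating) transaction Tj in `others`."""
    for tj in others:
        if tj["write_end"] < ti["read_start"]:
            continue   # condition 1: Tj finished writing before Ti began reading
        if tj["write_end"] < ti["write_start"] and \
           not (ti["read_set"] & tj["write_set"]):
            continue   # condition 2: writes serialized, no read-write overlap
        if tj["read_end"] < ti["read_end"] and \
           not ((ti["read_set"] | ti["write_set"]) & tj["write_set"]):
            continue   # condition 3: Ti does not touch anything Tj wrote
        return False   # Ti conflicts with this Tj: abort and restart Ti
    return True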

Concurrency Control Based on Snapshot Isolation


 The basic definition of snapshot isolation is that a transaction sees the data items that it
reads based on the committed values of the items in the database snapshot (or
database state) when the transaction starts.
 Snapshot isolation will ensure that the phantom record problem does not occur, since
the database transaction, or, in some cases, the database statement, will only see the
records that were committed in the database at the time the transaction started.
 Any insertions, deletions, or updates that occur after the transaction starts will not be
seen by the transaction. In addition, snapshot isolation does not allow the problems of
dirty read and nonrepeatable read to occur.
 However, certain anomalies that violate serializability can occur under snapshot
isolation (the write skew anomaly is the best-known example). Although these anomalies
are rare, they are very difficult to detect and may result in an inconsistent or corrupted
database.
 When writes do occur, the system will have to keep track of older versions of the
updated items in a temporary version store (sometimes known as tempstore), with the
timestamps of when the version was created.
 This is necessary so that a transaction that started before the item was written can still
read the value (version) of the item that was in the database snapshot when the
transaction started (a sketch of this visibility rule appears at the end of this section).
 Variations of this method have been used in several commercial and open source
DBMSs, including Oracle and PostgreSQL.
 If the users require guaranteed serializability, then the problems with anomalies that violate
serializability will have to be solved by the programmers/software engineers by analyzing the set
of transactions to determine which types of anomalies can occur, and adding checks that do not
permit these anomalies.
 This can place a burden on the software developers when compared to the DBMS enforcing
serializability in all cases.
 Variations of snapshot isolation (SI) techniques, known as serializable snapshot isolation (SSI),
have been proposed and implemented in some of the DBMSs that use SI as their primary
concurrency control method.
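
A minimal sketch of the visibility rule behind snapshot reads, assuming an illustrative version store that keeps (commit timestamp, value) pairs for each item:

def visible_value(versions, tx_start_ts):
    """From one item's version store, return the value committed most
    recently at or before the transaction's start timestamp."""
    committed = [(ts, v) for ts, v in versions if ts <= tx_start_ts]
    if not committed:
        return None                   # no version visible in this snapshot
    return max(committed)[1]          # latest commit_ts <= start wins

# A transaction starting at ts=15 sees the version committed at ts=10,
# not the later update committed at ts=20.
print(visible_value([(10, "A"), (20, "B")], 15))   # 'A'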


Granularity of Data Items and Multiple Granularity Locking

The particular choice of data item type can affect the performance of concurrency control and
recovery.

Granularity Level Considerations for Locking

 The size of data items is often called the data item granularity. Fine granularity refers to
small item sizes, whereas coarse granularity refers to large item sizes. Several tradeoffs
must be considered in choosing the data item size.
 Suppose, for example, that a transaction T holds a lock on a whole disk block; another
transaction S that needs a different record stored on the same block must wait. If the
data item size was a single record instead of a disk block, transaction S would be able to
proceed, because it would be locking a different data item (record).
 On the other hand, the smaller the data item size is, the larger the number of items in
the database becomes. Because every item is associated with a lock, the system will have
a larger number of active locks to be handled by the lock manager.
 For timestamps, storage is required for the read_TS and write_TS for each data item,
and there will be similar overhead for handling a large number of items.
 Given the above tradeoffs, an obvious question can be asked: What is the best item
size? The answer is that it depends on the types of transactions involved.
 If a typical transaction accesses a small number of records, it is advantageous to have
the data item granularity be one record.

Multiple Granularity Level Locking

 Since the best granularity size depends on the given transaction, it seems appropriate
that a database system should support multiple levels of granularity, where the
granularity level can be adjusted dynamically for various mixes of transactions.
 Figure 21.7 shows a simple granularity hierarchy with a database containing two files,
each file containing several disk pages, and each page containing several records.
 This can be used to illustrate a multiple granularity level 2PL protocol, with
shared/exclusive locking modes, where a lock can be requested at any level.


 Suppose transaction T1 wants to update all the records in file f1, and T1 requests and is
granted an exclusive lock for f1. Then all of f1’s pages (p11 through p1n)—and the
records contained on those pages—are locked in exclusive mode.
 To make multiple granularity level locking practical, additional types of locks, called
intention locks, are needed.
 There are three types of intention locks:

1. Intention-shared (IS): indicates that one or more shared locks will be requested on
some descendant node(s).
2. Intention-exclusive (IX): indicates that one or more exclusive locks will be requested
on some descendant node(s).
3. Shared-intention-exclusive (SIX): indicates that the current node is locked in shared
mode but one or more exclusive locks will be requested on some descendant node(s).

 The compatibility table of the three intention locks, and the actual shared and exclusive
locks, is shown in Figure 21.8.
 In addition to the three types of intention locks, an appropriate locking protocol must be
used. The multiple granularity locking (MGL) protocol consists of the following rules:

1. The lock compatibility (based on Figure 21.8) must be adhered to.
2. The root of the tree must be locked first, in any mode.
3. A node N can be locked by a transaction T in S or IS mode only if the parent of node N
is already locked by T in either IS or IX mode.
4. A node N can be locked by T in X, IX, or SIX mode only if the parent of node N is
already locked by T in either IX or SIX mode.
5. T can lock a node only if it has not unlocked any node (to enforce the 2PL protocol).
6. T can unlock a node, N, only if none of the children of node N are currently locked
by T.
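
For reference, the standard MGL compatibility matrix (the content of Figure 21.8, reconstructed here) is shown below; 'yes' means a requested lock in the column's mode is compatible with a lock already held by another transaction in the row's mode on the same node:

        IS    IX    S     SIX   X
IS      yes   yes   yes   yes   no
IX      yes   yes   no    no    no
S       yes   no    yes   no    no
SIX     yes   no    no    no    no
X       no    no    no    no    no

For example, T1 updating all of file f1 locks the database node in IX mode and f1 in X mode, while a concurrent transaction that only reads records of file f2 locks the database node in IS mode, which is compatible with T1's IX lock.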
