Transactions and concurrency control mechanisms in database management system

Transaction Processing
Introduction
 The concept of transaction provides a mechanism for describing logical units of
database processing.
 Transaction processing systems are systems with large databases and
hundreds of concurrent users executing database transactions
 Examples
• airline reservations
• banking
• credit card processing
• online retail purchasing
• stock markets, supermarket checkouts, and many other applications
 These systems require high availability and fast response time for hundreds of
concurrent users.
 A transaction is typically implemented by a computer program, which includes
database commands such as retrievals, insertions, deletions, and updates.

Topics Covered
 Basic concepts and theory of transaction processing systems
- definition, properties and characteristics
 Concurrency control problem
- occurs when multiple transactions submitted by various users
interfere with one another in a way that produces incorrect results

Introduction to Transaction Processing
 One criterion for classifying a database system is according to the
number of users who can use the system concurrently
Single-User versus Multiuser Systems
 A DBMS is
• single-user
- at most one user at a time can use the system
- e.g., a personal computer system
• multiuser
- many users can use the system and hence access the database
concurrently
- e.g., an airline reservation database

 Concurrent access is possible because of multiprogramming, which lets the
operating system run multiple processes at the same time
 Concurrent execution of processes can be achieved by:
• interleaved execution
• parallel processing
 Multiprogramming operating systems execute some commands
from one process, then suspend that process and execute some
commands from the next process, and so on
 A process is resumed at the point where it was suspended whenever
it gets its turn to use the CPU again
 Hence, concurrent execution of processes is actually interleaved, as
illustrated in Figure 21.1

 Figure 21.1 shows two processes, A and B, executing concurrently in an
interleaved fashion
 Interleaving keeps the CPU busy when a process requires an input or
output (I/O) operation, such as reading a block from disk
 The CPU is switched to execute another process rather than remaining
idle during I/O time
 Interleaving also prevents a long process from delaying other
processes.

 If the computer system has multiple hardware processors (CPUs),
parallel processing of multiple processes is possible, as illustrated by
processes C and D in Figure 21.1
 Most of the theory concerning concurrency control in databases is
developed in terms of interleaved concurrency
 In a multiuser DBMS, the stored data items are the primary resources
that may be accessed concurrently by interactive users or application
programs, which are constantly retrieving information from and
modifying the database.

Transactions, Database Items, Read and Write Operations, and
DBMS Buffers
Transaction
 A transaction is an executing program that forms a logical unit of database
processing
 It includes one or more database access operations such as insertion,
deletion, modification, or retrieval.
 It can be embedded within an application program, delimited by begin
transaction and end transaction statements, or specified interactively
via a high-level query language such as SQL
 Transactions that do not update the database are known as read-only
transactions.
 Transactions that do update the database are known as read-write
transactions.

 A database is basically represented as a collection of named data
items
 The size of a data item is called its granularity.
 A data item can be a database record, but it can also be a larger unit
such as a whole disk block, or even a smaller unit such as an
individual field (attribute) value of some record in the database
 Each data item has a unique name
 Basic DB access operations that a transaction can include are:
• read_item(X): Reads a DB item named X into a program variable.
• write_item(X): Writes the value of a program variable into the DB
item named X

Why Concurrency Control Is Needed
 Several problems can occur when concurrent transactions execute in an
uncontrolled manner
 Example:
• We consider an airline reservation database
• A record is stored for each airline flight; it includes the number of
reserved seats among other information.
• Types of problems we may encounter:
1. The Lost Update Problem
2. The Temporary Update (or Dirty Read) Problem
3. The Incorrect Summary Problem
4. The Unrepeatable Read Problem

 Transaction T1
• transfers N reservations from one flight whose number of reserved
seats is stored in the database item named X to another flight whose
number of reserved seats is stored in the database item named Y.
 Transaction T2
• reserves M seats on the first flight (X)

1. The Lost Update Problem
 occurs when two transactions that access the same DB items have their
operations interleaved in a way that makes the value of some DB item
incorrect
 Suppose that transactions T1 and T2 are submitted at approximately the
same time, and suppose that their operations are interleaved as shown in
Figure below
 Final value of item X is incorrect because T2 reads the value of X before T1
changes it in the database, and hence the updated value resulting from T1 is lost.

 For example:
X = 80 at the start (there were 80 reservations on the flight)
N = 5 (T1 transfers 5 seat reservations from the flight corresponding
to X to the flight corresponding to Y)
M = 4 (T2 reserves 4 seats on X)
The final result should be X = 79.
 With the interleaving of operations shown in the figure, the result is X = 84,
because the update in T1 that removed the five seats from X was lost.
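
The arithmetic above can be replayed in a few lines of Python. This is only a sketch: the dictionary db and the local variables t1_x and t2_x (standing in for each transaction's program variables) are illustrative assumptions, and only item X is modelled. The interleaving is the one described in the text, where T2 reads X before T1 writes it.

```python
def run_serial(x, n, m):
    """T1 runs to completion, then T2 (a serial execution)."""
    db = {"X": x}
    t1_x = db["X"]          # T1: read_item(X)
    db["X"] = t1_x - n      # T1: X := X - N; write_item(X)
    t2_x = db["X"]          # T2: read_item(X) sees T1's update
    db["X"] = t2_x + m      # T2: X := X + M; write_item(X)
    return db["X"]


def run_lost_update(x, n, m):
    """T2 reads X before T1 has written it back."""
    db = {"X": x}
    t1_x = db["X"]          # T1: read_item(X)
    t1_x -= n               # T1: X := X - N (local copy only)
    t2_x = db["X"]          # T2: read_item(X) still sees the old value
    t2_x += m               # T2: X := X + M (local copy only)
    db["X"] = t1_x          # T1: write_item(X)
    db["X"] = t2_x          # T2: write_item(X) overwrites T1's update
    return db["X"]


print(run_serial(80, 5, 4))       # 79
print(run_lost_update(80, 5, 4))  # 84
```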

2. The Temporary Update (or Dirty Read) Problem
• occurs when one transaction updates a database item and then the
transaction fails for some reason
• Meanwhile the updated item is accessed by another transaction
before it is changed back to its original value
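
A minimal sketch of this situation, assuming a plain dictionary as the database and a saved copy of the old value as the means of undoing T1 (both are illustrative assumptions, not a DBMS mechanism):

```python
def dirty_read_demo():
    db = {"X": 80}
    old_x = db["X"]        # T1 remembers the original value of X
    db["X"] = old_x - 5    # T1: write_item(X), not yet committed
    seen_by_t2 = db["X"]   # T2: read_item(X) reads the temporary value
    db["X"] = old_x        # T1 fails; X is changed back to its original value
    return seen_by_t2, db["X"]


print(dirty_read_demo())   # (75, 80): T2 used a value that never officially existed
```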

3. The Incorrect Summary Problem
• If one transaction is calculating an aggregate summary function on a
number of db items while other transactions are updating some of
these items, the aggregate function may calculate some values before
they are updated and others after they are updated.
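
The effect can be sketched as follows. The item names A, X, and Y, their starting values, and the transfer of 5 seats are assumptions chosen for illustration; the summing transaction reads X after it was updated and Y before it was updated.

```python
def incorrect_summary_demo():
    db = {"A": 10, "X": 80, "Y": 60}
    n = 5
    total = 0
    total += db["A"]    # summary transaction reads A
    db["X"] -= n        # updating transaction subtracts N from X
    total += db["X"]    # summary reads X after the update
    total += db["Y"]    # summary reads Y before the update
    db["Y"] += n        # updating transaction adds N to Y
    return total, sum(db.values())


print(incorrect_summary_demo())   # (145, 150): the summary is off by N
```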

4. The Unrepeatable Read Problem
 Transaction T reads the same item twice and gets different values on
each read, since the item was modified by another transaction T'
between the two reads.
 For example, during an airline reservation transaction, a customer
inquires about seat availability on several flights
 When the customer decides on a particular flight, the transaction then
reads the number of seats on that flight a second time before
completing the reservation, and it may end up reading a different
value for the item.
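
A short sketch of the two reads, assuming a dictionary as the database and an intervening reservation of 4 seats by another transaction (both values are illustrative):

```python
def unrepeatable_read_demo():
    db = {"X": 80}
    first_read = db["X"]    # T reads X while the customer is browsing
    db["X"] += 4            # another transaction T' updates X in between
    second_read = db["X"]   # T reads X again before completing the reservation
    return first_read, second_read


print(unrepeatable_read_demo())   # (80, 84)
```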

Why Recovery Is Needed
 Whenever a transaction is submitted to a DBMS for execution, the
system is responsible for making sure that either
1. All the operations in the transaction are completed successfully and
their effect is recorded permanently in the database or
2. The transaction does not have any effect on the database or any
other transactions
 In the first case, the transaction is said to be committed, whereas in the
second case, the transaction is aborted
 If a transaction fails after executing some of its operations but before
executing all of them, the operations already executed must be undone
and have no lasting effect.
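
One way to picture "undoing" is to record the old value of each item before changing it. The sketch below assumes this simple in-memory undo list; the function name run_atomically and the representation of operations as (item, update function) pairs are hypothetical and not part of any DBMS interface.

```python
def run_atomically(db, operations):
    """Apply each (item, update) step; restore old values if any step fails."""
    undo_log = []
    try:
        for item, update in operations:
            undo_log.append((item, db[item]))   # remember the old value first
            db[item] = update(db[item])
    except Exception:
        # Undo in reverse order so the database ends up as it started.
        for item, old_value in reversed(undo_log):
            db[item] = old_value
        return "aborted"
    return "committed"


db = {"X": 80, "Y": 60}
outcome = run_atomically(db, [("X", lambda v: v - 5), ("Y", lambda v: v // 0)])
print(outcome, db)   # aborted {'X': 80, 'Y': 60}
```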

Types of failures:
1. A computer failure (system crash):
• A hardware, software, or network error occurs in the computer
system during transaction execution
• Hardware crashes are usually media failures—for example, main
memory failure.
2. A transaction or system error:
• Some operation in the transaction may cause it to fail, such as integer
overflow or division by zero
• It may also occur because of erroneous parameter values

3. Local errors or exception conditions detected by the transaction:
• During transaction execution, certain conditions may occur that
necessitate cancellation of the transaction
• For example, data for the transaction may not be found
4. Concurrency control enforcement:
• The concurrency control may decide to abort a transaction because it
violates serializability or several transactions are in a state of deadlock
5. Disk failure:
• Some disk blocks may lose their data because of a read or write
malfunction or because of a disk read/write head crash.
6. Physical problems and catastrophes:
• refers to an endless list of problems that includes power or air-
conditioning failure, fire, theft, overwriting disks or tapes by mistake

 Failures of types 1, 2, 3, and 4 are more common than those of types 5
or 6.
 Whenever a failure of type 1 through 4 occurs, the system must keep
sufficient information to quickly recover from the failure.
 Disk failure or other catastrophic failures of type 5 or 6 do not happen
frequently; if they do occur, recovery is a major task.

Transaction and System Concepts
Transaction States and Additional Operations
 A transaction is an atomic unit of work that should either be completed in
its entirety or not done at all. For recovery purposes, the system keeps
track of when each transaction starts, terminates, and commits or aborts.
• BEGIN_TRANSACTION: marks the beginning of transaction execution
• READ or WRITE: specify read or write operations on the database items
that are executed as part of a transaction
• END_TRANSACTION: specifies that READ and WRITE transaction
operations have ended and marks the end of transaction execution
• COMMIT_TRANSACTION: signals a successful end of the transaction so
that any changes (updates) executed by the transaction can be safely
committed to the database and will not be undone
• ROLLBACK: signals that the transaction has ended unsuccessfully, so that
any changes or effects that the transaction may have applied to the
database must be undone

Figure: State transition diagram illustrating the states for transaction execution

 A transaction goes into active state immediately after it starts execution
and can execute read and write operations.
 When the transaction ends, it moves to the partially committed state.
 At this point additional checks are done to see whether the transaction can be
committed. If these checks are successful, the transaction is said
to have reached its commit point and enters the committed state. All its
changes are recorded permanently in the database.
 A transaction can go to the failed state if one of the checks fails or if the
transaction is aborted during its active state. The transaction may then
have to be rolled back to undo the effect of its write operations.
 Terminated state corresponds to the transaction leaving the system. All
the information about the transaction is removed from system tables.
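
The state transitions described above can be written down as a small table. This is a sketch: the event names (read_write, end_transaction, commit, abort, terminate) and the function run_events are assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    TERMINATED = "terminated"


# (current state, event) -> next state
TRANSITIONS = {
    (State.ACTIVE, "read_write"): State.ACTIVE,
    (State.ACTIVE, "end_transaction"): State.PARTIALLY_COMMITTED,
    (State.ACTIVE, "abort"): State.FAILED,
    (State.PARTIALLY_COMMITTED, "commit"): State.COMMITTED,
    (State.PARTIALLY_COMMITTED, "abort"): State.FAILED,
    (State.COMMITTED, "terminate"): State.TERMINATED,
    (State.FAILED, "terminate"): State.TERMINATED,
}


def run_events(events, state=State.ACTIVE):
    """Follow a sequence of events; reject any move the diagram does not allow."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not allowed in state {state.value!r}")
        state = TRANSITIONS[key]
    return state


print(run_events(["read_write", "end_transaction", "commit", "terminate"]).value)  # terminated
print(run_events(["read_write", "abort"]).value)                                   # failed
```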

Desirable Properties of Transactions
 Transactions should possess several properties, often called the ACID
properties
A Atomicity: a transaction is an atomic unit of processing and it is either
performed entirely or not at all.
C Consistency Preservation: a transaction should be consistency preserving
that is it must take the database from one consistent state to another.
I Isolation/Independence: A transaction should appear as though it is being
executed in isolation from other transactions, even though many
transactions are executed concurrently.
D Durability (or Permanency): if a transaction changes the database and is
committed, the changes must never be lost because of any failure.

 The atomicity property requires that we execute a transaction to
completion. It is the responsibility of the transaction recovery
subsystem of a DBMS to ensure atomicity.
 The preservation of consistency is generally considered to be the
responsibility of the programmers who write the database programs
or of the DBMS module that enforces integrity constraints.
 The isolation property is enforced by the concurrency control
subsystem of the DBMS. If no transaction makes its
updates (write operations) visible to other transactions until it is
committed, one form of isolation is enforced that solves the
temporary update problem and eliminates cascading rollbacks
 Durability is the responsibility of recovery subsystem.

Schedule
 schedule (or history): the order of execution of operations from all the
transactions
 Example:
Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y)

Conflicting operations in a schedule
 Two operations in a schedule are said to conflict if they satisfy all three of
the following conditions:
(1) they belong to different transactions
(2) they access the same item X and
(3) at least one of the operations is a write_item(X)
 Conflicting operations in Sa:
• r1(X) conflicts with w2(X) (read-write conflict)
• r2(X) conflicts with w1(X) (read-write conflict)
• w1(X) conflicts with w2(X) (write-write conflict)
• r1(X) does not conflict with r2(X), since both are reads
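
The three conditions can be checked mechanically. The sketch below assumes the shorthand notation used for Sa (r or w, a transaction number, and an item in parentheses); the helper names parse_schedule and conflicting_pairs are hypothetical.

```python
import re
from itertools import combinations


def parse_schedule(text):
    """Turn 'r1(X); w2(X)' into [('r', 1, 'X'), ('w', 2, 'X')]."""
    return [(op, int(txn), item)
            for op, txn, item in re.findall(r"([rw])(\d+)\((\w+)\)", text)]


def conflicting_pairs(schedule):
    """Return every pair (a, b), with a before b, that satisfies all three conditions."""
    pairs = []
    for a, b in combinations(schedule, 2):
        (op_a, txn_a, item_a), (op_b, txn_b, item_b) = a, b
        different_transactions = txn_a != txn_b
        same_item = item_a == item_b
        at_least_one_write = "w" in (op_a, op_b)
        if different_transactions and same_item and at_least_one_write:
            pairs.append((a, b))
    return pairs


sa = parse_schedule("r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y)")
for a, b in conflicting_pairs(sa):
    print(a, "conflicts with", b)
# ('r', 1, 'X') conflicts with ('w', 2, 'X')
# ('r', 2, 'X') conflicts with ('w', 1, 'X')
# ('w', 1, 'X') conflicts with ('w', 2, 'X')
```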

Characterizing Schedules Based on Serializability
 schedules that are always considered to be correct when concurrent
transactions are executing are known as serializable schedules
 Suppose that two users—for example, two airline reservations
agents—submit to the DBMS transactions T1 and T2 at approximately
the same time. If no interleaving of operations is permitted, there are
only two possible outcomes:
1. Execute all the operations of transaction T1 (in sequence) followed
by all the operations of transaction T2 (in sequence).
2. Execute all the operations of transaction T2 (in sequence) followed
by all the operations of transaction T1 (in sequence).


 Serial schedule:
– A schedule S is serial if, for every transaction T participating in the
schedule, all the operations of T are executed consecutively in the
schedule.
• Otherwise, the schedule is called a nonserial schedule.
 Serial schedules limit concurrency by prohibiting interleaving of
operations
 Serial schedules are therefore generally considered unacceptable in practice
 Instead, we determine which nonserial schedules are equivalent to a serial
schedule and allow those to occur
 Example: X = 90, Y = 90, N = 3, and M = 2
Expected result: X = 89 and Y = 93
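
The expected result can be checked by running the two serial orders. The sketch assumes the T1 and T2 introduced earlier (T1 moves N reservations from X to Y, T2 reserves M seats on X); the function names are illustrative.

```python
def t1(db, n):
    db["X"] -= n    # remove N reservations from the first flight
    db["Y"] += n    # add them to the second flight


def t2(db, m):
    db["X"] += m    # reserve M seats on the first flight


def serial_result(order, n=3, m=2):
    """Run the transactions one after another in the given order."""
    db = {"X": 90, "Y": 90}
    for name in order:
        if name == "T1":
            t1(db, n)
        else:
            t2(db, m)
    return db


print(serial_result(["T1", "T2"]))   # {'X': 89, 'Y': 93}
print(serial_result(["T2", "T1"]))   # {'X': 89, 'Y': 93}
```

For these two transactions both serial orders give the same result, which is why any interleaving that ends with different values can be flagged as incorrect.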

 Serializable schedule:
– A schedule S is serializable if it is equivalent to some serial schedule
of the same n transactions.
 Being serializable implies that the schedule is a correct schedule.
– It will leave the database in a consistent state.
– The interleaving is appropriate and will result in a state as if the
transactions were serially executed, yet will achieve efficiency due to
concurrent execution.

Testing conflict serializability of a Schedule S
1. For each transaction Ti participating in schedule S, create a node labeled
Ti in the precedence graph.
2. For each case in S where Tj executes a read_item(X) after Ti executes a
write_item(X), create an edge (Ti → Tj) in the precedence graph.
3. For each case in S where Tj executes a write_item(X) after Ti executes a
read_item(X), create an edge (Ti → Tj) in the precedence graph.
4. For each case in S where Tj executes a write_item(X) after Ti executes a
write_item(X), create an edge (Ti → Tj) in the precedence graph.
5. The schedule S is serializable if and only if the precedence graph has no
cycles.
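
The steps above can be sketched in Python. The schedule notation (r1(X), w2(X), ...) and all function names are assumptions made for illustration; steps 2 to 4 collapse into one test because each of them is "same item, different transactions, at least one write". Cycle detection uses a depth-first search.

```python
import re


def parse_schedule(text):
    """Turn 'r1(X); w2(X)' into [('r', 1, 'X'), ('w', 2, 'X')]."""
    return [(op, int(txn), item)
            for op, txn, item in re.findall(r"([rw])(\d+)\((\w+)\)", text)]


def precedence_graph(schedule):
    """Steps 2-4: add edge Ti -> Tj for each conflicting pair where Ti's operation comes first."""
    edges = set()
    for i, (op_i, txn_i, item_i) in enumerate(schedule):
        for op_j, txn_j, item_j in schedule[i + 1:]:
            if txn_i != txn_j and item_i == item_j and "w" in (op_i, op_j):
                edges.add((txn_i, txn_j))
    return edges


def has_cycle(nodes, edges):
    """Depth-first search; reaching a node that is still in progress means a cycle."""
    adjacency = {node: [] for node in nodes}
    for source, target in edges:
        adjacency[source].append(target)
    unvisited, in_progress, done = 0, 1, 2
    status = {node: unvisited for node in nodes}

    def visit(node):
        status[node] = in_progress
        for successor in adjacency[node]:
            if status[successor] == in_progress:
                return True
            if status[successor] == unvisited and visit(successor):
                return True
        status[node] = done
        return False

    return any(status[node] == unvisited and visit(node) for node in nodes)


def is_conflict_serializable(text):
    schedule = parse_schedule(text)
    nodes = {txn for _, txn, _ in schedule}          # step 1: one node per transaction
    return not has_cycle(nodes, precedence_graph(schedule))   # step 5


print(is_conflict_serializable("r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y)"))  # False
print(is_conflict_serializable("r1(X); w1(X); r2(X); w2(X)"))                # True
```

The first schedule is the Sa used earlier: it produces both T1 → T2 and T2 → T1, so the graph has a cycle.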

Figure: Constructing the precedence graphs for schedules A to D to test for
conflict serializability.
(a) Precedence graph for serial schedule A.
(b) Precedence graph for serial schedule B.
(c) Precedence graph for schedule C (not serializable).
(d) Precedence graph for schedule D (serializable, equivalent to schedule A).

Figure: Another example of serializability testing.
(a) The READ and WRITE operations of three transactions T1, T2, and T3.
(b) Schedule E.
(c) Schedule F.

Figure: Precedence graph for schedule E

Figure: Precedence graph for schedule F

• Purpose of Concurrency Control
– To enforce Isolation (through mutual exclusion) among conflicting
transactions.
– To preserve database consistency through consistency preserving
execution of transactions.
– To resolve read-write and write-write conflicts.
• Example:
– In concurrent execution environment if T1 conflicts with T2 over a
data item A, then the existing concurrency control decides if T1 or
T2 should get the A and if the other transaction is rolled-back or
waits.

Two-Phase Locking Techniques for Concurrency Control
• The concept of locking data items is one of the main techniques used
for controlling the concurrent execution of transactions.
• A lock is a variable associated with a data item in the database.
Generally there is a lock for each data item in the database.
• A lock describes the status of the data item with respect to possible
operations that can be applied to that item.
• It is used for synchronizing the access by concurrent transactions to
the database items.
• A transaction locks an object before using it
• When an object is locked by another transaction, the requesting
transaction must wait

Types of Locks and System Lock Tables
1. Binary Locks
 A binary lock can have two states or values: locked and unlocked (or 1
and 0).
 If the value of the lock on X is 1, item X cannot be accessed by a
database operation that requests the item
 If the value of the lock on X is 0, the item can be accessed when
requested, and the lock value is changed to 1
 We refer to the current value (or state) of the lock associated with
item X as lock(X).

 Two operations, lock_item and unlock_item, are used with binary
locking.
 A transaction requests access to an item X by first issuing a lock_item(X)
operation
 If LOCK(X) = 1, the transaction is forced to wait.
 If LOCK(X) = 0, it is set to 1 (the transaction locks the item) and the
transaction is allowed to access item X
 When the transaction is through using the item, it issues an
unlock_item(X) operation, which sets LOCK(X) back to 0 (unlocks the
item) so that X may be accessed by other transactions
 Hence, a binary lock enforces mutual exclusion on the data item.

lock_item(X):
B:  if LOCK(X) = 0                  (* item is unlocked *)
        then LOCK(X) ← 1            (* lock the item *)
    else
        begin
        wait (until LOCK(X) = 0
            and the lock manager wakes up the transaction);
        go to B
        end;

unlock_item(X):
    LOCK(X) ← 0;                    (* unlock the item *)
    if any transactions are waiting
        then wakeup one of the waiting transactions;

 The lock_item and unlock_item operations must be implemented
as indivisible units; that is, no interleaving should be allowed once a
lock or unlock operation is started until the operation terminates or
the transaction waits
 The wait command within the lock_item(X) operation is usually
implemented by putting the transaction in a waiting queue for item
X until X is unlocked and the transaction can be granted access to it
 Other transactions that also want to access X are placed in the
same queue. Hence, the wait command is considered to be outside
the lock_item operation.
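
A rough Python counterpart of lock_item and unlock_item is sketched below. It is an assumption-laden illustration, not how a DBMS lock manager is built: the class name BinaryLockTable is hypothetical, transactions are identified by plain strings, and a single threading.Condition stands in for both the indivisibility requirement and the waiting queue. Because one condition is shared by all items, every waiter is woken on unlock and re-checks its own item, rather than waking exactly one.

```python
import threading


class BinaryLockTable:
    def __init__(self):
        # Holding the condition's internal mutex makes each operation indivisible.
        self._cond = threading.Condition()
        # item name -> transaction holding it; a missing key means LOCK(X) = 0.
        self._holder = {}

    def lock_item(self, item, txn):
        with self._cond:
            if self._holder.get(item) == txn:
                raise RuntimeError(f"{txn} already holds the lock on {item}")
            while item in self._holder:     # LOCK(X) = 1: wait, then re-check ("go to B")
                self._cond.wait()
            self._holder[item] = txn        # LOCK(X) <- 1

    def unlock_item(self, item, txn):
        with self._cond:
            if self._holder.get(item) != txn:
                raise RuntimeError(f"{txn} does not hold the lock on {item}")
            del self._holder[item]          # LOCK(X) <- 0
            self._cond.notify_all()         # wake the waiting transactions


table = BinaryLockTable()
events = []

def t2():
    table.lock_item("X", "T2")              # blocks while T1 holds X
    events.append("T2 locked X")
    table.unlock_item("X", "T2")

table.lock_item("X", "T1")
worker = threading.Thread(target=t2)
worker.start()
events.append("T1 unlocking X")
table.unlock_item("X", "T1")
worker.join()
print(events)   # ['T1 unlocking X', 'T2 locked X']
```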

 It is quite simple to implement a binary lock; all that is needed is a binary-
valued variable, LOCK, associated with each data item X in the database
 In its simplest form, each lock can be a record with three fields:
<Data_item_name, LOCK, Locking_transaction> plus a queue for
transactions that are waiting to access the item
 If the simple binary locking scheme described here is used, every
transaction must obey the following rules:
1. A transaction T must issue the operation lock_item(X) before any
read_item(X) or write_item(X) operations are performed in T.
2. A transaction T must issue the operation unlock_item(X) after all
read_item(X) and write_item(X) operations are completed in T.
3. A transaction T will not issue a lock_item(X) operation if it already
holds the lock on item X.
4. A transaction T will not issue an unlock_item(X) operation unless it
already holds the lock on item X.

2. Shared/Exclusive (or Read/Write) Locks
 The binary locking scheme is too restrictive for database items because at
most one transaction can hold a lock on a given item
 We should allow several transactions to access the same item X if they all
access X for reading purposes only
 However, if a transaction is to write an item X, it must have exclusive access to X
 For this purpose, a different type of lock called a multiple-mode lock is
used
 In this scheme—called shared/exclusive or read/write locks—there are
three locking operations: read_lock(X), write_lock(X), and unlock(X).

 A read-locked item is also called shared-locked because other transactions
are allowed to read the item, whereas a write-locked item is called
exclusive-locked because a single transaction exclusively holds the lock on
the item
 One method to implement read/write locks is to
- keep track of the number of transactions that hold a shared (read) lock
on an item in the lock table
- give each record in the lock table four fields:
<Data_item_name, LOCK, No_of_reads, Locking_transaction(s)>.
 If LOCK(X) = write-locked, the value of Locking_transaction(s) is a single
transaction that holds the exclusive (write) lock on X
 If LOCK(X) = read-locked, the value of Locking_transaction(s) is a list of one or
more transactions that hold the shared (read) lock on X.

 When we use the shared/exclusive locking scheme, the system must
enforce the following rules:
1. A transaction T must issue the operation read_lock(X) or write_lock(X)
before any read_item(X) operation is performed in T.
2. A transaction T must issue the operation write_lock(X) before any
write_item(X) operation is performed in T.
3. A transaction T must issue the operation unlock(X) after all
read_item(X) and write_item(X) operations are completed in T.
4. A transaction T will not issue a read_lock(X) operation if it already
holds a read (shared) lock or a write (exclusive) lock on item X.

Conversion of Locks
 A transaction that already holds a lock on item X is allowed under
certain conditions to convert the lock from one locked state to another
 For example, it is possible for a transaction T to issue a read_lock(X) and
then later to upgrade the lock by issuing a write_lock(X) operation
- If T is the only transaction holding a read lock on X at the time it
issues the write_lock(X) operation, the lock can be upgraded;
otherwise, the transaction must wait
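
The lock table and the upgrade rule can be sketched as follows. To keep the example short it does not block: each request returns True if the lock is granted and False if the transaction would have to wait. The class name LockTable, the string transaction identifiers, and the dictionary layout are assumptions; No_of_reads corresponds to the size of the holders set.

```python
class LockTable:
    def __init__(self):
        # item -> {"mode": "read" or "write", "holders": set of transaction ids}
        self._entries = {}

    def read_lock(self, item, txn):
        entry = self._entries.get(item)
        if entry is None:
            self._entries[item] = {"mode": "read", "holders": {txn}}
            return True
        if entry["mode"] == "read":
            entry["holders"].add(txn)          # several readers may share the item
            return True
        return entry["holders"] == {txn}       # write-locked: only its holder may go on

    def write_lock(self, item, txn):
        entry = self._entries.get(item)
        if entry is None:
            self._entries[item] = {"mode": "write", "holders": {txn}}
            return True
        if entry["holders"] == {txn}:
            entry["mode"] = "write"            # upgrade: txn is the only holder
            return True
        return False                           # other holders exist: must wait

    def unlock(self, item, txn):
        entry = self._entries.get(item)
        if entry is None or txn not in entry["holders"]:
            raise RuntimeError(f"{txn} holds no lock on {item}")
        entry["holders"].discard(txn)
        if not entry["holders"]:
            del self._entries[item]            # nobody left: item is unlocked


table = LockTable()
print(table.read_lock("X", "T1"))    # True
print(table.read_lock("X", "T2"))    # True, shared with T1
print(table.write_lock("X", "T1"))   # False, T2 also holds a read lock
table.unlock("X", "T2")
print(table.write_lock("X", "T1"))   # True, upgrade succeeds
print(table.read_lock("X", "T2"))    # False, X is now exclusively locked
```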

Guaranteeing Serializability by Two-Phase Locking
 A transaction is said to follow the two-phase locking protocol if all
locking operations (read_lock, write_lock) precede the first unlock
operation in the transaction
 Such a transaction can be divided into two phases:
1. Expanding or growing (first) phase, during which new locks on
items can be acquired but none can be released
2. Shrinking (second) phase, during which existing locks can be
released but no new locks can be acquired
 If lock conversion is allowed, then upgrading of locks (from read-
locked to write-locked) must be done during the expanding phase,
and downgrading of locks (from write-locked to read-locked) must be
done in the shrinking phase.
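
Whether a single transaction obeys the protocol can be checked by scanning its operations in order. The sketch below assumes operations are written as strings such as read_lock(Y); the function name and the two sample operation lists are made up for illustration, and lock conversion (upgrade or downgrade) is not modelled.

```python
def follows_two_phase_locking(operations):
    """Return True if no lock is acquired after the first unlock."""
    shrinking = False
    for operation in operations:
        name = operation.split("(")[0]
        if name == "unlock":
            shrinking = True                  # the shrinking phase has begun
        elif name in ("read_lock", "write_lock") and shrinking:
            return False                      # new lock during the shrinking phase
    return True


not_two_phase = ["read_lock(Y)", "read_item(Y)", "unlock(Y)",
                 "write_lock(X)", "read_item(X)", "write_item(X)", "unlock(X)"]
two_phase = ["read_lock(Y)", "read_item(Y)", "write_lock(X)", "unlock(Y)",
             "read_item(X)", "write_item(X)", "unlock(X)"]

print(follows_two_phase_locking(not_two_phase))   # False
print(follows_two_phase_locking(two_phase))       # True
```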

 Transactions T1 and T2 in (a) do not follow the two-phase locking
protocol because the write_lock(X) operation follows the unlock(Y)
operation in T1, and similarly the write_lock(Y) operation follows the
unlock(X) operation in T2
 (b) Results of possible serial schedules of T1 and T2
 (c) A nonserializable schedule S that uses locks

 If we enforce two-phase locking, the transactions can be rewritten as T1’ and
T2’ as shown below
 If every transaction in a schedule follows the two-phase locking protocol,
the schedule is guaranteed to be serializable
 Two-phase locking may limit the amount of concurrency that can occur in a
schedule
 Some serializable schedules will be prohibited by two-phase locking protocol

Variations of Two-Phase Locking
 Basic 2PL
• Technique described previously
 Conservative (static) 2PL
• Requires a transaction to lock all the items it accesses before the
transaction begins execution by predeclaring read-set and write-set
• It is a deadlock-free protocol
 Strict 2PL
• guarantees strict schedules
• Transaction does not release exclusive locks until after it commits or
aborts
• no other transaction can read or write an item that is written by T
unless T has committed, leading to a strict schedule for recoverability
• Strict 2PL is not deadlock-free

 Rigorous 2PL
• guarantees strict schedules
• Transaction does not release any locks until after it commits or aborts
• easier to implement than strict 2PL
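
The difference between the strict and rigorous release rules can be sketched as a check over one transaction's operations. This only tests when locks are released relative to commit or abort; it does not re-check the two-phase rule itself. The string notation for operations and the function name are assumptions.

```python
def release_rules(operations):
    """Return (strict, rigorous) for one transaction's list of operations."""
    held = {}            # item -> "read" or "write"
    finished = False     # becomes True at commit or abort
    strict = rigorous = True
    for operation in operations:
        name, _, rest = operation.partition("(")
        item = rest.rstrip(")")
        if name == "read_lock":
            held.setdefault(item, "read")
        elif name == "write_lock":
            held[item] = "write"
        elif name in ("commit", "abort"):
            finished = True
        elif name == "unlock":
            mode = held.pop(item, None)
            if not finished:
                rigorous = False             # any early release breaks the rigorous rule
                if mode == "write":
                    strict = False           # early release of an exclusive lock breaks strict
    return strict, rigorous


print(release_rules(["read_lock(Y)", "write_lock(X)", "unlock(Y)", "commit", "unlock(X)"]))
# (True, False)
print(release_rules(["read_lock(Y)", "write_lock(X)", "commit", "unlock(Y)", "unlock(X)"]))
# (True, True)
```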
