IT 220 Unit 6 Transaction Processing and Concurrency Control and Recovery Transaction Management
Transaction Processing,
Concurrency Control and
Recovery Techniques
Topics
• Introduction to Transaction Processing
• Transaction and System Concepts
• Desirable Properties of Transactions
• Serializable Schedule
• Two-Phase Locking
• Timestamp Ordering Concurrency Control Techniques
• Recovery Concepts
• NO-UNDO/REDO
• Recovery Based on Deferred Update
• Recovery Technique Based on Immediate Update
• Shadow Paging
• Database Backup and Recovery from Catastrophic Failures.
Transaction
A transaction is a set of operations that takes the database from one
consistent state to another consistent state.
A transaction reads values from the database or writes values to the database.
A read operation (select) does not change the database, but a write
operation (insert, update, delete) does.
A transaction brings the database from the image that existed before
the transaction occurred (the Before Image, or BFIM) to the
image that exists after the transaction occurred (the After
Image, or AFIM).
Transaction Example
Suppose account X holds Rs. 50000.
Transaction T1 withdraws Rs. 1000 from X:
Read (X)
X := X - 1000;
Write (X)
After transaction T1, X holds Rs. 49000.
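The withdrawal above can be sketched in a few lines of Python. This is only an illustration: the dictionary `database` and the function `withdraw` are hypothetical names standing in for a real DBMS, and the copies of the dictionary play the role of the before and after images.

```python
# Hypothetical in-memory "database": one balance per account name.
database = {"X": 50000}

def withdraw(db, account, amount):
    """Run T1: Read(X); X := X - amount; Write(X)."""
    before_image = dict(db)       # BFIM: state before the transaction
    balance = db[account]         # Read (X)
    balance = balance - amount    # X := X - 1000
    db[account] = balance         # Write (X)
    after_image = dict(db)        # AFIM: state after the transaction
    return before_image, after_image

bfim, afim = withdraw(database, "X", 1000)
print(bfim, afim)
```

The two returned dictionaries differ only in the balance of X, which is exactly the change the transaction made.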
Introduction to Transaction Processing
• Transaction processing refers to the process of
completing a task or a set of tasks in a specific order,
ensuring that all steps are executed correctly and
without errors.
• In computing, transaction processing is the technique
of grouping a set of database operations into a single
unit of work, known as a transaction, to ensure data
integrity and consistency in the event of a failure or
error.
• This can include database operations, financial
transactions, and other types of data processing.
Transaction and System Concepts
• Transaction and system concepts are fundamental elements of transaction
processing systems. These concepts are used to ensure the consistency,
integrity, and durability of data in the event of system failures or errors.
• A transaction is a unit of work that is executed in a database management
system (DBMS) and is typically composed of one or more database
operations, such as inserting, updating, or deleting data. Transactions are
used to ensure the consistency and integrity of data in a database by
ensuring that all operations within a transaction are completed successfully,
or that none of them are completed if an error occurs.
• System concepts, on the other hand, refer to the overall architecture and
design of a computer system, including the hardware, software, and
communication protocols used to manage and process data. These
concepts are important in the design and implementation of large-scale,
distributed systems such as databases, as they provide a framework for
ensuring the scalability, security, and reliability
Desirable Properties of Transactions
• The desirable properties of transactions are a set of
characteristics that a transaction processing system should have
in order to ensure the consistency, integrity, and durability of
data.
• These properties are often referred to as the ACID properties:
• Atomicity
• Ensures that all operations within the transaction are completed, or none are.
• If a failure occurs, the entire transaction is rolled back and the database returns to its state before the transaction began.
• Consistency
• Ensures that the database remains in a consistent state before and after the
transaction is executed.
• Isolation
• Ensures that the operations within a transaction are isolated from the operations of
other transactions.
• Durability
• Ensures that the changes made during a transaction are permanent and survive any subsequent failures.
Properties of Transaction (ACID)
Atomicity
Either all of the operations in the transaction execute
successfully or none of them do.
If every operation executes, the transaction commits; if
any one operation fails, the transaction is rolled
back to the state before it started.
Example
We have two bank accounts, account ‘A’ and
account ‘B’. If A transfers Rs. 100 to B,
the following transaction occurs, shown here as two parts, T1 and T2.
                     T1               T2
Before transaction   A = 1000 Rs.     B = 1500 Rs.
                     Read (A)         Read (B)
                     A := A - 100;    B := B + 100;
                     Write (A)        Write (B)
After transaction    A = 900 Rs.      B = 1600 Rs.
If the transaction fails after T1 completes but before T2 completes (say,
after Write (A) but before Write (B)), then the amount has been deducted from A but
not added to B. This results in an inconsistent database state.
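A minimal sketch of the same transfer, using Python's standard `sqlite3` module, shows how a rollback avoids the half-finished state. The table name `account`, the function names and the `fail_midway` flag are assumptions made for this illustration; the only library behavior relied on is that a `sqlite3` connection used as a context manager commits on success and rolls back when an exception escapes the block.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 1500)])
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one atomic unit of work."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                # Simulated failure after Write (A) but before Write (B)
                raise RuntimeError("simulated failure")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

failed = transfer(conn, "A", "B", 100, fail_midway=True)
after_failure = balances(conn)    # the deduction from A was rolled back
succeeded = transfer(conn, "A", "B", 100)
after_success = balances(conn)
```

After the failed attempt both balances are unchanged; after the successful one A holds 900 and B holds 1600, so the total stays at 2500 in both cases.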
Consistency
Consistency refers to the correctness of the database.
The database should be consistent before and after the execution of a
transaction.
Executing a transaction in isolation preserves the consistency of the
database.
From the above example:
Before transaction: total amount is A+B (1000+1500=2500)
After transaction: total amount is A+B (900+1600=2500)
The unchanged total shows that the database is consistent. Inconsistency occurs if T1
completes but T2 does not.
Isolation
• A property of a transaction that ensures that the operations within a transaction are
isolated from the operations of other transactions.
• Even though multiple transactions may execute concurrently, the system guarantees
that, for every pair of transactions Ti and Tj , it appears to Ti that either Tj finished
execution before Ti started or Tj started execution after Ti finished. Thus, each
transaction is unaware of other transactions executing concurrently in the system.
• If multiple transactions are executing concurrently and trying to access a sharable
resource at the same time, the system should create an ordering in their execution so
that they should not create any anomaly in the value stored at the sharable resource.
• A transaction should be isolated from other transactions, meaning
that the execution of one transaction should not affect the execution
of other transactions. This prevents concurrent transactions from
interfering with each other's intermediate results and keeps the database in a consistent state.
Isolation Example
Initially X = 500 and Y = 500. T2 starts after T1's Write (X) and finishes before T1's Write (Y).
T1                 T2
Read (X)
X := X * 100
Write (X)
                   Read (X)
                   Read (Y)
                   Z := X + Y
                   Write (Z)
Read (Y)
Y := Y - 100
Write (Y)
Z := X + Y
Write (Z)
T1 computes Z = 500 * 100 + (500 - 100) = 50400.
T2 computes Z = 500 * 100 + 500 = 50500.
This results in database inconsistency, due to a loss of 100 Rupees. Hence,
transactions must take place in isolation, and changes should be
visible to other transactions only after they have been committed.
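The two sums in this example (50400 for T1 and 50500 for T2) can be reproduced with a small simulation. It is a sketch under one assumption: T2 runs after T1 has written X but before T1 has written Y. The function name `simulate` and its flag are hypothetical.

```python
def simulate(interleaved):
    """Return (sum seen by T1, sum seen by T2) for X = Y = 500."""
    db = {"X": 500, "Y": 500}

    # T1: Read (X); X := X * 100; Write (X)
    db["X"] = db["X"] * 100

    t2_sum = None
    if interleaved:
        # T2 runs in the middle of T1: it sees the new X but the old Y
        t2_sum = db["X"] + db["Y"]

    # T1 continues: Read (Y); Y := Y - 100; Write (Y); Z := X + Y
    db["Y"] = db["Y"] - 100
    t1_sum = db["X"] + db["Y"]

    if not interleaved:
        # T2 runs only after T1 has finished, so it sees T1's final state
        t2_sum = db["X"] + db["Y"]

    return t1_sum, t2_sum

print(simulate(interleaved=True))    # (50400, 50500)
print(simulate(interleaved=False))   # (50400, 50400)
```

When T2 waits for T1 to finish, both transactions agree on the sum, which is the behavior isolation is meant to guarantee.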
SQL specifies three phenomena that can
occur if proper isolation is not maintained:
Dirty Read: T1 modifies X, which is then read by T2 before T1 terminates;
if T1 aborts, T2 has read a value that never existed in the database.
Non-repeatable (fuzzy) Read: T1 reads X; T2 then modifies or deletes
X and commits; T1 tries to read X again but reads a different value
or cannot find it.
Phantom: T1 searches the database according to a predicate P while
T2 inserts new tuples that satisfy P; if T1 repeats the search, it sees a different set of tuples.
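A dirty read can be illustrated with a toy model in which uncommitted writes are visible to every reader. The dictionaries `db` and `uncommitted` and the values 100 and 999 are made up for this sketch; a real DBMS does not expose its state this way.

```python
db = {"X": 100}      # committed state
uncommitted = {}     # T1's pending writes, visible here because there is no isolation

# T1 modifies X but has not committed yet
uncommitted["X"] = 999

# T2 reads X and sees T1's uncommitted value: a dirty read
dirty_value = uncommitted.get("X", db["X"])

# T1 aborts, so its pending write is discarded
uncommitted.clear()

# T2 reads X again and now sees the committed value
committed_value = uncommitted.get("X", db["X"])

print(dirty_value, committed_value)
```

T2's first read returned a value that was never committed, which is the situation the dirty read phenomenon describes.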
Durability
After a transaction completes successfully, the changes it has made to
the database persist, even if there are system failures.
Lock compatibility (TRUE = the requested lock can be granted)
                      Existing Lock
Lock to be granted    Shared    Exclusive
Shared                TRUE      FALSE
Exclusive             FALSE     FALSE
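The compatibility rule can be written as a lookup table. The sketch below assumes the standard shared/exclusive semantics, in which an exclusive lock is compatible with no other lock; the names `COMPATIBLE` and `can_grant` are hypothetical.

```python
# (requested mode, existing mode) -> can the request be granted?
# "S" = shared lock, "X" = exclusive lock
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

def can_grant(requested, held_modes):
    """A request is granted only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(requested, held)] for held in held_modes)

print(can_grant("S", ["S", "S"]))   # True: readers can share an item
print(can_grant("X", ["S"]))        # False: a writer must wait for readers
print(can_grant("X", []))           # True: no lock is held on the item
```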
A = A - 100;
Write A;
Lock-X (B); (Exclusive Lock, we want to both read B’s value and modify it)
Read B;
B = B + 100;
Write B;
Read A;
Temp = A * 0.1;
Lock-X (C); (Exclusive Lock, we want to both read C’s value and modify it)
Read C;
C = C + Temp;
Write C;
[Figure: two interleaved schedules of the operations above on A, B, C and Temp, one labeled "Wrong Schedule" and one labeled "Right Schedule"]
Two Phase Locking Protocol
2PL defines the rules for how a transaction acquires locks on data items and how
it releases them.
2PL assumes that a transaction passes through exactly two phases:
Growing Phase: the transaction may acquire locks but cannot release any lock.
Shrinking Phase: the transaction may release locks but cannot acquire any new lock.
The lock point is the moment at which the transaction moves from the growing
phase to the shrinking phase, that is, when it acquires its last lock.
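The two-phase rule can be enforced with a single flag, as in the following sketch. The class `TwoPhaseTransaction` is a hypothetical illustration: it only tracks which locks one transaction holds and does not implement a lock manager or waiting.

```python
class TwoPhaseTransaction:
    """Tracks one transaction's locks and rejects any lock after the first unlock."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False   # False = growing phase, True = shrinking phase

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot lock {item} in the shrinking phase"
            )
        self.held.add(item)

    def unlock(self, item):
        self.held.remove(item)
        self.shrinking = True    # the first release ends the growing phase

t1 = TwoPhaseTransaction("T1")
t1.lock("A")
t1.lock("B")      # growing phase: acquiring is allowed
t1.unlock("A")    # shrinking phase begins
try:
    t1.lock("C")  # violates 2PL
except RuntimeError as err:
    print(err)
```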
Two versions of 2PL
Strict 2PL: In this protocol, a transaction may release its shared
locks after the lock point has been reached, but it cannot release any
of its exclusive locks until the transaction commits. This protocol
produces cascadeless schedules.
Cascading rollback is a typical problem in concurrent
schedules: when a transaction aborts, every transaction that has read its uncommitted writes must be rolled back as well.
Strict 2PL
Using Strict Two Phase Locking Protocol, Cascading
Rollback can be prevented. In Strict Two Phase
Locking Protocol a transaction cannot release any of
its acquired exclusive locks until the transaction
commits.
Rigorous 2PL
In Rigorous Two Phase Locking Protocol, a transaction is not allowed to
release any lock (either shared or exclusive) until it commits. This
means that until the transaction commits, other transactions may
acquire a shared lock on a data item on which the uncommitted
transaction holds a shared lock, but cannot acquire any lock on a data
item on which the uncommitted transaction holds an exclusive lock.
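The difference between the variants comes down to which locks may be released before commit. The following sketch encodes that rule; the function name and the protocol labels "basic", "strict" and "rigorous" are hypothetical names chosen for this illustration.

```python
def may_release_before_commit(protocol, mode, past_lock_point):
    """Return True if a lock of `mode` ("S" or "X") may be released before commit."""
    if protocol == "basic":
        # Basic 2PL: any lock may be released once the lock point has passed
        return past_lock_point
    if protocol == "strict":
        # Strict 2PL: only shared locks may go early; exclusive locks wait for commit
        return mode == "S" and past_lock_point
    if protocol == "rigorous":
        # Rigorous 2PL: every lock is held until commit
        return False
    raise ValueError(f"unknown protocol: {protocol}")

print(may_release_before_commit("strict", "S", past_lock_point=True))    # True
print(may_release_before_commit("strict", "X", past_lock_point=True))    # False
print(may_release_before_commit("rigorous", "S", past_lock_point=True))  # False
```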
Timestamp-Based Protocol
The main idea of this protocol is to order the transactions by
their timestamps. Each transaction T receives a unique timestamp TS (T) when it starts.
• W-timestamp (Q): the largest timestamp of any transaction that has
successfully written data item Q.
• R-timestamp (Q): the largest timestamp of any transaction that has
successfully read data item Q.
• These two timestamps are updated each time a successful read/write
operation is performed on the data item Q.
How should timestamps be used?
The timestamp ordering protocol ensures that any pair of conflicting
read/write operations will be executed in their respective timestamp order.
This is an alternative solution to using locks.
For Read operations:
1. If TS (T) < W-timestamp (Q), then the transaction T is trying to read a value of
data item Q which has already been overwritten by a younger transaction.
Hence the value which T wanted to read from Q no longer exists,
and T is rolled back.
2. If TS (T) >= W-timestamp (Q), then the current value of data item Q was written
by T itself or by an older transaction. Hence T is allowed to read the value of Q, and the
R-timestamp of Q is set to the larger of its current value and TS (T).
For Write operations:
1. If TS (T) < R-timestamp (Q), then a younger transaction has already
read the value of data item Q that T is trying to replace, so T's write
arrives too late. In this case T is rolled back.
2. Else if TS (T) < W-timestamp (Q), then a younger transaction has already
written data item Q, so T is trying to write an obsolete value. In this case too, T is
rolled back.
3. Otherwise the system executes the write and sets the
W-timestamp of Q to TS (T).
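The read and write rules above translate almost directly into code. The sketch below is a simplified, single-threaded illustration: the names `DataItem`, `Rollback`, `ts_read` and `ts_write` are hypothetical, timestamps are plain integers, and the R-timestamp is advanced with `max()` so that it never moves backwards.

```python
class Rollback(Exception):
    """Raised when the protocol rejects an operation and T must be rolled back."""

class DataItem:
    def __init__(self, value):
        self.value = value
        self.r_ts = 0   # R-timestamp (Q)
        self.w_ts = 0   # W-timestamp (Q)

def ts_read(ts, q):
    """Read rule: reject if a younger transaction has already overwritten Q."""
    if ts < q.w_ts:
        raise Rollback(f"read by TS={ts} rejected, W-timestamp is {q.w_ts}")
    q.r_ts = max(q.r_ts, ts)
    return q.value

def ts_write(ts, q, value):
    """Write rule: reject if a younger transaction has already read or written Q."""
    if ts < q.r_ts:
        raise Rollback(f"write by TS={ts} rejected, R-timestamp is {q.r_ts}")
    if ts < q.w_ts:
        raise Rollback(f"write by TS={ts} rejected, W-timestamp is {q.w_ts}")
    q.value = value
    q.w_ts = ts

q = DataItem(100)
ts_write(5, q, 200)        # accepted: W-timestamp becomes 5
print(ts_read(7, q))       # accepted: prints 200, R-timestamp becomes 7
try:
    ts_write(6, q, 300)    # rejected: the transaction with timestamp 7 already read Q
except Rollback as err:
    print(err)
```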
Database Recovery
• System error:
• The system has entered an undesirable state (e.g., deadlock), as a result of which a transaction cannot
continue with its normal execution. The transaction, however, can be re-executed at a later time.
System Crash
• A system crash in a database refers to an unexpected shutdown or failure of the database
management system (DBMS) software or hardware.
• It can occur for a variety of reasons, such as a hardware malfunction, a software bug, or a power
outage.
• A system crash can result in data loss or corruption, as well as the inability to access the database
until it is properly recovered.
• To limit the damage, it is best practice to implement a robust backup and recovery strategy. This typically includes
regularly backing up the database and keeping multiple copies of the backups in different
locations.
Disk Failure
• Disk failure in a database refers to the inability to access or read data from a hard disk or storage
device due to a physical or logical malfunction.
• This can occur due to a variety of reasons such as a mechanical failure, a power surge, or a
software error.
• Disk failure can result in data loss or corruption, as well as the inability to access the database
until the disk is replaced or repaired.
Recovery Concept
• Recovery refers to the process by which the
system restores the database to its most recent consistent state just before the
time of failure.
• The main goal of recovery is to ensure the integrity and
consistency of the data and to minimize data loss and
downtime.
• There are several techniques used to recover a database
from a failure:
1. Log Based Recovery
2. Caching (Buffering) of Disk Blocks
3. Write-Ahead Logging
Log Based Recovery
• In a log-based recovery system, a log is maintained in which all modifications to the
database are recorded.
• The log is a sequence of log records. The log of each transaction is kept in stable
storage so that the database can be recovered from it if a failure occurs.
• Every operation performed on the database is recorded in the log.
• Each log record must be written to stable storage before the corresponding change is
applied to the database.
• There are various types of log records. A typical update log record contains the following fields:
a) Transaction identifier: a unique number given to each transaction.
b) Data-item identifier: a unique number given to the data item written.
c) Date and time of the update.
d) Old value: the value of the data item before the write.
e) New value: the value of the data item after the write.
• Logs must be written to non-volatile storage. Log-based recovery requires the following two
operations:
a) Redo: the work of transactions that completed successfully before the crash is performed
again.
b) Undo: all the work done by transactions that did not complete because of the crash is undone.
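One possible way to represent such an update log record is a small data class whose fields mirror the list above. The class name `UpdateLogRecord`, its field names, and the use of a dictionary as the database are assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class UpdateLogRecord:
    txn_id: str           # transaction identifier
    item_id: str          # data-item identifier
    timestamp: datetime   # date and time of the update
    old_value: int        # value of the data item before the write
    new_value: int        # value of the data item after the write

    def undo(self, db):
        """Restore the value the item had before the write."""
        db[self.item_id] = self.old_value

    def redo(self, db):
        """Reapply the value the write produced."""
        db[self.item_id] = self.new_value

record = UpdateLogRecord("T1", "A", datetime(2024, 1, 1), 800, 700)
db = {"A": 700}
record.undo(db)
print(db)   # {'A': 800}
record.redo(db)
print(db)   # {'A': 700}
```

Both operations are idempotent: applying undo or redo twice leaves the database in the same state as applying it once, which matters if a crash interrupts recovery itself.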
Log Based Recovery
[T1 start]
[T1,A]
[T1,A,800,700]
[T1,B]
[T1,B,600,700]
[T1 commit]
• There are two types of log-based recovery techniques:
a) Recovery based on deferred update (NO-UNDO/REDO)
b) Recovery based on immediate update (UNDO/REDO)
Recovery based on deferred update (NO-UNDO/REDO)
• In the deferred-update technique, a transaction does not modify
the database until it has committed.
• All log records are created and stored in stable storage, and the
database is updated only when the transaction commits.
• In this technique, the old-value field of the log record is not needed, because no uncommitted change ever reaches the database and nothing has to be undone.
• If the transaction rolls back, its log records are discarded and
no changes are applied to the database.
• It is used to recover from transaction failures caused by power,
memory, or OS failures.
Example
• To illustrate, consider transactions T1 and T2. T1 transfers Rs. 200 from account A
to account B, and T2 deposits Rs. 200 into account C.
T1             T2
read(A);       read(C);
A:=A-200;      C:=C+200;
write(A);      write(C);
read(B);
B:=B+200;
write(B);
• Suppose the initial values of accounts A, B and C are Rs. 500, Rs. 1000 and Rs. 600 respectively. The log
records of T1 and T2 are shown below:
[T1 start]
[T1,A]
[T1,A,300]
[T1,B]
[T1,B,1200]
[T1 commit]
[T2 start]
[T2,C]
[T2,C,800]
[T2 commit]
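Log of transactions T1 and T2 in case of a crash
• For a transaction Ti to be redone, the log must contain both the [Ti start] and [Ti commit] records.
(a)              (b)
[T1 start]       [T1 start]
[T1,A]           [T1,A]
[T1,A,300]       [T1,A,300]
[T1,B]
[T1,B,1200]
[T1 commit]
[T2 start]
[T2,C]
[T2,C,800]
• In log (a), T1 has committed and is redone; T2 has no commit record, so its log records are ignored.
• In log (b), T1 has no commit record, so nothing is redone and the database keeps its original values.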
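Recovery under deferred update needs only a redo pass over the log. The sketch below assumes a hypothetical log format of Python tuples, `("start", txn)`, `("write", txn, item, new_value)` and `("commit", txn)`, and uses the account values from the example; the function name `recover_deferred` is also an assumption.

```python
def recover_deferred(log, db):
    """NO-UNDO/REDO: reapply the writes of committed transactions only."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _txn, item, new_value = rec
            db[item] = new_value   # redo the write
    return db

# Crash after T1 committed but before T2 committed
crash_log = [
    ("start", "T1"),
    ("write", "T1", "A", 300),
    ("write", "T1", "B", 1200),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "C", 800),
]

# Under deferred update the database still holds the original values
recovered = recover_deferred(crash_log, {"A": 500, "B": 1000, "C": 600})
print(recovered)   # {'A': 300, 'B': 1200, 'C': 600}
```

T2's write to C is skipped because its commit record is missing, and no undo step is needed because T2 never touched the database.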
Recovery based on immediate update (UNDO/REDO)
• To illustrate, consider transactions T1 and T2. T1 transfers Rs. 200 from account A
to account B, and T2 deposits Rs. 200 into account C.
T1             T2
read(A);       read(C);
A:=A-200;      C:=C+200;
write(A);      write(C);
read(B);
B:=B+200;
write(B);
• Suppose the initial values of accounts A, B and C are Rs. 500, Rs. 1000 and Rs. 600 respectively. The
log records after successful completion of T1 and T2 are shown below:
[T1 start]
[T1,A]
[T1,A,500,300]
[T1,B]
[T1,B,1000,1200]
[T1 commit]
[T2 start]
[T2,C]
[T2,C,600,800]
[T2 commit]
Log of transactions T1 and T2 in case of a crash
• For a transaction Ti to be redone, the log must contain both the [Ti start] and [Ti commit] records. For a
transaction Ti to be undone, the log must contain the [Ti start] record but no [Ti commit] record.
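With immediate update, recovery needs both passes: undo for transactions without a commit record and redo for committed ones. The sketch below assumes a hypothetical tuple log format, `("write", txn, item, old_value, new_value)` plus `("start", txn)` and `("commit", txn)`, and an assumed crash state in which T1's write to B had not reached disk while T2's uncommitted write to C had.

```python
def recover_immediate(log, db):
    """UNDO/REDO: roll back uncommitted writes, then reapply committed ones."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Undo pass: scan the log backwards and restore old values
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _txn, item, old_value, _new_value = rec
            db[item] = old_value

    # Redo pass: scan the log forwards and reapply new values
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _txn, item, _old_value, new_value = rec
            db[item] = new_value

    return db

# Crash after T1 committed but before T2 committed
crash_log = [
    ("start", "T1"),
    ("write", "T1", "A", 500, 300),
    ("write", "T1", "B", 1000, 1200),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "C", 600, 800),
]

state_at_crash = {"A": 300, "B": 1000, "C": 800}
recovered = recover_immediate(crash_log, state_at_crash)
print(recovered)   # {'A': 300, 'B': 1200, 'C': 600}
```

The undo pass restores C to 600 because T2 has a start record but no commit record, and the redo pass brings B to 1200 because T1 has both.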