Dbms Unit 5 Final

DBMS unit 5

Uploaded by

annapoovathinkal

UNIT 5 – TRANSACTION PROCESSING

CONCEPTS

I) Transaction processing
II) Concurrency control Techniques
Transaction Processing

Introduction

Transaction and System concepts

Desirable properties of transaction

Schedules and recoverability

Serializability of schedules

Transaction support in SQL


INTRODUCTION
• A transaction is a set of logically related operations that together
perform a group of tasks.
• A transaction is an action or series of actions, carried out by a
single user, that accesses or changes the contents of the database.
• A transaction includes one or more database operations such as insertion,
deletion, modification, or retrieval.
• Transaction boundaries are marked by:
• the begin transaction statement
• the end transaction statement
Example:
Reservation systems, credit card processing systems, insurance processing.
• Granularity:
• The size of a data item is called its granularity.
BLOCK
• The basic unit of data transfer from disk to main memory is one block.

• A transaction includes two basic database access operations:

1. read_item(X)

2. write_item(X)

• Read(A): The read operation Read(A), or R(A), reads the value of A from the database and stores it in
a buffer in main memory.

• Write(A): The write operation Write(A), or W(A), writes the value from the buffer back to the database.
• 1. read_item(X):
• It reads a database item named X into a program variable. It works as follows:
• 1. The address of the disk block that contains item X is found.
• 2. The disk block is copied into a buffer in main memory (if it is not already there).
• 3. The item X is copied from the buffer to the program variable named X.
• 2. write_item(X):
• It writes the value of the program variable X into the database item X. It works
as follows:
• 1. The address of the disk block that contains item X is found.
• 2. The disk block is copied into a buffer in main memory (if it is not already there).
• 3. The item X is copied from the program variable named X into its correct
location in the buffer.
• 4. The updated block is stored from the buffer back to disk, either immediately
or at some later time.
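
The steps of read_item(X) and write_item(X) can be sketched in Python. This is only a toy model: the class name TinyStore, the BLOCK_SIZE constant, and the flush() method are assumptions made for illustration, and a real DBMS buffer manager is far more involved.

```python
BLOCK_SIZE = 4  # items per block (arbitrary choice for this sketch)


class TinyStore:
    """Toy model of a disk, a main-memory buffer, and program variables."""

    def __init__(self, items):
        names = sorted(items)
        # "Disk": a list of blocks, each block maps item name -> value.
        self.disk = [
            {n: items[n] for n in names[i:i + BLOCK_SIZE]}
            for i in range(0, len(names), BLOCK_SIZE)
        ]
        self.buffer = {}    # block address -> in-memory copy of the block
        self.dirty = set()  # addresses of blocks changed but not written back

    def _find_block(self, x):
        # Step 1: find the address of the disk block that contains X.
        for addr, block in enumerate(self.disk):
            if x in block:
                return addr
        raise KeyError(x)

    def _load(self, addr):
        # Step 2: copy the block into a buffer unless it is already there.
        if addr not in self.buffer:
            self.buffer[addr] = dict(self.disk[addr])
        return self.buffer[addr]

    def read_item(self, x):
        # Step 3: copy X from the buffer to the caller's program variable.
        return self._load(self._find_block(x))[x]

    def write_item(self, x, value):
        addr = self._find_block(x)
        self._load(addr)[x] = value  # step 3: update X inside the buffer
        self.dirty.add(addr)         # step 4 is deferred until flush()

    def flush(self):
        # Step 4: store the updated blocks back to disk.
        for addr in self.dirty:
            self.disk[addr] = dict(self.buffer[addr])
        self.dirty.clear()
```

In this sketch the write reaches the disk only when flush() is called, which corresponds to the "at some later time" case of step 4.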
INTRODUCTION
• Process:
• Single user
• Multi user
• Read/write
• Property of Transaction
• Atomicity
• Consistency
• Isolation
• Durability
States of Transaction
Schedule

• A series of operations from one or more transactions is known as a
schedule. It preserves the order of the operations within each
individual transaction.
• A non-serial schedule is also called a concurrent schedule.
1. Serial Schedule – NO INTERLEAVING

• A serial schedule is one in which each transaction executes to
completion before the next transaction starts. When the first
transaction completes its cycle, the next transaction is executed.

• Given two transactions T1 and T2, a schedule is serial if their
operations are not interleaved.
• Ex:
• T1 executes all of its operations, then
• T2 executes all of its operations.
2. Non-serial Schedule - INTERLEAVING

• If interleaving of operations is allowed, the schedule is non-serial.
• There are many possible orders in which the system can execute the
individual operations of the transactions.
• Figures (c) and (d) show interleaved operations.

• Ex:
• T1 starts executing but does not complete its operations.
• T2 starts executing before T1 completes its operations.
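
The difference between a serial and a non-serial schedule can be checked mechanically. The sketch below assumes a schedule is a list of (transaction, operation, item) tuples; that representation and the function name is_serial are choices made here for illustration.

```python
def is_serial(schedule):
    """Return True if no transaction's operations are interleaved with another's."""
    finished = set()  # transactions whose run of operations has ended
    current = None    # transaction currently executing
    for txn, _op, _item in schedule:
        if txn != current:
            if txn in finished:
                # txn resumes after another transaction ran: interleaving.
                return False
            if current is not None:
                finished.add(current)
            current = txn
    return True


serial = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
interleaved = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
```

Here is_serial(serial) is True and is_serial(interleaved) is False.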
3. Serializable schedule – non serial to serializable

• Serializability is used to identify the non-serial schedules
that allow transactions to execute concurrently without interfering
with one another.
• It identifies which schedules are correct when the operations of the
transactions are interleaved.
• A non-serial schedule is serializable if its result is equal to the
result of executing its transactions serially.
• Schedule A and Schedule B are serial schedules.
• Schedule C and Schedule D are non-serial schedules.
serializability
• Testing
• Conflict
• View
Testing of Serializability

• A serialization graph (precedence graph) is used to test the serializability of a schedule.

• The graph is a pair G = (V, E), where V is a set of vertices (one per
transaction) and E is a set of edges.
• An edge Ti → Tj is added when one of three conditions holds:
• Ti executes write(Q) before Tj executes read(Q).
• Ti executes read(Q) before Tj executes write(Q).
• Ti executes write(Q) before Tj executes write(Q).
• If the graph contains no cycle, the schedule is conflict serializable.
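
The graph test can be sketched as follows. The schedule representation, a list of (transaction, "R" or "W", item) tuples, and all function names are assumptions made for this example. An edge Ti → Tj is recorded for each of the three conditions above, and the schedule passes the test when the graph has no cycle.

```python
def precedence_graph(schedule):
    """Collect an edge (Ti, Tj) for every conflicting pair where Ti goes first."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Different transactions, same item, at least one write.
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True   # reached a node on the current path: cycle
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(visit(nxt) for nxt in graph[node]):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))
```

For example, the schedule R1(A), W2(A), W1(A) produces both T1 → T2 and T2 → T1, so it is not conflict serializable.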
Conflict Serializable Schedule

• A schedule is called conflict serializable if it can be transformed
into a serial schedule by swapping non-conflicting operations.
• Equivalently, a schedule is conflict serializable if it is conflict
equivalent to a serial schedule.
Conflicting Operations
Two operations conflict if all of the following conditions hold:
1. They belong to different transactions.
2. They operate on the same data item.
3. At least one of them is a write operation.
Example:
• Two adjacent operations can be swapped only if the schedules before (S1) and after (S2) the swap are logically equal.

• If S1 = S2, the operations are non-conflicting.

• If S1 ≠ S2, the operations are conflicting.
Conflict Equivalent
• Two schedules are conflict equivalent if one can be transformed into the other by swapping non-conflicting operations.
• S2 is conflict equivalent to S1 if S1 can be converted to S2 by swapping non-conflicting
operations.
Two schedules are said to be conflict equivalent if and only if:
• They contain the same set of transactions.
• Each pair of conflicting operations is ordered in the same way in both schedules.
T1            T2
Read(A)
Write(A)
Read(B)
Write(B)
              Read(A)
              Write(A)
              Read(B)
              Write(B)

Since S1 can be transformed into this serial schedule, S1 is conflict serializable.
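
The two conditions for conflict equivalence can be checked directly. In this sketch a schedule is a list of (transaction, "R" or "W", item) tuples, and each tuple is assumed to appear at most once per schedule. The interleaved schedule s1 below is a hypothetical example chosen for illustration, not necessarily the one from the original slide.

```python
def conflict_pairs(schedule):
    """Ordered pairs (earlier, later) of conflicting operations."""
    pairs = set()
    for i, a in enumerate(schedule):
        for b in schedule[i + 1:]:
            # Different transactions, same item, at least one write.
            if a[0] != b[0] and a[2] == b[2] and "W" in (a[1], b[1]):
                pairs.add((a, b))
    return pairs


def conflict_equivalent(s1, s2):
    """Same operations, and every conflicting pair in the same order."""
    return sorted(s1) == sorted(s2) and conflict_pairs(s1) == conflict_pairs(s2)


# Hypothetical interleaved schedule.
s1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
      ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
# Serial schedule: all of T1, then all of T2.
s2 = [("T1", "R", "A"), ("T1", "W", "A"), ("T1", "R", "B"), ("T1", "W", "B"),
      ("T2", "R", "A"), ("T2", "W", "A"), ("T2", "R", "B"), ("T2", "W", "B")]
# Serial schedule in the opposite order: all of T2, then all of T1.
s3 = s2[4:] + s2[:4]
```

With these definitions, s1 is conflict equivalent to the serial schedule s2 but not to s3.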


View Serializability

• A schedule is view serializable if it is view equivalent to a serial
schedule.
• If a schedule is conflict serializable, then it is also view serializable.
• A schedule that is view serializable but not conflict serializable contains
blind writes.
View Equivalent

• The initial reads of both schedules must be the same.

• If transaction T1 performs the initial read of data item A in schedule S1,
then T1 must also perform the initial read of A in schedule S2.
Recoverability of Schedule

• A transaction may not execute completely because of a software issue,
system crash, or hardware failure.
• In that case, the failed transaction has to be rolled back.
Irrecoverable schedule:
• A schedule is irrecoverable if Tj reads a value updated by Ti
and Tj commits before Ti commits. If Ti then fails, Tj cannot be
rolled back because it has already committed.
Recoverable with cascading rollback:
• A schedule is recoverable with cascading rollback if Tj reads a value
updated by Ti and the commit of Tj is delayed until Ti commits. If Ti
fails, Tj must be rolled back as well.
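
A minimal sketch of this classification is shown below. It assumes a schedule is a list of (transaction, operation, item) tuples with operations "R", "W", and "C" (commit, with item None), and it ignores aborts. The function name classify and the returned labels are choices made here.

```python
def classify(schedule):
    """Classify a schedule by how reads of uncommitted data relate to commits."""
    last_writer = {}    # item -> transaction that wrote it most recently
    committed = []      # transactions in commit order
    reads_from = set()  # (reader, writer) pairs where writer was uncommitted
    for txn, op, item in schedule:
        if op == "W":
            last_writer[item] = txn
        elif op == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in committed:
                reads_from.add((txn, writer))
        elif op == "C":
            for reader, writer in reads_from:
                # The reader commits before the writer it depends on.
                if reader == txn and writer not in committed:
                    return "irrecoverable"
            committed.append(txn)
    if reads_from:
        return "recoverable with cascading rollback"
    return "cascade-free"
```

For example, W1(A), R2(A), C2, C1 is classified as irrecoverable, while W1(A), R2(A), C1, C2 is recoverable with cascading rollback.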
Transaction support in SQL (refer to the 1st unit)
• Properties
ACID
• Transaction Control
The following commands are used to control transactions.
• COMMIT − saves the changes.
• ROLLBACK − rolls back the changes.
• SAVEPOINT − creates a point within a transaction to which a later ROLLBACK can return.
• SET TRANSACTION − sets the characteristics of a transaction, such as its isolation level or read-only access (some systems, e.g. Oracle, also allow naming the transaction).

• Transaction control commands are used only with the DML commands INSERT, UPDATE, and DELETE.
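
As an illustration, the commands COMMIT, ROLLBACK, and SAVEPOINT can be tried with Python's built-in sqlite3 module. The table name account and the amounts are made up for this sketch, and SQLite's syntax may differ slightly from other database systems.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are controlled explicitly with BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()
cur.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
cur.execute("INSERT INTO account VALUES (1, 300)")

# First transaction: keep the debit, undo the later update via a savepoint.
cur.execute("BEGIN")
cur.execute("UPDATE account SET balance = balance - 50 WHERE id = 1")
cur.execute("SAVEPOINT after_debit")
cur.execute("UPDATE account SET balance = balance + 1000 WHERE id = 1")
cur.execute("ROLLBACK TO SAVEPOINT after_debit")  # undoes only the +1000
cur.execute("COMMIT")                              # the -50 becomes permanent

# Second transaction: everything is undone by ROLLBACK.
cur.execute("BEGIN")
cur.execute("DELETE FROM account")
cur.execute("ROLLBACK")

balance = cur.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
```

After both transactions the row still exists and the balance is 250: the debit was committed, while the update after the savepoint and the DELETE were rolled back.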
Transaction Property
• A DBMS must keep data correct whenever changes are made to it. If the integrity of the data is
violated, the data becomes corrupted. To maintain integrity, a database management system
provides four properties, known as the ACID properties.
• Atomicity:
• As a transaction is a set of logically related operations, either all of
them should be executed or none.
• Atomicity means that a transaction is treated as a single unit: its
operations on the data are either executed completely or not executed
at all.
• Consistency:
• If debit and credit transactions on the same account are executed
concurrently without control, they may leave the database in an
inconsistent state.
• Consistency means that the integrity constraints of the database are
preserved: a transaction takes the database from one consistent state
to another.
• Isolation:
• The result of a transaction should not be visible to others before
the transaction is committed.
• The term 'isolation' means separation. Transactions may execute
concurrently, but no transaction should affect another: the effect
must be the same as if one transaction began only after the other
completed.
• Durability:
• Once the database has committed a transaction, the changes made by the
transaction should be permanent.
• Durability ensures that, after a transaction executes successfully, its
changes remain in the database even if the system later fails or crashes.

• E.g., if a person has credited $500,000 to his account, the bank can't say that
the update has been lost. To avoid this problem, multiple copies of
the database are stored at different locations.
UNIT 5 – PART 2
Concurrency control techniques
• Concurrency control is the management procedure required for
controlling the concurrent execution of operations on a database.
• It manages simultaneous operations so that they do not conflict
with each other.
• It ensures that database transactions are performed concurrently
and accurately, producing correct results without violating the data
integrity of the database.
Problems with Concurrent Execution

• The two main operations of database transactions are:

• Read

• Write

• These two operations must be managed during concurrent execution,
because if they are interleaved without control, the data may become
inconsistent. The following problems can occur with concurrent execution:
Problem 1: Lost Update Problems (W - W Conflict)

• The problem occurs when two different database transactions
perform read/write operations on the same database items in an
interleaved manner (i.e., concurrent execution), making the values
of the items incorrect and the database inconsistent.
• At time t1, transaction TX reads the value of account A, i.e., $300 (read only).

• At time t2, transaction TX deducts $50 from account A, giving $250 (deducted locally, not yet written).

• At time t3, transaction TY reads the value of account A, which is still $300 because TX has not written its update yet.

• At time t4, transaction TY adds $100 to account A, giving $400 (added locally, not yet written).

• At time t6, transaction TX writes the value of account A, which is updated to $250, as TY has not written its value yet.

• At time t7, transaction TY writes the value it computed at time t4, i.e., $400. The value
written by TX is overwritten, i.e., the $250 update is lost.
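
The timeline above can be replayed step by step. The variable names are chosen for this sketch, and the "database" is just a dictionary.

```python
account = {"A": 300}

tx_local = account["A"]          # t1: TX reads A (300)
tx_local -= 50                   # t2: TX deducts 50 locally (250)
ty_local = account["A"]          # t3: TY reads A, still 300
ty_local += 100                  # t4: TY adds 100 locally (400)
account["A"] = tx_local          # t6: TX writes 250
after_tx_write = account["A"]
account["A"] = ty_local          # t7: TY writes 400, overwriting TX's update
final_interleaved = account["A"]

# Running TX and then TY serially would apply both updates.
final_serial = 300 - 50 + 100
```

The interleaved run ends with $400, while a serial execution of the same two transactions would end with $350, so the $50 deduction made by TX has been lost.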


Dirty Read Problems (W-R Conflict)

• The dirty read problem occurs when one transaction updates an item
of the database and then fails, but before the update is rolled back,
the updated item is read by another transaction. This creates a
write-read conflict between the two transactions.
• At time t1, transaction TX reads the value of account A, i.e., $300.
• At time t2, transaction TX adds $50 to account A, giving $350.
• At time t3, transaction TX writes the updated value to account A, i.e., $350.
• Then at time t4, transaction TY reads account A and sees $350.
• Then at time t5, transaction TX rolls back due to a server problem, and the value changes back to
$300 (as initially).
• But transaction TY has already read $350, a value that was never committed. This is a dirty read,
and the situation is therefore known as the Dirty Read Problem.
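
The same timeline can be replayed in a few lines. The two dictionaries below are a simplification made for this sketch: committed holds the last committed state, and working holds the state that every transaction can see when there is no isolation.

```python
committed = {"A": 300}
working = dict(committed)   # uncommitted changes are visible here

tx_local = working["A"]     # t1: TX reads A (300)
tx_local += 50              # t2: TX adds 50 locally (350)
working["A"] = tx_local     # t3: TX writes 350, not yet committed
ty_seen = working["A"]      # t4: TY reads the uncommitted value
working = dict(committed)   # t5: TX rolls back, A returns to 300
```

After the rollback the database holds $300, but TY has already used $350, a value that was never committed.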
Unrepeatable Read Problem (R-W Conflict)

• Also known as the Inconsistent Retrievals Problem, it occurs when a
transaction reads two different values for the same database item.
• At time t1, transaction TX reads the value from account A, i.e., $300.

• At time t2, transaction TY reads the value from account A, i.e., $300.

• At time t3, transaction TY updates the value of account A by adding $100 to the available balance, giving $400.

• At time t4, transaction TY writes the updated value, i.e., $400.

• After that, at time t5, transaction TX reads the value of account A again and sees $400.

• This means that within the same transaction TX, two different values of account A are read: $300 initially and, after the update made
by transaction TY, $400. This is an unrepeatable read, and the situation is therefore known as the Unrepeatable Read Problem.
Potential problems
• Lost updates occur when multiple transactions select the same row and update the row based on
the value selected.

• Dirty reads (uncommitted dependency) occur when a second transaction selects a row
that has been updated by another transaction that has not yet committed.

• Non-repeatable reads occur when a transaction accesses the same row several
times and reads different data each time.

• The incorrect summary problem occurs when one transaction computes a summary over the values of all the
instances of a repeated data item while a second transaction updates a few instances of that
data item. In that situation, the resulting summary does not reflect a correct result.
Why use Concurrency method?

Reasons for using concurrency control methods in a DBMS:

• To apply isolation through mutual exclusion between conflicting transactions

• To resolve read-write and write-write conflict issues

• To preserve database consistency by constraining how concurrent executions may interleave

• The system needs to control the interaction among the concurrent transactions. This
control is achieved using concurrency-control schemes.

• Concurrency control helps to ensure serializability


Concurrency Control Protocols

• Lock-Based Protocols

• Two Phase Locking Protocol

• Timestamp-Based Protocols

• Validation-Based Protocols / Optimistic Based Protocol


Locking techniques for concurrency control
There are several techniques to avoid interference when transactions
execute concurrently.

Locking techniques for concurrency control:

• Concurrent execution of transactions can be controlled by locking the data items.

• A lock is a variable associated with a data item that describes the status of the item
with respect to the operations that can be applied to it.

Two types of locks:

• 1. Binary Lock.
• 2. Shared/Exclusive Lock.
Binary Lock
• Binary lock values:
A binary lock can have only 2 states or values:
• Locked (1)

• Unlocked (0)

• If LOCK(X) = 1, then X cannot be accessed by a database operation that requests the item.

• If LOCK(X) = 0, then the item can be accessed when requested.

Binary lock operations:

There are 2 operations on a binary lock:

• lock_item

• unlock_item
Shared/Exclusive Locks
• Several transactions should be allowed to access the item X if it is for reading purposes only.

• If a transaction has to perform a write operation, then it should be given exclusive access to X.

• There are 3 lock operations:
• 1. read_lock(X)
• 2. write_lock(X)
• 3. unlock(X)

• LOCK(X) can therefore be in one of three states: read-locked, write-locked, or unlocked.

• The locking operations can be handled by maintaining a lock table and
keeping track of the number of transactions holding a shared lock on
the item in the lock table.

• Each record in the lock table has 4 fields:
• 1. Data item name.
• 2. LOCK.
• 3. Number of reads.
• 4. Locking transactions.

• The state of LOCK is either write-locked or read-locked.

• Lock and unlock operations must not be interleaved: once one starts, no
interleaving is allowed until the operation terminates or the transaction
is placed on a waiting queue for the item.
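
A lock table with these four fields can be sketched as below. This is a simplified model under stated assumptions: the class name LockTable and the field names are invented here, a refused request simply returns False instead of placing the transaction on a waiting queue, and lock upgrades are not handled.

```python
class LockTable:
    """Toy shared/exclusive lock table; items without a record are unlocked."""

    def __init__(self):
        # item name -> {"lock": "read" or "write", "no_of_reads": int, "holders": set}
        self.records = {}

    def read_lock(self, x, txn):
        rec = self.records.get(x)
        if rec is None:
            self.records[x] = {"lock": "read", "no_of_reads": 1, "holders": {txn}}
            return True
        if rec["lock"] == "read":
            if txn not in rec["holders"]:
                rec["holders"].add(txn)
                rec["no_of_reads"] += 1
            return True
        return False  # write-locked: the caller would have to wait

    def write_lock(self, x, txn):
        if x in self.records:
            return False  # read- or write-locked: the caller would have to wait
        self.records[x] = {"lock": "write", "no_of_reads": 0, "holders": {txn}}
        return True

    def unlock(self, x, txn):
        rec = self.records.get(x)
        if rec is None or txn not in rec["holders"]:
            return
        rec["holders"].discard(txn)
        if rec["lock"] == "read":
            rec["no_of_reads"] -= 1
        if not rec["holders"]:
            del self.records[x]  # no holders left: the item is unlocked
```

Two readers can hold X at the same time, but a writer is refused until both have unlocked it.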
Two Phase Locking
• Growing phase: In the growing phase, new locks on data items
may be acquired by the transaction, but none can be released.

• Shrinking phase: In the shrinking phase, existing locks held by the
transaction may be released, but no new locks can be acquired.
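
Whether a single transaction obeys the two phases can be checked from its sequence of lock and unlock requests. The list-of-tuples representation and the function name are assumptions made for this sketch.

```python
def follows_two_phase_locking(ops):
    """ops: list of (action, item) with action in read_lock/write_lock/unlock."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True   # the first unlock starts the shrinking phase
        elif action in ("read_lock", "write_lock") and shrinking:
            return False       # acquiring a lock after releasing one
    return True


two_phase = [("read_lock", "A"), ("write_lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("read_lock", "A"), ("unlock", "A"), ("write_lock", "B"), ("unlock", "B")]
```

The first sequence acquires all its locks before releasing any, so it follows the protocol. The second acquires a lock on B after unlocking A, so it does not.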
• A deadlock is a condition where two or more transactions are waiting
indefinitely for one another to give up locks.
• Deadlock is one of the most feared complications in a DBMS, because
the transactions involved never finish and remain in a waiting state forever.
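Advantages and Disadvantages of the TO (timestamp ordering) protocol:
• The TO protocol ensures serializability, since every edge in the precedence graph goes from a transaction with a smaller timestamp to a transaction with a larger timestamp, so the graph has no cycles.

• The TO protocol ensures freedom from deadlock, since no transaction ever waits.

• But the schedule may not be recoverable and may not even be cascade-free.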
Optimistic concurrency control techniques / Validation protocol

• The validation-based protocol is also known as the optimistic
concurrency control technique.

In the validation-based protocol, a transaction is executed in the
following three phases:
• Read phase

• Validation phase

• Write phase
• Read phase: In this phase, transaction T reads the values of the data items it needs
and stores them in temporary local variables. It performs all of its write operations
on the temporary variables, without updating the actual database.

• Validation phase: In this phase, the temporary values are validated
against the actual data to check whether serializability would be violated.

• Write phase: If the transaction passes validation, the temporary
results are written to the database; otherwise the transaction is rolled back.
Each transaction has the following timestamps associated with its phases:

• Start(Ti): the time when Ti started its execution.

• Validation(Ti): the time when Ti finished its read phase and started its
validation phase.

• Finish(Ti): the time when Ti finished its write phase.
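
The slides do not state the validation test itself. The sketch below assumes the usual textbook rule: Tj passes validation if, for every transaction Ti validated earlier, either Ti finished before Tj started, or Ti finished before Tj began validating and nothing Ti wrote was read by Tj. The Txn class, its field names, and the assumption that Finish(Ti) is already known are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    start: int        # Start(T)
    validation: int   # Validation(T)
    finish: int       # Finish(T)
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


def validates(tj, earlier):
    """earlier: transactions whose validation timestamp is smaller than tj's."""
    for ti in earlier:
        if ti.finish < tj.start:
            continue  # Ti finished completely before Tj started
        if (tj.start < ti.finish < tj.validation
                and not (ti.write_set & tj.read_set)):
            continue  # overlap, but Tj read nothing that Ti wrote
        return False
    return True


t1 = Txn(start=1, validation=3, finish=4, write_set={"A"})
t2 = Txn(start=2, validation=5, finish=6, read_set={"A"})
t3 = Txn(start=2, validation=5, finish=6, read_set={"B"})
```

Here t2 fails validation against t1 because it read A while t1 was still writing it, whereas t3 passes because it only read B.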


Deadlock and starvation

• In deadlock, the requested resources are held by other processes that
are themselves waiting, so none of them can proceed.

• In starvation, the requested resources are continuously used by
higher-priority processes, so a process waits indefinitely.
• Deadlock can be prevented by avoiding mutual exclusion, hold and wait, and
circular wait, and by allowing preemption.
THANK YOU..
Thomas Write Rule
• The Thomas Write Rule is a modification of the Basic Timestamp Ordering
Algorithm. It ignores obsolete writes instead of rolling the transaction
back, while still producing serializable schedules.
• When transaction T issues write_item(X), the rules are as follows:
• 1. If TS(T) < R_TS(X), then transaction T is aborted and rolled back, and
the operation is rejected.
• 2. If TS(T) < W_TS(X), then the write_item(X) operation is not executed,
and processing of the transaction continues.
• 3. If neither condition 1 nor condition 2 holds, then T executes the
write operation and W_TS(X) is set to TS(T).
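
The three rules translate directly into code. The dictionary layout for an item (value, r_ts, w_ts) and the returned labels are assumptions made for this sketch.

```python
def thomas_write(ts, item, value, db):
    """Apply write_item(item) for a transaction with timestamp ts."""
    rec = db[item]  # {"value": ..., "r_ts": R_TS(X), "w_ts": W_TS(X)}
    if ts < rec["r_ts"]:
        return "abort"    # a younger transaction already read X
    if ts < rec["w_ts"]:
        return "ignore"   # obsolete write: skip it and keep processing
    rec["value"] = value
    rec["w_ts"] = ts
    return "write"


db = {"X": {"value": 10, "r_ts": 5, "w_ts": 8}}
```

With this state, a write from a transaction with timestamp 3 is aborted, one with timestamp 6 is ignored, and one with timestamp 9 is applied and sets W_TS(X) to 9.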
