
Trans

The document discusses transaction processing in database management systems, focusing on the basic concepts of transactions, their properties (ACID), and the importance of concurrency control. It outlines the states of transactions, the significance of serializability, and methods for testing it through precedence graphs. The document emphasizes the need for transactions to maintain database consistency while allowing for concurrent executions to improve performance.

DATABASE MANAGEMENT SYSTEM

CSC502

Transaction Processing
Outline
Basic Concept of Transaction

ACID Properties

Transaction State

Concurrent Executions

Serializability

Testing for Serializability.

Recoverability

Implementation of Isolation
Basic Concept of Transaction

A transaction is a unit of program execution that accesses and possibly updates various data items.
Example of a transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
Two main issues to deal with:
➢ Failures of various kinds, such as hardware failures and system crashes
➢ Concurrent execution of multiple transactions
Required Properties of a Transaction
Consider a transaction to transfer $50 from account A to account B:

1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)

Atomicity requirement
➢ If the transaction fails (due to a software or hardware fault) after step 3 and before step 6, money will be “lost”, leading to an inconsistent database state.

➢ The system should ensure that updates of a partially executed transaction are not reflected in the database.
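
The atomicity requirement can be sketched with Python's standard sqlite3 module. This is only an illustration, not part of the original slides: the table name `account`, the starting balances of 100 and the `crash_after_debit` flag are assumptions chosen for the demo.

```python
import sqlite3

def make_db():
    """Create an in-memory database with accounts A and B (assumed balances)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 100)])
    conn.commit()
    return conn

def balances(conn):
    """Return the current balances as a dict such as {'A': 100, 'B': 100}."""
    return dict(conn.execute("SELECT name, balance FROM account"))

def transfer(conn, src, dst, amount, crash_after_debit=False):
    """Move `amount` from src to dst as a single atomic unit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if crash_after_debit:
                # Hypothetical failure between write(A) and write(B)
                raise RuntimeError("simulated failure after step 3")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False

conn = make_db()
transfer(conn, "A", "B", 50, crash_after_debit=True)
print(balances(conn))  # the partial debit was rolled back: {'A': 100, 'B': 100}
```

Because the debit and the credit sit inside one transaction, the simulated failure leaves no trace of the partial update.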
Required Properties of a Transaction (Cont.)
Durability requirement

Once the user has been notified that the transaction has completed (i.e., the transfer of the $50 has taken place), the updates to the database by the transaction must persist even if there are software or hardware failures.
Required Properties of a Transaction (Cont.)

Consistency requirement
➢ The sum of A and B is unchanged by the execution of the transaction

In general, consistency requirements include

➢ Explicitly specified integrity constraints such as primary keys and foreign keys
➢ Implicit integrity constraints
Example: sum of balances of all accounts, minus sum of loan amounts
must equal value of cash-in-hand
A transaction, when starting to execute, must see a consistent database.

During transaction execution the database may be temporarily inconsistent.

When the transaction completes successfully the database must be consistent.
➢ Erroneous transaction logic can lead to inconsistency
Required Properties of a Transaction (Cont.)
Isolation requirement
If, between steps 3 and 4 (of the fund transfer transaction), another transaction T2 is allowed to access the partially updated database, it will see an inconsistent database (the sum A + B will be less than it should be).

T1                  T2
1. read(A)
2. A := A – 50
3. write(A)
                    read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B)

Isolation can be ensured trivially by running transactions serially, that is, one after the other.
However, executing multiple transactions concurrently has significant benefits.
ACID Properties

A transaction is a unit of program execution that accesses and possibly updates various data items. To preserve the integrity of data the database system must ensure the ACID (Atomicity, Consistency, Isolation, Durability) properties.

Atomicity: Either all operations of the transaction are properly reflected in the database or none are.

Consistency: Execution of a transaction in isolation preserves the consistency of the database.
Isolation: Although multiple transactions may execute concurrently, each transaction must be unaware of other concurrently executing transactions. Intermediate transaction results must be hidden from other concurrently executed transactions.

➢ That is, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution before Ti started, or Tj started execution after Ti finished.

Durability: After a transaction completes successfully, the changes it has made to the database persist, even if there are system failures.
Transaction State

Active – the initial state; the transaction stays in this state while it is
executing

Partially committed – after the final statement has been executed.

Failed – after the discovery that normal execution can no longer proceed.
Aborted – after the transaction has been rolled back and the database
restored to its state prior to the start of the transaction. Two options after
it has been aborted:

➢ Restart the transaction (can be done only if no internal logical error)

➢ Kill the transaction

Committed – after successful completion.
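
The five states can be sketched as a small state machine. The state names come from the list above; the transition table is an assumption taken from the usual textbook state diagram (in particular, that a partially committed transaction can still fail), since the diagram itself is not reproduced here.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"

# Assumed transitions, following the standard transaction state diagram.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: set(),    # terminal: the transaction is restarted or killed
    State.COMMITTED: set(),  # terminal
}

def advance(current, target):
    """Return `target` if the move is legal, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target

state = State.ACTIVE
state = advance(state, State.PARTIALLY_COMMITTED)
state = advance(state, State.COMMITTED)
print(state.value)  # committed
```

Restarting an aborted transaction is modelled here as creating a new transaction in the active state rather than as a transition out of "aborted".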


Concurrent Executions
Multiple transactions are allowed to run concurrently in the system.
Advantages are:

➢ Increased processor and disk utilization, leading to better transaction throughput

➢ Example: one transaction can be using the CPU while another is reading from or writing to the disk
➢ Reduced average response time for transactions: short transactions need not wait behind long ones.

Concurrency control schemes – mechanisms to achieve isolation

➢ That is, to control the interaction among the concurrent transactions in order to prevent them from destroying the consistency of the database
Schedules
Schedule – a sequence of instructions that specifies the chronological order in which the instructions of concurrent transactions are executed.

➢ A schedule for a set of transactions must consist of all instructions of those transactions.

➢ It must preserve the order in which the instructions appear in each individual transaction.

A transaction that successfully completes its execution will have a commit instruction as its last statement.

➢ By default, a transaction is assumed to execute a commit instruction as its last step.

A transaction that fails to successfully complete its execution will have an abort instruction as its last statement.
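
The two requirements on a schedule (it contains every instruction, and it keeps each transaction's internal order) can be checked in one step: projecting the schedule onto each transaction must give back exactly that transaction's instruction list. The sketch below assumes a hypothetical encoding in which a schedule is a list of (transaction, instruction) pairs.

```python
def is_valid_schedule(schedule, transactions):
    """Check a schedule against its transactions.

    schedule:     list of (txn_name, instruction) pairs in execution order
    transactions: dict mapping txn_name to its list of instructions
    """
    projected = {name: [] for name in transactions}
    for txn, instruction in schedule:
        if txn not in projected:
            return False  # instruction from an unknown transaction
        projected[txn].append(instruction)
    # Equal lists mean: nothing missing, nothing extra, order preserved.
    return projected == transactions

TXNS = {
    "T1": ["read(A)", "write(A)", "commit"],
    "T2": ["read(A)", "commit"],
}
interleaved = [("T1", "read(A)"), ("T2", "read(A)"), ("T1", "write(A)"),
               ("T1", "commit"), ("T2", "commit")]
print(is_valid_schedule(interleaved, TXNS))  # True
```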
Transactions T1 and T2
Let T1 transfer $50 from A to B, and
T2 transfer 10% of the balance from A to B.
Schedule 1
An example of a serial schedule in which T1 is followed by T2 :
Schedule 2
A serial schedule in which T2 is followed by T1 :
Non Serial Schedule
Schedule 3
Let T1 and T2 be the transactions defined previously. The following schedule (Schedule 3 in the text) is not a serial schedule, but it is equivalent to Schedule 1.
In both Schedule 1 and 3, the sum A + B is preserved.

Initial values: A=100, B=100

T1            T2             Value
Read(A)                      100
A=A-50                       50
Write(A)                     50
              Read(A)        50
              Temp=A*0.1     5
              A=A-temp       45
              Write(A)       45
Read(B)                      100
B=B+50                       150
Write(B)                     150
              Read(B)        150
              B=B+temp       155
              Write(B)       155

Final values: A=45, B=155, Sum(A,B)=200
Note that in schedules 1, 2 and 3, the sum “A + B” is preserved.


Schedule 4
The following concurrent schedule does not preserve the sum of “A + B”
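
The effect of different interleavings can be checked with a small simulation. This is a sketch under stated assumptions: A and B both start at 100, T2's 10% is computed with integer division, and the interleaving used for Schedule 4 is inferred from the order of reads and writes shown for Schedule 4 in the precedence-graph example, because the schedule figures are not reproduced here.

```python
# Each transaction is a list of steps; "calc" steps only touch local variables.
T1 = [
    ("read", "A"),
    ("calc", lambda v: v.update(A=v["A"] - 50)),
    ("write", "A"),
    ("read", "B"),
    ("calc", lambda v: v.update(B=v["B"] + 50)),
    ("write", "B"),
]
T2 = [
    ("read", "A"),
    ("calc", lambda v: v.update(temp=v["A"] // 10, A=v["A"] - v["A"] // 10)),
    ("write", "A"),
    ("read", "B"),
    ("calc", lambda v: v.update(B=v["B"] + v["temp"])),
    ("write", "B"),
]

def run_schedule(order, initial):
    """Run the steps of T1 and T2 in the given order (a list of 1s and 2s)."""
    db = dict(initial)
    programs = {1: iter(T1), 2: iter(T2)}
    local = {1: {}, 2: {}}  # private buffers of each transaction
    for txn in order:
        kind, arg = next(programs[txn])
        if kind == "read":
            local[txn][arg] = db[arg]
        elif kind == "write":
            db[arg] = local[txn][arg]
        else:
            arg(local[txn])
    return db

SCHEDULE_1 = [1] * 6 + [2] * 6                        # serial: T1 then T2
SCHEDULE_2 = [2] * 6 + [1] * 6                        # serial: T2 then T1
SCHEDULE_3 = [1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2]     # interleaved, equivalent to 1
SCHEDULE_4 = [1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2]     # assumed interleaving

start = {"A": 100, "B": 100}
for name, order in [("1", SCHEDULE_1), ("2", SCHEDULE_2),
                    ("3", SCHEDULE_3), ("4", SCHEDULE_4)]:
    result = run_schedule(order, start)
    print(name, result, "sum =", result["A"] + result["B"])
```

With these assumptions Schedules 1, 2 and 3 all end with a sum of 200, while Schedule 4 ends with A=50 and B=110: T1 overwrites T2's update of A, so the sum drops to 160.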
Serializability

Basic Assumption: Each transaction preserves database consistency. So serial execution of a set of transactions preserves database consistency.

A (possibly concurrent) schedule is serializable if it is equivalent to some serial schedule.

Different forms of schedule equivalence:

1. Conflict serializability
2. View serializability

❑ A serializable schedule gives us the benefits of concurrent execution without giving up correctness of data
Serializability

But what does “equivalent” mean?

Result equivalence: if two schedules produce the same results after execution, they are result equivalent. However, two schedules may give the same results for some initial values and different results for others, so this notion is of little significance.

S1            S2
Read(X)       Read(X)
X=X+10        X=X*1.1
Write(X)      Write(X)

Here, the two schedules S1 and S2 produce the same result when X=100 (both give 110), so they are equivalent for X=100.

Let X=200. In this case S1 and S2 produce different results (210 and 220).
Simplified view of transactions

We ignore operations other than read and write instructions.

We assume that transactions may perform arbitrary computations on data in local buffers in between reads and writes.

Our simplified schedules consist of only read and write instructions.
Conflict Equivalence
Conflicting Instructions
Let li and lj be two instructions of transactions Ti and Tj respectively.

If li and lj refer to different data items, then we can swap li and lj without affecting the result of any instruction in the schedule.

However, if li and lj refer to the same data item Q, then the order of the two instructions may matter.

Since we are dealing only with read and write instructions, there are four cases:

1. li = read(Q), lj = read(Q). li and lj don’t conflict.

2. li = read(Q), lj = write(Q). li and lj conflict.

3. li = write(Q), lj = read(Q). li and lj conflict.

4. li = write(Q), lj = write(Q). li and lj conflict.


Instructions li (of transaction Ti) and lj (of transaction Tj) conflict if and only if there exists some data item Q accessed by both li and lj, and at least one of these instructions is a write(Q).
Two instructions are non-conflicting if they either operate on different data items or are both read operations.
If li and lj are consecutive in a schedule and they do not conflict, their results would remain the same even if they had been interchanged in the schedule.
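
The conflict rule translates directly into a predicate. The `Op` record below is a hypothetical encoding chosen for this sketch, with `kind` equal to "r" for read and "w" for write.

```python
from typing import NamedTuple

class Op(NamedTuple):
    txn: str   # transaction name, e.g. "T1"
    kind: str  # "r" for read(Q), "w" for write(Q)
    item: str  # data item, e.g. "Q"

def conflicts(a, b):
    """True if a and b belong to different transactions, access the same
    data item, and at least one of them is a write."""
    return a.txn != b.txn and a.item == b.item and "w" in (a.kind, b.kind)

# The four cases for instructions on the same data item Q:
print(conflicts(Op("Ti", "r", "Q"), Op("Tj", "r", "Q")))  # False
print(conflicts(Op("Ti", "r", "Q"), Op("Tj", "w", "Q")))  # True
print(conflicts(Op("Ti", "w", "Q"), Op("Tj", "r", "Q")))  # True
print(conflicts(Op("Ti", "w", "Q"), Op("Tj", "w", "Q")))  # True
```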
Conflict Serializability

A schedule S is said to be conflict equivalent to a schedule S’ if S can be converted into S’ by performing a series of swaps of consecutive, non-conflicting instructions belonging to two different transactions.

We say that a schedule S is conflict serializable if it is conflict equivalent to a serial schedule.
Conflict Serializability
Definition of conflict-equivalence: Two schedules S1 and S2 on the
same transactions are conflict-equivalent if S1 can be transformed into
S2 through a sequence of swaps of consecutive non-conflicting
operations
Example:
S = r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B) [non-serial schedule]
is conflict-equivalent to
S’ = r1(A) w1(A) r1(B) w1(B) r2(A) w2(A) r2(B) w2(B) [serial schedule]
because S can be transformed into S’ through the following sequence of swaps:
i. r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B)
ii. r1(A) w1(A) r2(A) r1(B) w2(A) w1(B) r2(B) w2(B)
iii. r1(A) w1(A) r1(B) r2(A) w2(A) w1(B) r2(B) w2(B)
iv. r1(A) w1(A) r1(B) r2(A) w1(B) w2(A) r2(B) w2(B)
v. r1(A) w1(A) r1(B) w1(B) r2(A) w2(A) r2(B) w2(B) (T1 followed by T2)
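
The swap sequence above can be checked mechanically: each step must differ from the previous one by exactly one exchange of two adjacent, non-conflicting operations of different transactions. The following is a minimal sketch; the function names and the string parsing are assumptions made for this example.

```python
import re

def parse(text):
    """Turn 'r1(A) w1(A)' into [('r', 1, 'A'), ('w', 1, 'A')]."""
    return [(kind, int(txn), item)
            for kind, txn, item in re.findall(r"([rw])(\d+)\((\w+)\)", text)]

def conflicts(a, b):
    """Different transactions, same item, at least one write."""
    return a[1] != b[1] and a[2] == b[2] and "w" in (a[0], b[0])

def is_legal_swap(before, after):
    """True if `after` is `before` with one adjacent non-conflicting pair swapped."""
    if len(before) != len(after):
        return False
    diff = [i for i, (x, y) in enumerate(zip(before, after)) if x != y]
    if len(diff) != 2 or diff[1] != diff[0] + 1:
        return False
    i, j = diff
    swapped = before[i] == after[j] and before[j] == after[i]
    different_txns = before[i][1] != before[j][1]
    return swapped and different_txns and not conflicts(before[i], before[j])

STEPS = [parse(s) for s in [
    "r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B)",
    "r1(A) w1(A) r2(A) r1(B) w2(A) w1(B) r2(B) w2(B)",
    "r1(A) w1(A) r1(B) r2(A) w2(A) w1(B) r2(B) w2(B)",
    "r1(A) w1(A) r1(B) r2(A) w1(B) w2(A) r2(B) w2(B)",
    "r1(A) w1(A) r1(B) w1(B) r2(A) w2(A) r2(B) w2(B)",
]]

print(all(is_legal_swap(a, b) for a, b in zip(STEPS, STEPS[1:])))  # True
```

Trying to move r2(A) in front of w1(A) would be rejected, because those two operations conflict.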
Conflict Serializability
Conflict Serializable

A schedule S is conflict serializable if it is conflict equivalent to a serial schedule.
Example of a schedule that is not conflict serializable:

We are unable to swap operations in the above schedule to obtain either the serial schedule < T1, T2 >, or < T2, T1 >
Conflict Serializability (Cont.)
Schedule 3 can be transformed into Schedule 6, by a series of swaps of
non-conflicting instructions.

Schedule 6 is a serial schedule where T2 follows T1.


Therefore, Schedule 3 is conflict serializable.
Schedule 3 Schedule 6
Testing for Conflict Serializability by Precedence Graph
Consider some schedule S with a set of transactions T1, T2, ..., Tn.
Precedence graph: a directed graph G(V, E) constructed from the schedule S, where the set of vertices V consists of all the transactions participating in the schedule and E is the set of edges.

There is an edge (arc) from Ti to Tj if and only if, for some data item Q, one of the following conditions holds in the schedule:
1) Ti executes write(Q) before Tj executes read(Q)
2) Ti executes read(Q) before Tj executes write(Q)
3) Ti executes write(Q) before Tj executes write(Q)
Testing for Conflict Serializability

If an edge Ti → Tj exists in the precedence graph, this implies that in any serial schedule S’ equivalent to S, Ti must appear before Tj.

A schedule is conflict serializable if and only if its precedence graph is acyclic.

Simple cycle-detection algorithms take order n² time, where n is the number of vertices in the graph.
(Better algorithms take order n + e, where e is the number of edges.)
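
The test can be sketched in two functions: one builds the precedence graph from the three edge conditions, the other looks for a cycle with a depth-first search (which runs in order n + e time). The (transaction, kind, item) encoding is an assumption of this sketch, and the two sample schedules use the read/write order of Schedule 3 and Schedule 4.

```python
def precedence_graph(schedule):
    """Build {Ti: set of Tj} with an edge Ti -> Tj for every pair of
    conflicting operations where Ti's operation comes first.

    schedule: list of (txn, kind, item), kind is 'r' or 'w'.
    """
    graph = {txn: set() for txn, _, _ in schedule}
    for i, (ti, ki, qi) in enumerate(schedule):
        for tj, kj, qj in schedule[i + 1:]:
            if ti != tj and qi == qj and "w" in (ki, kj):
                graph[ti].add(tj)
    return graph

def has_cycle(graph):
    """Depth-first search; a grey (in-progress) successor means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {v: WHITE for v in graph}

    def visit(v):
        colour[v] = GREY
        for w in graph[v]:
            if colour[w] == GREY or (colour[w] == WHITE and visit(w)):
                return True
        colour[v] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in graph)

def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))

SCHEDULE_3 = [("T1", "r", "A"), ("T1", "w", "A"), ("T2", "r", "A"), ("T2", "w", "A"),
              ("T1", "r", "B"), ("T1", "w", "B"), ("T2", "r", "B"), ("T2", "w", "B")]
SCHEDULE_4 = [("T1", "r", "A"), ("T2", "r", "A"), ("T2", "w", "A"), ("T2", "r", "B"),
              ("T1", "w", "A"), ("T1", "r", "B"), ("T1", "w", "B"), ("T2", "w", "B")]

print(is_conflict_serializable(SCHEDULE_3))  # True
print(is_conflict_serializable(SCHEDULE_4))  # False
```

Building the graph this way compares every pair of operations, which is fine for slide-sized schedules; a real scheduler would not construct it after the fact.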
Testing for Conflict Serializability
If the precedence graph is acyclic, the serializability order can be obtained by a topological sorting of the graph.

➢ That is, a linear order consistent with the partial order of the graph.
➢ For example, a serializability order for the schedule (a) would be one of either (b) or (c)
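
A topological sort is available in Python's standard library (the graphlib module, Python 3.9 and later). The sketch below is illustrative only: the edge list is a hypothetical precedence graph with two valid serial orders, not the graph from the missing figure.

```python
from graphlib import CycleError, TopologicalSorter

def serial_order(edges, transactions):
    """Return one serializability order, or None if the graph has a cycle.

    edges: iterable of (Ti, Tj) pairs meaning Ti must come before Tj.
    """
    sorter = TopologicalSorter({t: set() for t in transactions})
    for before, after in edges:
        sorter.add(after, before)  # add(node, *predecessors)
    try:
        return list(sorter.static_order())
    except CycleError:
        return None

# Hypothetical graph: Ti precedes Tj and Tk, which both precede Tm.
EDGES = [("Ti", "Tj"), ("Ti", "Tk"), ("Tj", "Tm"), ("Tk", "Tm")]
print(serial_order(EDGES, ["Ti", "Tj", "Tk", "Tm"]))
print(serial_order([("T1", "T2"), ("T2", "T1")], ["T1", "T2"]))  # None
```

For the first graph both Ti, Tj, Tk, Tm and Ti, Tk, Tj, Tm are consistent with the partial order; which one is returned is an implementation detail.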
Example: Testing for Conflict Serializability
Schedule 3

T1            T2
read(A)
write(A)
              read(A)
              write(A)
read(B)
write(B)
              read(B)
              write(B)

Create an edge from Ti to Tj if and only if, for some data item Q, one of the following conditions holds in the schedule:
1) Ti executes write(Q) before Tj executes read(Q)
2) Ti executes read(Q) before Tj executes write(Q)
3) Ti executes write(Q) before Tj executes write(Q)

T1 executes read(A) before T2 executes write(A), so create an edge from T1 to T2.

Precedence graph: T1 → T2

The precedence graph of Schedule 3 is acyclic, so Schedule 3 is conflict serializable.
Example: Testing for Conflict Serializability

Schedule 4

T1            T2
read(A)
              read(A)
              write(A)
              read(B)
write(A)
read(B)
write(B)
              write(B)

T1 executes read(A) before T2 executes write(A), so create an edge from T1 to T2.

T2 executes read(B) before T1 executes write(B), so create an edge from T2 to T1.

Precedence graph: T1 → T2 and T2 → T1

The precedence graph of Schedule 4 contains a cycle, so Schedule 4 is not conflict serializable.
Example: Testing for Conflict Serializability

Schedule 9

T2            T3            T5
read(Q)
              write(Q)
write(Q)
                            write(Q)

Create an edge from Ti to Tj if and only if, for some data item Q, one of the following conditions holds in the schedule:
1) Ti executes write(Q) before Tj executes read(Q)
2) Ti executes read(Q) before Tj executes write(Q)
3) Ti executes write(Q) before Tj executes write(Q)

Precedence graph vertices: T2, T3, T5

The precedence graph of Schedule 9 contains a cycle, so Schedule 9 is not conflict serializable.
But it is view serializable (shown later).
View Serializability
Let S and S’ be two schedules with the same set of transactions. S and S’ are view equivalent if the following three conditions are met, for each data item Q:

1. If in schedule S, transaction Ti reads the initial value of Q, then in schedule S’ also transaction Ti must read the initial value of Q.

2. If in schedule S transaction Ti executes read(Q), and that value was produced by transaction Tj (if any), then in schedule S’ also transaction Ti must read the value of Q that was produced by the same write(Q) operation of transaction Tj.

3. The transaction (if any) that performs the final write(Q) operation in schedule S must also perform the final write(Q) operation in schedule S’.

Note that view equivalence is also based purely on reads and writes.
View Serializability (Cont.)
A schedule S is view serializable if it is view equivalent to a serial schedule.

Every conflict serializable schedule is also view serializable, but not vice versa, i.e. not every view serializable schedule is conflict serializable.

Below is a schedule which is view serializable but not conflict serializable.
Schedule 9

T2            T3            T5
read(Q)
              write(Q)
write(Q)
                            write(Q)

What serial schedule is it equivalent to?

Every view serializable schedule that is not conflict serializable has blind
writes.
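
The three view-equivalence conditions can be captured as a "view signature" (who reads from whom, and who writes last), and view serializability can then be tested by brute force over all serial orders. This is a sketch only: the encoding, the function names and the column assignment of Schedule 9 (T2 reads and later writes Q, T3 and T5 each write Q blindly) are assumptions, and trying every permutation is exponential in the number of transactions.

```python
from itertools import permutations

def view_signature(schedule):
    """Return (reads, final_writers) for a list of (txn, kind, item).

    reads holds (reader, item, n-th read, source), where source is None for
    the initial value or (writer, k) for the k-th write of that writer.
    """
    last_write, write_count, read_count = {}, {}, {}
    reads = set()
    for txn, kind, item in schedule:
        if kind == "r":
            n = read_count.get((txn, item), 0)
            read_count[(txn, item)] = n + 1
            reads.add((txn, item, n, last_write.get(item)))
        else:
            k = write_count.get((txn, item), 0)
            write_count[(txn, item)] = k + 1
            last_write[item] = (txn, k)
    final_writers = {item: w[0] for item, w in last_write.items()}
    return reads, final_writers

def view_serializable(schedule):
    """Return a view-equivalent serial order of transactions, or None."""
    target = view_signature(schedule)
    for order in permutations(sorted({txn for txn, _, _ in schedule})):
        serial = [op for t in order for op in schedule if op[0] == t]
        if view_signature(serial) == target:
            return order
    return None

SCHEDULE_9 = [("T2", "r", "Q"), ("T3", "w", "Q"), ("T2", "w", "Q"), ("T5", "w", "Q")]
print(view_serializable(SCHEDULE_9))  # ('T2', 'T3', 'T5')
```

Under these assumptions the serial schedule < T2, T3, T5 > is view equivalent to Schedule 9: T2 still reads the initial value of Q and T5 still performs the final write.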
Test for View Serializability
The precedence graph test for conflict serializability cannot be used
directly to test for view serializability.

➢ Extension to test for view serializability has cost exponential in the size
of the precedence graph.

The problem of checking if a schedule is view serializable falls in the class of NP-complete problems.

➢ Thus, existence of an efficient algorithm is extremely unlikely.


More Complex Notions of Serializability
The schedule below produces the same outcome as the serial schedule
< T1, T5 >, yet is not conflict equivalent or view equivalent to it.

If we start with A = 1000 and B = 2000, the final result is 960 and 2040.

Determining such equivalence requires analysis of operations other than read and write.
Recoverable Schedules
Recoverable schedule: if a transaction Tj reads a data item previously written by a transaction Ti, then the commit operation of Ti must appear before the commit operation of Tj.

The following schedule is not recoverable if T9 commits immediately after the read(A) operation.

If T8 should abort, T9 would have read (and possibly shown to the user) an inconsistent database state. Hence, the database must ensure that schedules are recoverable.
Cascading Rollbacks
Cascading rollback – a single transaction failure leads to a series of
transaction rollbacks.
Consider the following schedule, where none of the transactions has yet committed (so the schedule is recoverable).

If T10 fails, T11 and T12 must also be rolled back.

This can lead to the undoing of a significant amount of work.


Cascadeless Schedules
Cascadeless schedules — for each pair of transactions Ti and Tj such
that Tj reads a data item previously written by Ti, the commit operation
of Ti appears before the read operation of Tj.

Every cascadeless schedule is also recoverable

It is desirable to restrict the schedules to those that are cascadeless
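
Both definitions compare the position of a commit with the position of a later read or commit, so they can be checked with one pass over the schedule. The sketch below assumes a hypothetical encoding of (transaction, kind, item) with kind "r", "w" or "c" (commit), ignores aborts, and uses sample schedules modelled on the T8/T9 and T10/T11/T12 descriptions, since the original figures are not reproduced here.

```python
def reads_from(schedule):
    """List (writer, reader, read_position) for each read of a value that
    was last written by a different transaction."""
    last_writer, result = {}, []
    for pos, (txn, kind, item) in enumerate(schedule):
        if kind == "w":
            last_writer[item] = txn
        elif kind == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                result.append((writer, txn, pos))
    return result

def commit_positions(schedule):
    return {txn: pos for pos, (txn, kind, _) in enumerate(schedule) if kind == "c"}

def is_recoverable(schedule):
    """Every committed reader commits after the writer it read from."""
    commits = commit_positions(schedule)
    for writer, reader, _ in reads_from(schedule):
        if reader in commits and not (
            writer in commits and commits[writer] < commits[reader]
        ):
            return False
    return True

def is_cascadeless(schedule):
    """Every read of another transaction's write happens after that
    transaction has committed."""
    commits = commit_positions(schedule)
    return all(writer in commits and commits[writer] < pos
               for writer, _, pos in reads_from(schedule))

# T9 reads A written by T8 and commits before T8 does.
T8_T9 = [("T8", "r", "A"), ("T8", "w", "A"), ("T9", "r", "A"),
         ("T9", "c", None), ("T8", "r", "B")]
# Nobody has committed yet, but T11 and T12 read uncommitted data.
CASCADE = [("T10", "r", "A"), ("T10", "r", "B"), ("T10", "w", "A"),
           ("T11", "r", "A"), ("T11", "w", "A"), ("T12", "r", "A")]

print(is_recoverable(T8_T9))                              # False
print(is_recoverable(CASCADE), is_cascadeless(CASCADE))   # True False
```

Since a cascadeless schedule only lets a transaction read committed data, any reader's commit necessarily comes after the writer's commit, which is why every cascadeless schedule is also recoverable.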

Example of a schedule that is NOT cascadeless


Concurrency Control
A database must provide a mechanism that will ensure that all possible schedules are both:
➢ Conflict serializable.
➢ Recoverable and preferably cascadeless
A policy in which only one transaction can execute at a time generates serial schedules, but provides a poor degree of concurrency.

Concurrency-control schemes trade off the amount of concurrency they allow against the amount of overhead that they incur.

Testing a schedule for serializability after it has executed is a little too late!
➢ Tests for serializability help us understand why a concurrency control protocol is correct.

Goal – to develop concurrency control protocols that will assure serializability.
THANK YOU
