
Database Management Systems-15

The document discusses transaction management and serializability in databases. It defines what a transaction is and explains that transactions have read and write operations. It also discusses different forms of schedule equivalence, focusing on conflict serializability. Conflict serializability examines the four cases when two consecutive instructions I and J from different transactions access the same data item - whether the order matters depends on if they are both reads, a read and write, etc. Maintaining conflict serializability is important for ensuring transactions are isolated and changes are not lost or overwritten.

Uploaded by

Arun Sasidharan

Serializability – By M V Kamal

Before we can consider how the concurrency-control component of the database
system can ensure serializability, we consider how to determine when a schedule
is serializable. Certainly, serial schedules are serializable, but if steps of multiple
transactions are interleaved, it is harder to determine whether a schedule is
serializable. Since transactions are programs, it is difficult to determine exactly
what operations a transaction performs and how operations of various transactions
interact.

For this reason, we shall not consider the various types of operations that a
transaction can perform on a data item, but instead consider only two operations:
read and write. We assume that, between a read(Q) instruction and a write(Q)
instruction on a data item Q, a transaction may perform an arbitrary sequence of
operations on the copy of Q that is residing in the local buffer of the transaction.
In this model, the only significant operations of a transaction, from a scheduling
point of view, are its read and write instructions. Commit operations, though
relevant, are not considered until Section 14.7. We therefore may show only read
and write instructions in schedules, as we do for schedule 3 in the figure below.

In this section, we discuss different forms of schedule equivalence, but focus
on a particular form called conflict serializability.

Let us consider a schedule S in which there are two consecutive instructions, I and
J, of transactions Ti and Tj, respectively (i ≠ j). If I and J refer to different data
items, then we can swap I and J without affecting the results of any instruction
in the schedule. However, if I and J refer to the same data item Q, then the order of
the two steps may matter. Since we are dealing with only read and write
instructions, there are four cases that we need to consider:

1. I = read(Q), J = read(Q). The order of I and J does not matter, since the
same value of Q is read by Ti and Tj , regardless of the order.

2. I = read(Q), J = write(Q). If I comes before J , then Ti does not read the value
of Q that is written by Tj in instruction J . If J comes before I, then Ti reads
the value of Q that is written by Tj. Thus, the order of I and J matters.

3. I = write(Q), J = read(Q). The order of I and J matters for reasons similar to
those of the previous case.

4. I = write(Q), J = write(Q). Since both instructions are write operations, the
order of these instructions does not affect either Ti or Tj . However, the value
obtained by the next read(Q) instruction of S is affected, since the result of only
the latter of the two write instructions is preserved in the database. If there is no
other write(Q) instruction after I and J in S, then the order of I and J directly
affects the final value of Q in the database state that results from schedule S.
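
The read/write and write/write cases can be checked by hand with a small simulation. The sketch below is illustrative only: the `execute` helper, the starting value 10, and the written values 99, 1, and 2 are assumptions made for the example and do not come from the notes.

```python
def execute(schedule, initial=10):
    """Run a list of (txn, action, value) steps against a single data item Q.

    Returns (values read, keyed by transaction; final value of Q).
    """
    q = initial
    reads = {}
    for txn, action, value in schedule:
        if action == "read":
            reads[txn] = q      # the transaction sees the current value of Q
        else:
            q = value           # a write replaces the current value of Q
    return reads, q

# Case 2: I = read(Q) by Ti, J = write(Q) by Tj
I = ("Ti", "read", None)
J = ("Tj", "write", 99)
print(execute([I, J]))   # Ti reads the old value
print(execute([J, I]))   # Ti reads the value written by Tj

# Case 4: I = write(Q) by Ti, J = write(Q) by Tj
WI = ("Ti", "write", 1)
WJ = ("Tj", "write", 2)
print(execute([WI, WJ]))  # the later write (by Tj) survives
print(execute([WJ, WI]))  # the later write (by Ti) survives
```

In both cases swapping the two steps changes either what Ti reads or what remains in the database, which is why the order matters.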

Fig: Schedule 3—showing only the read and write instructions.

We say that I and J conflict if they are operations by different transactions on the
same data item, and at least one of these instructions is a write operation. To
illustrate the concept of conflicting instructions, we consider schedule 3 in the figure
above. The write(A) instruction of T1 conflicts with the read(A) instruction of T2.
However, the write(A) instruction of T2 does not conflict with the read(B)
instruction of T1, because the two instructions access different data items.
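
The definition of conflicting instructions translates directly into a predicate. This is a minimal sketch: the `Op` tuple and the `conflicts` function are hypothetical names chosen for the example, and the two calls at the end reproduce the two pairs of instructions discussed above.

```python
from typing import NamedTuple

class Op(NamedTuple):
    txn: str      # transaction name, e.g. "T1"
    action: str   # "read" or "write"
    item: str     # data item, e.g. "A"

def conflicts(i: Op, j: Op) -> bool:
    """True if i and j come from different transactions, access the same
    data item, and at least one of them is a write."""
    return (
        i.txn != j.txn
        and i.item == j.item
        and "write" in (i.action, j.action)
    )

# write(A) of T1 against read(A) of T2: same item, one write
print(conflicts(Op("T1", "write", "A"), Op("T2", "read", "A")))
# write(A) of T2 against read(B) of T1: different items
print(conflicts(Op("T2", "write", "A"), Op("T1", "read", "B")))
```

Two consecutive instructions for which `conflicts` returns False can be swapped without changing the result of the schedule.
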

Transaction Characteristics
-Prepared by M V Kamal, Associate Professor, CSE Dept

Every transaction has three characteristics: access mode, diagnostics size, and isolation level. The
diagnostics size determines the number of error conditions that can be recorded.

If the access mode is READ ONLY, the transaction is not allowed to modify the database. Thus, INSERT,
DELETE, UPDATE, and CREATE commands cannot be executed. If we have to execute one of these
commands, the access mode should be set to READ WRITE. For transactions with READ ONLY access
mode, only shared locks need to be obtained, thereby increasing concurrency.

The isolation level controls the extent to which a given transaction is exposed to the actions of other
transactions executing concurrently. By choosing one of four possible isolation level settings, a user can
obtain greater concurrency at the cost of increasing the transaction's exposure to other transactions'
uncommitted changes.

Isolation level choices are READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and
SERIALIZABLE. The effect of these levels is summarized in Figure given below. In this context, dirty read
and unrepeatable read are defined as usual. Phantom is defined to be the possibility that a transaction
retrieves a collection of objects (in SQL terms, a collection of tuples) twice and sees different results, even
though it does not modify any of these tuples itself.

Level              Dirty Read   Unrepeatable Read   Phantom
READ UNCOMMITTED   Maybe        Maybe               Maybe
READ COMMITTED     No           Maybe               Maybe
REPEATABLE READ    No           No                  Maybe
SERIALIZABLE       No           No                  No
Figure: Transaction Isolation Levels in SQL-92

The highest degree of isolation from the effects of other transactions is achieved by setting the isolation
level for a transaction T to SERIALIZABLE. This isolation
level ensures that T reads only the changes made by committed transactions, that no value read or
written by T is changed by any other transaction until T is complete, and that if T reads a set of values
based on some search condition, this set is not changed by other transactions until T is complete (i.e., T
avoids the phantom phenomenon).
In terms of a lock-based implementation, a SERIALIZABLE transaction obtains locks before reading or
writing objects, including locks on sets of objects that it requires to be unchanged (see Section 19.3.1),
and holds them until the end, according to Strict 2PL.
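
The isolation-level figure can be encoded as a lookup table, which makes it easy to ask which level is needed to rule out a given anomaly. This is only a sketch: the dictionary layout and the `weakest_level_preventing` helper are assumptions made for illustration, with "Maybe" encoded as True and "No" as False.

```python
# Rows of the SQL-92 isolation-level figure: True means the anomaly may occur.
ANOMALIES = {
    "READ UNCOMMITTED": {"dirty read": True,  "unrepeatable read": True,  "phantom": True},
    "READ COMMITTED":   {"dirty read": False, "unrepeatable read": True,  "phantom": True},
    "REPEATABLE READ":  {"dirty read": False, "unrepeatable read": False, "phantom": True},
    "SERIALIZABLE":     {"dirty read": False, "unrepeatable read": False, "phantom": False},
}

def weakest_level_preventing(*anomalies):
    """Return the least strict level at which none of the named anomalies can occur."""
    # Dicts preserve insertion order, so levels are tried from weakest to strongest.
    for level, row in ANOMALIES.items():
        if not any(row[a] for a in anomalies):
            return level
    return None

print(weakest_level_preventing("dirty read"))
print(weakest_level_preventing("dirty read", "phantom"))
```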

REPEATABLE READ ensures that T reads only the changes made by committed transactions, and that
no value read or written by T is changed by any other transaction until T is complete. However, T could
experience the phantom phenomenon; for example, while T examines all Sailors records with rating=1,
another transaction might add a new such Sailors record, which is missed by T.

A REPEATABLE READ transaction uses the same locking protocol as a SERIALIZABLE transaction,
except that it does not do index locking, that is, it locks only individual objects, not sets of objects.

READ COMMITTED ensures that T reads only the changes made by committed transactions, and that no
value written by T is changed by any other transaction until T is complete. However, a value read by T
may well be modified by another transaction while T is still in progress, and T is, of course, exposed to
the phantom problem.

A READ COMMITTED transaction obtains exclusive locks before writing objects and holds these locks
until the end. It also obtains shared locks before reading objects, but these locks are released
immediately; their only effect is to guarantee that the transaction that last modified the object is complete.
(This guarantee relies on the fact that every SQL transaction obtains exclusive locks before writing
objects and holds exclusive locks until the end.)

A READ UNCOMMITTED transaction T can read changes made to an object by an ongoing transaction;
obviously, the object can be changed further while T is in progress, and T is also vulnerable to the
phantom problem.

A READ UNCOMMITTED transaction does not obtain shared locks before reading objects. This mode
represents the greatest exposure to uncommitted changes of other transactions; so much so that SQL
prohibits such a transaction from making any changes itself - a READ UNCOMMITTED transaction is
required to have an access mode of READ ONLY. Since such a transaction obtains no locks for reading
objects, and it is not allowed to write objects (and therefore never requests exclusive locks), it never
makes any lock requests.
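
The lock-based descriptions of the four levels can be summarized in one small table. The sketch below assumes a simplified vocabulary that is not part of the notes: "long" for a lock held until the end of the transaction, "short" for a lock released immediately after use, and "none" when no lock is requested. `LockPolicy` and `requests_any_lock` are hypothetical names.

```python
from typing import NamedTuple

class LockPolicy(NamedTuple):
    read_locks: str    # "none", "short", or "long"
    write_locks: str   # "none" or "long"
    index_locks: bool  # whether sets of objects are locked as well

POLICIES = {
    "SERIALIZABLE":     LockPolicy("long",  "long", True),
    "REPEATABLE READ":  LockPolicy("long",  "long", False),
    "READ COMMITTED":   LockPolicy("short", "long", False),
    # READ UNCOMMITTED must be READ ONLY, so it never asks for write locks either.
    "READ UNCOMMITTED": LockPolicy("none",  "none", False),
}

def requests_any_lock(level):
    """True if a transaction at this level ever makes a lock request."""
    policy = POLICIES[level]
    return policy.read_locks != "none" or policy.write_locks != "none" or policy.index_locks

for level in POLICIES:
    print(level, requests_any_lock(level))
```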

The SERIALIZABLE isolation level is generally the safest and is recommended for most transactions.
Some transactions, however, can run with a lower isolation level, and the smaller number of locks
requested can contribute to improved system performance.
For example, a statistical query that finds the average sailor age can be run at the READ COMMITTED
level, or even the READ UNCOMMITTED level, because a few incorrect or missing values will not
significantly affect the result if the number of sailors is large. The isolation level and access mode can be
set using the SET TRANSACTION command. For example, the following command declares the current
transaction to be SERIALIZABLE and READ ONLY:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY
When a transaction is started, the default is SERIALIZABLE and READ WRITE.
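
A small helper can assemble the command and reject the one combination the text rules out (READ UNCOMMITTED with READ WRITE). This is a sketch under stated assumptions: `set_transaction_sql` is a hypothetical function, the statement follows the form shown above, and the defaults mirror the defaults stated in these notes. Exact syntax and defaults vary between database systems, so check the documentation of the system in use.

```python
LEVELS = ("READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ", "SERIALIZABLE")
MODES = ("READ ONLY", "READ WRITE")

def set_transaction_sql(level="SERIALIZABLE", mode="READ WRITE"):
    """Build a SET TRANSACTION statement in the form used in the notes."""
    if level not in LEVELS:
        raise ValueError(f"unknown isolation level: {level}")
    if mode not in MODES:
        raise ValueError(f"unknown access mode: {mode}")
    # A READ UNCOMMITTED transaction is required to be READ ONLY.
    if level == "READ UNCOMMITTED" and mode != "READ ONLY":
        raise ValueError("READ UNCOMMITTED requires READ ONLY access mode")
    return f"SET TRANSACTION ISOLATION LEVEL {level} {mode}"

print(set_transaction_sql("SERIALIZABLE", "READ ONLY"))
```
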
www.eazynotes.com Sabyasachi De Page No. 1

TRANSACTION MANAGEMENT

What is a Transaction?
A transaction is a unit of work performed on the database. Generally a transaction reads a value from
the database or writes a value to the database. If you are familiar with operating systems, you can
think of a transaction as analogous to a process.

Although a transaction can both read and write on the database, there are some fundamental
differences between these two classes of operations. A read operation does not change the image of
the database in any way. But a write operation, whether performed with the intention of inserting,
updating or deleting data from the database, changes the image of the database. That is, we may say
that these transactions bring the database from an image which existed before the transaction
occurred (called the Before Image or BFIM) to an image which exists after the transaction occurred
(called the After Image or AFIM).

The Four Properties of Transactions


Every transaction, for whatever purpose it is being used, has the following four properties. Taking
the initial letters of these four properties we collectively call them the ACID Properties. Here we try
to describe them and explain them.

Atomicity: This means that either all of the instructions within the transaction will be reflected in the
database, or none of them will be reflected.

Say, for example, we have two accounts A and B, each containing Rs 1000/-. We now start a
transaction to transfer Rs 100/- from account A to account B.
Read A;
A = A – 100;
Write A;
Read B;
B = B + 100;
Write B;

Simple enough. The transaction uses 6 instructions to withdraw the amount from A and deposit it in B.
The AFIM will show Rs 900/- in A and Rs 1100/- in B.

Now, suppose there is a power failure just after instruction 3 (Write A) has completed. What
happens now? After the system recovers, the AFIM will show Rs 900/- in A, but the same Rs 1000/-
in B. Rs 100/- has vanished into thin air because of the power failure. Clearly such a
situation is not acceptable.

The solution is to keep every value calculated by the instructions of the transaction in volatile
storage (RAM) rather than in stable storage (hard disk) until the transaction completes its last
instruction. When we see that there has not been any error, we perform a COMMIT
operation. Its job is to write every temporarily calculated value from volatile storage to
stable storage. In this way, even if power fails at instruction 3, the post-recovery image of the
database will show accounts A and B both containing Rs 1000/-, as if the failed transaction had never
occurred.
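
The buffering idea can be sketched in a few lines. Everything here is an assumption made for illustration: a dictionary named `disk` stands in for stable storage, a local `buffer` stands in for RAM, `PowerFailure` and the `fail_after` parameter simulate the crash, and the final `update` call is treated as if it were an indivisible COMMIT.

```python
class PowerFailure(Exception):
    """Stands in for a crash in the middle of a transaction."""

def transfer(stable, amount, fail_after=None):
    """Move `amount` from A to B, buffering all writes until COMMIT.

    `fail_after` is the instruction number (1 to 6) after which the crash occurs.
    """
    buffer = {}                      # volatile storage, lost on a crash

    def done(n):
        if fail_after == n:
            raise PowerFailure(f"crash after instruction {n}")

    a = stable["A"]                  # 1. Read A
    done(1)
    a = a - amount                   # 2. A = A - amount
    done(2)
    buffer["A"] = a                  # 3. Write A (to the buffer only)
    done(3)
    b = stable["B"]                  # 4. Read B
    done(4)
    b = b + amount                   # 5. B = B + amount
    done(5)
    buffer["B"] = b                  # 6. Write B (to the buffer only)
    done(6)
    stable.update(buffer)            # COMMIT: copy buffered values to stable storage

disk = {"A": 1000, "B": 1000}
try:
    transfer(disk, 100, fail_after=3)
except PowerFailure:
    pass                             # the buffer is gone, the disk was never touched
print(disk)                          # both accounts still hold 1000

transfer(disk, 100)
print(disk)                          # A is 900 and B is 1100
```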

Consistency: If we execute a particular transaction in isolation or together with other transactions
(i.e. presumably in a multi-programming environment), the transaction will yield the same expected
result.

To give better performance, every database management system supports the execution of multiple
transactions at the same time, using CPU Time Sharing. Concurrently executing transactions may
have to deal with the problem of sharable resources, i.e. resources that multiple transactions are
trying to read/write at the same time. For example, we may have a table or a record on which two
transactions are trying to read or write at the same time. Careful mechanisms are created in order to
prevent mismanagement of these sharable resources, so that there should not be any change in the
way a transaction performs. A transaction which deposits Rs 100/- to account A must deposit the
same amount whether it is acting alone or in conjunction with another transaction that may be trying
to deposit or withdraw some amount at the same time.

Isolation: In case multiple transactions are executing concurrently and trying to access a sharable
resource at the same time, the system should create an ordering in their execution so that they should
not create any anomaly in the value stored at the sharable resource.

There are several ways to achieve this, and the most popular one is some kind of locking
mechanism. If you are familiar with operating systems, recall semaphores: a process uses a
semaphore to mark a resource as busy before starting to use it, and to release the resource
after the usage is over. Other processes intending to access that same
resource must wait during this time. Locking is very similar. A transaction must first
lock the data item that it wishes to access, and release the lock when access is no longer
required. Once a transaction locks the data item, other transactions wishing to access the same data
item must wait until the lock is released.
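
The lock-then-access-then-release pattern can be illustrated with threads standing in for transactions. This is only an analogy, not how a database lock manager is implemented: `balance`, `lock_a`, and `deposit` are hypothetical names, and Python's `threading.Lock` plays the role of the lock on data item A.

```python
import threading

balance = {"A": 1000}
lock_a = threading.Lock()            # one lock guarding data item A

def deposit(amount, times):
    for _ in range(times):
        with lock_a:                 # lock A; other "transactions" wait here
            value = balance["A"]     # read A
            balance["A"] = value + amount   # write A
        # leaving the with-block releases the lock

# Four concurrent "transactions", each depositing 1 unit 10,000 times.
threads = [threading.Thread(target=deposit, args=(1, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance["A"])                  # 1000 + 4 * 10,000 = 41000
```

Because the read and the write happen while the lock is held, no deposit can be overwritten by another thread that read the old value.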

Durability: It states that once a transaction has completed, the changes it has made should be
permanent.

As we have seen in the explanation of the Atomicity property, the transaction, if it completes
successfully, is committed. Once the COMMIT is done, the changes which the transaction has made
to the database are immediately written into permanent storage. So, after the transaction has been
committed successfully, there is no question of any loss of information even if the power fails.
Committing a transaction guarantees that the AFIM has been reached.

There are several ways Atomicity and Durability can be implemented. One of them is called Shadow
Copy. In this scheme a database pointer is used to point to the BFIM of the database. During the
transaction, all the temporary changes are recorded into a Shadow Copy, which is an exact copy of
the original database plus the changes made by the transaction, which is the AFIM. Now, if the
transaction is required to COMMIT, then the database pointer is updated to point to the AFIM copy,
and the BFIM copy is discarded. On the other hand, if the transaction is not committed, then the
database pointer is not updated. It keeps pointing to the BFIM, and the AFIM is discarded. This is a
simple scheme, but takes a lot of memory space and time to implement.
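
A toy version of the shadow copy scheme makes the pointer swap concrete. The sketch assumes an in-memory dictionary as the database; `ShadowCopyDB`, `move_100`, and `failing_move` are hypothetical names, and a real system would also have to make the pointer update itself atomic on disk.

```python
import copy

class ShadowCopyDB:
    def __init__(self, data):
        self.current = data                      # the database pointer, at the BFIM

    def run(self, transaction):
        shadow = copy.deepcopy(self.current)     # full copy: simple but costly
        try:
            transaction(shadow)                  # all changes go to the shadow copy
        except Exception:
            return False                         # not committed: the AFIM is discarded
        self.current = shadow                    # COMMIT: point to the AFIM
        return True

def move_100(data):
    data["A"] -= 100
    data["B"] += 100

def failing_move(data):
    data["A"] -= 100
    raise RuntimeError("failure before Write B")

db = ShadowCopyDB({"A": 1000, "B": 1000})
print(db.run(failing_move), db.current)   # pointer still at the BFIM
print(db.run(move_100), db.current)       # pointer now at the AFIM
```

The deep copy on every transaction is exactly the space and time cost the text mentions.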

If you study these properties carefully, you can see that Atomicity and Durability are closely
related, just as Consistency and Isolation are closely related.

Transaction States
There are the following six states in which a transaction may exist:
Active: The initial state when the transaction has just started execution.

Partially Committed: At any given point of time, if the transaction is executing properly,
then it is moving towards its COMMIT POINT. The values generated during the execution are
all stored in volatile storage.

Failed: The transaction has failed for some reason. The temporary values are no longer required,
and the transaction is set to ROLLBACK. It means that any change made to the database by
this transaction up to the point of the failure must be undone. If the failed transaction has
withdrawn Rs. 100/- from account A, then the ROLLBACK operation should add Rs 100/- to
account A.

Aborted: When the ROLLBACK operation is over, the database reaches the BFIM. The
transaction is now said to have been aborted.

Committed: If no failure occurs then the transaction reaches the COMMIT POINT. All the
temporary values are written to the stable storage and the transaction is said to have been
committed.

Terminated: Either committed or aborted, the transaction finally reaches this state.
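
The six states can be written down as a small state machine. The transition table below is an assumption read off the prose descriptions above (in particular, allowing a partially committed transaction to fail is an assumption, not something the notes state); `State`, `TRANSITIONS`, and `is_valid_path` are hypothetical names.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"
    TERMINATED = "terminated"

# Assumed transitions between states.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},          # ROLLBACK restores the BFIM
    State.ABORTED: {State.TERMINATED},
    State.COMMITTED: {State.TERMINATED},
    State.TERMINATED: set(),
}

def is_valid_path(path):
    """Check that every consecutive pair of states in `path` is an allowed transition."""
    return all(b in TRANSITIONS[a] for a, b in zip(path, path[1:]))

success = [State.ACTIVE, State.PARTIALLY_COMMITTED, State.COMMITTED, State.TERMINATED]
failure = [State.ACTIVE, State.FAILED, State.ABORTED, State.TERMINATED]
print(is_valid_path(success), is_valid_path(failure))
```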

The whole process can be described using the following diagram:

[Figure: transaction state diagram with the nodes Entry Point, ACTIVE, PARTIALLY COMMITTED, COMMITTED, FAILED, ABORTED, and TERMINATED]

Concurrent Execution
A schedule is a collection of transactions that is executed as a unit. Depending upon
how these transactions are arranged within a schedule, a schedule can be of two types:
• Serial: The transactions are executed one after another, in a non-preemptive manner.
• Concurrent: The transactions are executed in a preemptive, time-shared manner.

In a serial schedule, there is no question of sharing a single data item among many transactions,
because no more than one transaction is executing at any point of time. However, a serial
schedule is inefficient in the sense that the transactions suffer longer waiting and
response times, and resource utilization is low.

In a concurrent schedule, CPU time is shared among two or more transactions in order to run them
concurrently. However, this creates the possibility that more than one transaction may need to access
a single data item for read/write purposes, and the database could contain inconsistent values if such
accesses are not handled properly. Let us explain with the help of an example.

Consider two transactions T1 and T2, whose instruction sets are given below.
T1 is the same as we have seen earlier, while T2 is a new transaction.

T1
Read A;
A = A – 100;
Write A;
Read B;
B = B + 100;
Write B;

T2
Read A;
Temp = A * 0.1;
Read C;
C = C + Temp;
Write C;
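
As a reference point for any concurrent schedule of T1 and T2, it helps to know what the two serial orders produce. The sketch below is an illustration under assumptions: A and B start at 1000 as in the earlier example, C starts at 0 (an assumed value), and `t1`, `t2`, and `run_serial` are hypothetical names.

```python
def t1(db):
    a = db["A"]          # Read A
    a = a - 100
    db["A"] = a          # Write A
    b = db["B"]          # Read B
    b = b + 100
    db["B"] = b          # Write B

def t2(db):
    a = db["A"]          # Read A
    temp = a * 0.1
    c = db["C"]          # Read C
    c = c + temp
    db["C"] = c          # Write C

def run_serial(order):
    db = {"A": 1000, "B": 1000, "C": 0}   # C = 0 is an assumed starting value
    for txn in order:
        txn(db)
    return db

print(run_serial([t1, t2]))   # T2 reads A after the withdrawal
print(run_serial([t2, t1]))   # T2 reads A before the withdrawal
```

The two serial orders leave different values in C because T2 reads A either after or before T1 writes it. A concurrent schedule is acceptable only if its effect matches one of these serial outcomes.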
