Chapter-1 Transaction processing
Outline
Introduction to Transaction Processing
Transaction and System Concepts
Desirable Properties of Transactions
Characterizing Schedules based on Recoverability
Characterizing Schedules based on Serializability
Introduction
Single-user vs. multiuser systems
One criterion for classifying a database system is
the number of users who can use the
system at the same time.
Single-User System:
A DBMS is single-user if at most one user at a time can use the
system.
Multiuser System:
Many users can access the system concurrently.
Concurrency
Interleaved processing:
Concurrent execution of processes is interleaved in a single
Introduction (cont…)
A Transaction:
Logical unit of database processing that includes one or more
access operations (read, retrieval, write, insert or update and
delete)
A transaction (a set of operations) may be stand-alone,
specified in a high-level language like SQL and submitted
interactively, or may be embedded within a program.
Examples include ATM transactions, credit card approvals, flight
reservations, hotel check-in, phone calls, supermarket scanning,
academic registration and billing.
Transaction boundaries:
One way of specifying transaction boundaries is using explicit
Begin and End transaction statements in an application
program
An application program may contain several transactions
separated by the Begin and End transaction boundaries
Introduction (cont…)
Read and write operations:
Basic unit of data transfer from the disk to the computer
main memory is one block.
In general, a data item (what is read or written) will be
the field of some record in the database, although it
may be a larger unit such as a record or even a whole
block
read_item(X) command includes the following steps:
write_item(X) command includes the following steps:
Example of transactions
(a) Transaction T1
(b) Transaction T2
Transactions submitted by the various users may execute
concurrently and may access and update the same
database items
If this concurrent execution is uncontrolled, it may lead to
problems such as an inconsistent database.
Why Concurrency Control is needed:
Concurrency control is needed to prevent the following
problems from compromising database consistency:
The Lost Update Problem
This occurs when two transactions that access the same
database items have their operations interleaved in a
way that makes the value of some database item
incorrect since the update made by the first transaction
is not used by the second transaction.
In other words, the update made by the first transaction
is lost (overwritten) by the second transaction.
The Temporary Update (Dirty Read) Problem
This occurs when one transaction updates a database
item and then the transaction fails for some reason.
The updated item is accessed by another transaction
before it is changed back to its original value.
The Incorrect Summary Problem
If one transaction is calculating an aggregate
summary function on a number of records while
other transactions are updating some of these
records, the aggregate function may calculate some
values before they are updated and others after they
are updated.
Concurrent execution is uncontrolled:
(a) The lost update problem
E.g. Account with balance A=100.
T1 reads the account A
T1 withdraws 10 from A
T1 makes the update in the Database
T2 reads the account A
T2 adds 100 on A
T2 makes the update in the Database
In the above case, if done one after the other (serially) then we have no problem.
If the execution is T1 followed by T2 then A=190
If the execution is T2 followed by T1 then A=190
But if they start at the same time and interleave in the following sequence:
Step                                          T1              T2
T1 reads the account, A=100                   Read_item(A)
T1 withdraws 10, making its copy A=90         A=A-10
T2 reads the account, A=100                                   Read_item(A)
T2 adds 100, making its copy A=200                            A=A+100
T1 makes the update in the database, A=90     Write_item(A)
T2 makes the update in the database, A=200                    Write_item(A)
After both transactions complete, the final value of A is 200, which
overwrites the update made by the first transaction that changed the value
from 100 to 90.
Lost Update problem: solution
Lost update!!
This could have been avoided if we prevent T2 from reading
until T1’s update has been completed
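The account example can be replayed step by step. The following is a minimal sketch, assuming a plain dictionary stands in for the database and local variables stand in for each transaction's private copy of A; the function names are illustrative only.

```python
def run_serial():
    """T1 (withdraw 10) runs to completion, then T2 (deposit 100)."""
    db = {"A": 100}
    a1 = db["A"]      # T1: Read_item(A)
    a1 -= 10          # T1: A = A - 10
    db["A"] = a1      # T1: Write_item(A)
    a2 = db["A"]      # T2: Read_item(A) sees T1's update
    a2 += 100         # T2: A = A + 100
    db["A"] = a2      # T2: Write_item(A)
    return db["A"]


def run_interleaved():
    """Same operations, interleaved as in the lost update schedule."""
    db = {"A": 100}
    a1 = db["A"]      # T1: Read_item(A) -> 100
    a1 -= 10          # T1: local copy is 90
    a2 = db["A"]      # T2: Read_item(A) -> still 100
    a2 += 100         # T2: local copy is 200
    db["A"] = a1      # T1: Write_item(A) -> 90
    db["A"] = a2      # T2: Write_item(A) -> 200, T1's update is lost
    return db["A"]


print(run_serial())       # 190
print(run_interleaved())  # 200
```

The only difference between the two functions is when T2 reads A, which is why delaying that read until T1 has written is enough to avoid the problem here.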
Concurrent execution is uncontrolled:
(b) The temporary update problem.
Example: T2 adds 100 to A, making it 200, but then aborts before it
commits. Meanwhile T1 reads 200, subtracts 10 and writes 190. The
correct balance is 90.
T1                  T2
                    Read_item(A)
                    A=A+100
                    Write_item(A)
Read_item(A)
A=A-10
Write_item(A)
                    Abort
Transaction T2 fails and must change the value of A back to its old
value; meanwhile T1 has read the temporary incorrect value of A.
The temporary update problem: Example
Time  T1              T2               bal(X)
t1                    Begin Tx         100
t2                    R(balx)          100
t3                    balx=balx+100    100
t4    Begin Tx        W(balx)          200
t5    R(balx)                          200
t6    balx=balx-10    Rollback         100
t7    W(balx)                          190
t8    Commit                           190
Temporary update!!
Could have been avoided if we prevent T1 from reading until after
the decision to commit or rollback T2 has been made
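The temporary update schedule can be replayed in the same way. This is a minimal sketch, assuming a dictionary as the database and a saved before-image as the rollback mechanism; both are simplifications chosen for illustration, not a description of how a DBMS implements rollback.

```python
def run_dirty_read():
    """T1 reads the value written by T2 before T2 rolls back."""
    db = {"balx": 100}
    before_image = db["balx"]   # old value kept so T2 can be rolled back
    t2 = db["balx"]             # T2: R(balx) -> 100
    t2 += 100                   # T2: balx = balx + 100
    db["balx"] = t2             # T2: W(balx) -> 200, not yet committed
    t1 = db["balx"]             # T1: R(balx) -> 200 (dirty read)
    t1 -= 10                    # T1: balx = balx - 10
    db["balx"] = before_image   # T2: Rollback restores 100
    db["balx"] = t1             # T1: W(balx) -> 190
    return db["balx"]


def run_read_after_rollback():
    """T1 waits until T2 has rolled back before reading."""
    db = {"balx": 100}
    before_image = db["balx"]
    db["balx"] = db["balx"] + 100   # T2 writes 200
    db["balx"] = before_image       # T2: Rollback restores 100
    t1 = db["balx"]                 # T1: R(balx) -> 100
    db["balx"] = t1 - 10            # T1: W(balx) -> 90
    return db["balx"]


print(run_dirty_read())           # 190
print(run_read_after_rollback())  # 90
```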
Concurrent execution is uncontrolled:
(c) The incorrect summary problem.
The incorrect summary problem: Example
Time  T5              T6            Bal(x)  Bal(z)  Sum
t1                    Begin Tx      100     25      0
t2    Begin Tx        Sum=0         100     25      0
t3    R(balx)                       100     25      0
t4    balx=balx-10    R(balx)       100     25      0
t5    W(balx)         Sum+=balx     100     25      100
t6    R(balz)                       90      25      100
t7    balz=balz+10                  90      25      100
t8    W(balz)                       90      35      100
t9    Commit          R(balz)       90      35      100
t10                   Sum+=balz     90      35      135
t11                   W(sum)        90      35      135
t12                   Commit        90      35      135
T6 reports a sum of 135, although balx + balz is 125 both before and
after the transfer made by T5.
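The T5/T6 schedule in the table can be checked by replaying it. The sketch below assumes a dictionary as the database and local variables as each transaction's workspace; the function name is illustrative.

```python
def run_incorrect_summary():
    """T5 moves 10 from balx to balz while T6 sums both balances."""
    db = {"balx": 100, "balz": 25}
    total = 0                 # T6: Sum = 0
    x5 = db["balx"]           # T5: R(balx)
    x5 -= 10                  # T5: balx = balx - 10
    x6 = db["balx"]           # T6: R(balx) -> 100, the value before the transfer
    total += x6               # T6: Sum += balx
    db["balx"] = x5           # T5: W(balx) -> 90
    z5 = db["balz"]           # T5: R(balz)
    z5 += 10                  # T5: balz = balz + 10
    db["balz"] = z5           # T5: W(balz) -> 35
    z6 = db["balz"]           # T6: R(balz) -> 35, the value after the transfer
    total += z6               # T6: Sum += balz
    return total, db


total, db = run_incorrect_summary()
print(total)                      # 135
print(db["balx"] + db["balz"])    # 125
```

The summary mixes one value from before the transfer with one from after it, so it matches neither serial order of T5 and T6.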
The incorrect summary problem:
Example 2: T1 adds the values A=10, B=20 and C=30. After T1 has read B
but before T1 completes, T2 updates B to 50. When both transactions
finish, T1 reports a sum of 60, while the values now in the database
add up to 90 because B has been updated to 50.
T1                  T2
Sum=0
Read_item(A)
Sum=Sum+A
Read_item(B)
Sum=Sum+B
                    Read_item(B)
                    B=50
Read_item(C)
Sum=Sum+C
What causes a Transaction to fail?
1. A computer failure (system crash):
A hardware or software error may occur in the
computer system during transaction execution. If
the hardware crashes, the contents of the
computer’s internal memory may be lost.
2. A transaction or system error:
Some operation in the transaction may cause it to
fail, such as integer overflow or division by zero.
Transaction failure may also occur because of
erroneous parameter values or because of a
logical programming error
3. Local errors or exception conditions detected by the
transaction:
Certain conditions necessitate cancellation of the
transaction
For example, data for the transaction may not
be found
A programmed abort in the transaction causes it to
fail.
4. Concurrency control enforcement:
The concurrency control method may decide to
abort the transaction, to be restarted later, because
it violates serializability or because several
transactions are in a state of deadlock
5. Disk failure:
Some disk blocks may lose their data because of a
read or write malfunction or because of a disk
read/write head crash.
This may happen during a read or a write operation
of the transaction.
6. Physical problems and catastrophes:
This refers to an endless list of problems that
includes power or air-conditioning failure, fire, theft,
sabotage, overwriting disks or tapes by mistake,
and mounting of a wrong tape by the operator.
Transaction and System Concepts
Committed state
Failed state
Terminated State
State transition diagram illustrating the
states for transaction execution
Transaction and System Concepts (cont…)
Transaction operations
For recovery purposes, the system needs to keep track of when
the transaction starts, terminates, and commits or aborts
Recovery manager keeps track of the following operations:
begin_transaction: This marks the beginning of transaction
execution
read or write: These specify read or write operations on the
commit_transaction:
This signals a successful end of the transaction so that
The System Log
started execution.
[write_item,T,X,old_value,new_value]: Records that
[read_item,T,X]: Records that transaction T has
read the value of database item X.
[commit,T]: Records that transaction T has
completed successfully, and affirms that its effect
can be committed (recorded permanently) to the
database.
[abort,T]: Records that transaction T has been
aborted.
Recovery using log records:
If the system crashes, we can recover to a consistent
database state by examining the log record and using
recovery methods.
1. Because the log contains a record of every write
operation that changes the value of some database
item, it is possible to undo the effect of these write
operations of a transaction T by tracing backward
through the log and resetting all items changed by a
write operation of T to their old_values.
2. We can also redo the effect of the write operations of
a transaction T by tracing forward through the log and
setting all items changed by a write operation of T
(that did not get done permanently) to their
new_values.
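The undo and redo procedures can be sketched over a small in-memory log. This is an illustration under stated assumptions: log records are modelled as Python tuples shaped like [write_item,T,X,old_value,new_value], the "start_transaction" record name is assumed, and the database is a dictionary. Real recovery methods are more involved.

```python
def undo(db, log, txn):
    """Trace backward through the log, resetting items written by txn to old_value."""
    for record in reversed(log):
        if record[0] == "write_item" and record[1] == txn:
            _, _, item, old_value, _ = record
            db[item] = old_value


def redo(db, log, txn):
    """Trace forward through the log, setting items written by txn to new_value."""
    for record in log:
        if record[0] == "write_item" and record[1] == txn:
            _, _, item, _, new_value = record
            db[item] = new_value


# Hypothetical log at the moment of a crash: T2 committed, T1 did not.
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 100, 90),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 25, 35),
    ("write_item", "T1", "X", 90, 80),
    ("commit", "T2"),
]

# Assumed on-disk state: T1's writes reached disk, T2's write did not.
db = {"X": 80, "Y": 25}
undo(db, log, "T1")   # X goes 80 -> 90 -> 100
redo(db, log, "T2")   # Y goes 25 -> 35
print(db)             # {'X': 100, 'Y': 35}
```

Undo walks the log in reverse so that the earliest old_value is applied last; redo walks it forward so that the latest new_value wins.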
Transaction and System Concepts (cont…)
Commit Point of a Transaction:
Definition of a Commit Point:
Undoing transactions
If a system failure occurs, we search back in the log for
Desirable Properties of Transactions
Transactions should possess several properties. They are
often called the ACID properties and should be enforced by
the concurrency control and recovery methods of the DBMS.
ACID properties:
Atomicity: A transaction is an atomic unit of processing; it is
either performed in its entirety or not performed at all.
Consistency preservation: A correct execution of the
transaction must take the database from one consistent
state to another.
Isolation: A transaction should not make its updates visible
to other transactions until it is committed; this property, when
enforced strictly, solves the temporary update problem and
makes cascading rollbacks of transactions unnecessary
Durability or permanency: Once a transaction changes the
database and the changes are committed, these changes
must never be lost because of subsequent failure.
Example:
Suppose that Ti is a transaction that transfers 200 birr from account
CA2090 (balance 5,000 birr) to account SB2359 (balance 3,500 birr) as follows:
Read(CA2090)
CA2090= CA2090-200
Write(CA2090)
Read(SB2359)
SB2359= SB2359+200
Write(SB2359)
Atomicity - either all or none of the above operations are performed. This
is ensured by the transaction management component of the DBMS.
Consistency - the sum of CA2090 and SB2359 must be unchanged by the
execution of Ti, i.e. 8,500 birr. This is the responsibility of the
application programmer who codes the transaction.
Isolation - when several transactions operate concurrently on a data
item, they may leave it in an inconsistent state. Handling such cases is
the responsibility of the concurrency control component of the DBMS.
Durability - once Ti commits, its updates remain in the database even
after the database is restarted following a failure. This is the
responsibility of the recovery management component of the DBMS.
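The transfer can be tried against a real transactional engine. The sketch below uses Python's standard sqlite3 module with an in-memory database; the table name account, the helper names, and the "no negative balance" rule are assumptions made for the example. It illustrates atomicity and consistency only: a transfer that violates the rule is rolled back as a whole.

```python
import sqlite3


def make_db():
    """Create the two accounts from the example in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)",
        [("CA2090", 5000), ("SB2359", 3500)],
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one atomic unit; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))


conn = make_db()
print(transfer(conn, "CA2090", "SB2359", 200), balances(conn))
print(transfer(conn, "CA2090", "SB2359", 10000), balances(conn))
```

After the failed second call the debit on CA2090 is undone, so the two balances still add up to 8,500.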
Schedules
Schedule (or history) of transaction
When transactions are executing concurrently in an interleaved
A shorthand notation for describing a schedule uses the
symbols :
r : for read_item operations ,
w: write_item,
c: commit and
a: abort
Transaction numbers are appended as subscript to each
operation in the schedule
The database item X that is read or written follows the r
and w operations in parenthesis
Example:
Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y)
Sb: r1(X); w1(X); r2(X); w2(X); r1(Y); a1
Conflicting operations
Two operations in a schedule are said to conflict if they
satisfy all three of the following conditions:
They belong to different transactions
They access the same data item X
At least one of the operations is a write_item(X)
Non conflicting operations
The operations r1(x) and r2(x) do not conflict since both of
them are read operations
r1(x) and w1(x) do not conflict because they belong to the
same transaction
w2(x) and w1(y) do not conflict since they operate on
distinct data items x and y
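The shorthand notation and the conflict test are easy to mechanize. The sketch below assumes the usual definition of a conflict (different transactions, same item, at least one write); the helper names parse and conflict are illustrative, and item names are upper-cased so that x and X are treated as the same item.

```python
import re


def parse(schedule):
    """Parse shorthand like 'r1(X); w2(X); c1' into (kind, txn, item) tuples."""
    ops = []
    for kind, txn, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule):
        ops.append((kind, int(txn), item.upper() or None))
    return ops


def conflict(op1, op2):
    """True if the two operations satisfy all three conflict conditions."""
    kind1, txn1, item1 = op1
    kind2, txn2, item2 = op2
    return (
        txn1 != txn2                  # different transactions
        and item1 is not None
        and item1 == item2            # same data item
        and "w" in (kind1, kind2)     # at least one write
    )


r1x, r2x, w1x, w2x, w1y = parse("r1(x); r2(x); w1(x); w2(x); w1(y)")
print(conflict(r1x, r2x))  # False: both are reads
print(conflict(r1x, w1x))  # False: same transaction
print(conflict(w2x, w1y))  # False: different items
print(conflict(r1x, w2x))  # True
```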
Complete schedules
A schedule S of n transactions T1, T2, ..., Tn is
said to be a complete schedule if the following
conditions hold:
1. The operations in S are exactly those operations
in T1, T2, ..., Tn, including a commit or abort
operation as the last operation for each
transaction in the schedule
2. For any pair of operations from the same
transaction Ti, their order of appearance in S is
the same as their order of appearance in Ti
3. For any two conflicting operations, one of the two
must occur before the other in the schedule
(theoretically, it is not necessary to determine an
order between pairs of nonconflicting operations)
Condition (3) above allows two nonconflicting
operations to occur in the schedule without defining
which occurs first, which leads to the definition of a partial
order of the operations in the n transactions
Characterizing Schedules based on
Recoverability
to rollback T
The schedules that theoretically meet this criterion are called
recoverable and those that do not are non recoverable
A schedule S is recoverable if no transaction T in S commits until
all transactions T’ that have written an item that T reads have
committed
A transaction T2 reads from transaction T1 in a schedule S
if some item X is first written by T1 and later read by T2.
In addition, T1 should not have been aborted before T2 reads X.
For the above schedule to be recoverable, the c2
operation in Sc must be postponed until after T1
commits as shown in Sd
Sd: r1(x); w1(x); r2(x); r1(y); w2(x); w1(y); c1; c2 (recoverable)
If T1 aborts instead of committing, then T2 should also abort,
as shown in Se, because the X it read is no longer valid
Se: r1(x); w1(x); r2(x); r1(y); w2(x); w1(y); a1; a2 (recoverable)
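The recoverability condition can be checked mechanically. The sketch below is a simplified checker: it assumes T reads from T' when T' is the most recent writer of the item and has not aborted, and it does not restore earlier writers after an abort. The schedule named early_commit is a made-up example in which c2 comes before c1; it is not taken from the slides.

```python
import re


def parse(schedule):
    """Parse shorthand like 'r1(X); w2(X); c1' into (kind, txn, item) tuples."""
    ops = []
    for kind, txn, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule):
        ops.append((kind, int(txn), item.upper() or None))
    return ops


def is_recoverable(schedule):
    """No transaction commits before every transaction it read from has committed."""
    last_writer = {}     # item -> transaction that wrote it most recently
    reads_from = {}      # reader -> set of writers it read from
    committed, aborted = set(), set()
    for kind, txn, item in parse(schedule):
        if kind == "w":
            last_writer[item] = txn
        elif kind == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in aborted:
                reads_from.setdefault(txn, set()).add(writer)
        elif kind == "c":
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
        elif kind == "a":
            aborted.add(txn)
    return True


sd = "r1(x); w1(x); r2(x); r1(y); w2(x); w1(y); c1; c2"
se = "r1(x); w1(x); r2(x); r1(y); w2(x); w1(y); a1; a2"
early_commit = "r1(x); w1(x); r2(x); r1(y); w2(x); c2; w1(y); c1"
print(is_recoverable(sd))            # True
print(is_recoverable(se))            # True
print(is_recoverable(early_commit))  # False
```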
Cascadeless schedule:
One where every transaction reads only items that were written by
committed transactions. E.g.
Sf: r1(X); w1(X); r1(Y); w1(Y); c1; r2(X); w2(X); c2;
Strict Schedules:
A schedule in which a transaction can neither read nor write an item X
until the last transaction that wrote X has committed or aborted.
E.g. Sg: w1(X,5); c1; w2(X,8);
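Both definitions reduce to one question: had the last writer of X already finished when another transaction touched X? The sketch below assumes that reading; the helper names are illustrative, and the parser accepts an optional written value such as w1(X,5) but ignores it.

```python
import re


def parse(schedule):
    """Parse shorthand such as 'w1(X,5); c1' into (kind, txn, item) tuples."""
    pattern = r"([rwca])(\d+)(?:\((\w+)(?:,\s*\w+)?\))?"
    return [
        (kind, int(txn), item.upper() or None)
        for kind, txn, item in re.findall(pattern, schedule)
    ]


def _last_writer_finished(schedule, checked_kinds):
    """True if every checked operation on X follows the end of X's last writer."""
    last_writer = {}
    finished = set()    # transactions that have committed or aborted
    for kind, txn, item in parse(schedule):
        if kind in ("c", "a"):
            finished.add(txn)
            continue
        writer = last_writer.get(item)
        if kind in checked_kinds and writer not in (None, txn) and writer not in finished:
            return False
        if kind == "w":
            last_writer[item] = txn
    return True


def is_cascadeless(schedule):
    return _last_writer_finished(schedule, {"r"})


def is_strict(schedule):
    return _last_writer_finished(schedule, {"r", "w"})


sd = "r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2"
blind = "w1(X); w2(X); c1; c2"
sg = "w1(X,5); c1; w2(X,8)"
print(is_cascadeless(sd), is_strict(sd))        # False False
print(is_cascadeless(blind), is_strict(blind))  # True False
print(is_cascadeless(sg), is_strict(sg))        # True True
```

The schedule named blind shows why strictness is the stronger condition: it contains no reads at all, so it is cascadeless, yet T2 overwrites X before T1 has committed.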
Characterizing Schedules based on Serializability
Serial schedule:
A schedule S is serial if, for every transaction T
participating in the schedule, all the operations of
T are executed consecutively in the schedule
Otherwise, the schedule is called non serial
schedule.
Serializable schedule:
A schedule S is serializable if it is equivalent to
some serial schedule of the same n transactions
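Whether a schedule is serial can be decided in a single pass: once the schedule has moved on from a transaction, that transaction must never appear again. The sketch below assumes the r/w/c/a shorthand used in these slides; the helper names are illustrative.

```python
import re


def parse(schedule):
    """Parse shorthand like 'r1(X); w2(X); c1' into (kind, txn, item) tuples."""
    ops = []
    for kind, txn, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule):
        ops.append((kind, int(txn), item.upper() or None))
    return ops


def is_serial(schedule):
    """True if each transaction's operations appear consecutively."""
    finished = set()   # transactions the schedule has already moved past
    current = None
    for _kind, txn, _item in parse(schedule):
        if txn != current:
            if txn in finished:
                return False   # txn resumes after another transaction ran
            if current is not None:
                finished.add(current)
            current = txn
    return True


print(is_serial("r1(x); w1(x); r1(y); w1(y); r2(x); w2(x)"))  # True
print(is_serial("r1(x); w1(x); r2(x); w2(x); r1(y); w1(y)"))  # False
```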
The concept of serializability of schedules is used to identify which
schedules are correct when the operations of concurrently executing
transactions are interleaved in the schedule
Different serial schedules may give different results. In the banking
example, suppose there are two transactions, where one calculates the
interest on the account and the other deposits some money into the
account. The final balance depends on which transaction runs first.
Serializable schedule:
A schedule whose effect on any consistent database
instance is identical to that of some complete serial
schedule over the set of committed transactions in S.
Saying that a nonserial schedule S is serializable is equivalent to saying
that it is correct, because it gives the same result as one of the serial
schedules.
Characterizing Schedules based on
Serializability (cont….)
Result equivalent:
Two schedules are called result equivalent if they
Conflict equivalent:
Two schedules are said to be conflict equivalent if
Conflict serializable:
A schedule S is said to be conflict serializable if it is
Two schedules S and S’ are said to be view equivalent if the
following three conditions hold:
1. The same set of transactions participates in S and
S’, and S and S’ include the same operations of
those transactions.
2. If Ti reads a value of A written by Tj in S, it must also
read the value of A written by Tj in S’.
3. For each data object A, the transaction that performs
the final write on A in S must also perform the final
write on A in S’.
[Figure: two view equivalent schedules S’ and S over T1: R(A), W(A); T2: W(A); T3: W(A)]
Relationship between view and conflict equivalence:
Consider the following schedule of three transactions
T1: r1(X), w1(X); T2: w2(X); and T3: w3(X):
Schedule Sa: r1(X); w2(X); w1(X); w3(X); c1; c2; c3;
In Sa, the operations w2(X) and w3(X) are blind writes, since T2
and T3 do not read the value of X.
Sa is view serializable, since it is view equivalent to the
serial schedule T1, T2, T3.
However, Sa is not conflict serializable, since it is not conflict
equivalent to any serial schedule.
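The claim that Sa is view equivalent to the serial schedule T1, T2, T3 can be checked directly. The sketch below is a simplified test: it compares the read and write operations, the writer each read takes its value from (None standing for the initial value), and the final writer of each item. Aborted transactions are not handled, and the helper names are illustrative.

```python
import re


def parse(schedule):
    """Parse shorthand like 'r1(X); w2(X); c1' into (kind, txn, item) tuples."""
    ops = []
    for kind, txn, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule):
        ops.append((kind, int(txn), item.upper() or None))
    return ops


def view_profile(schedule):
    """Return (reads-from triples, final writer of each item)."""
    last_writer = {}
    reads = []
    for kind, txn, item in parse(schedule):
        if kind == "r":
            reads.append((txn, item, last_writer.get(item)))  # None = initial value
        elif kind == "w":
            last_writer[item] = txn
    return sorted(reads, key=repr), last_writer


def view_equivalent(s1, s2):
    """Same data operations, same reads-from relation, same final writes."""
    def data_ops(s):
        return sorted(op for op in parse(s) if op[0] in "rw")
    return data_ops(s1) == data_ops(s2) and view_profile(s1) == view_profile(s2)


sa = "r1(X); w2(X); w1(X); w3(X); c1; c2; c3"
serial_123 = "r1(X); w1(X); c1; w2(X); c2; w3(X); c3"
serial_213 = "w2(X); c2; r1(X); w1(X); c1; w3(X); c3"
print(view_equivalent(sa, serial_123))  # True
print(view_equivalent(sa, serial_213))  # False
```

In Sa and in the order T1, T2, T3, the read r1(X) sees the initial value and T3 writes last, so the two schedules agree; in the order T2, T1, T3, r1(X) would read T2's value instead.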
Testing for conflict serializability: Algorithm
– Looks at only read_Item (X) & write_Item (X) operations
– Constructs a precedence graph (serialization graph) - a graph
with directed edges
– An edge is created from Ti to Tj if one of the operations in Ti
appears before a conflicting operation in Tj
– The schedule is serializable if and only if the precedence
graph has no cycles.
Determining conflict serializability
To determine serializability, first identify the pair of
conflicting operations and check if their order is preserved in
one of the possible serial schedules
Schedule A:
r1(x); w1(x); r1(y); w1(y); r2(x); w2(x) (serial schedule)
Schedule B:
r2(x); w2(x); r1(x); w1(x); r1(y); w1(y) (serial schedule)
Schedule C:
r1(x); r2(x); w1(x); w2(x); w1(y) (not serializable)
Schedule D:
r1(x); w1(x); r2(x); w2(x); r1(y); w1(y) (serializable, equivalent to
schedule A)
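The precedence graph test can be run on schedules A to D. The sketch below builds an edge Ti -> Tj whenever an operation of Ti precedes a conflicting operation of Tj, then looks for a cycle with a depth-first search. The function names are illustrative, and the quadratic pairwise scan is chosen for clarity rather than speed.

```python
import re


def parse(schedule):
    """Parse shorthand like 'r1(X); w2(X); c1' into (kind, txn, item) tuples."""
    ops = []
    for kind, txn, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule):
        ops.append((kind, int(txn), item.upper() or None))
    return ops


def precedence_graph(schedule):
    """Set of edges (Ti, Tj): an operation of Ti precedes a conflicting one of Tj."""
    ops = parse(schedule)
    edges = set()
    for i, (kind1, txn1, item1) in enumerate(ops):
        for kind2, txn2, item2 in ops[i + 1:]:
            if (txn1 != txn2 and item1 is not None
                    and item1 == item2 and "w" in (kind1, kind2)):
                edges.add((txn1, txn2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edge pairs."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
    state = {}   # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(visit(nxt) for nxt in graph.get(node, ())):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in list(graph))


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))


schedule_a = "r1(x); w1(x); r1(y); w1(y); r2(x); w2(x)"
schedule_c = "r1(x); r2(x); w1(x); w2(x); w1(y)"
schedule_d = "r1(x); w1(x); r2(x); w2(x); r1(y); w1(y)"
print(precedence_graph(schedule_c))           # edges (1, 2) and (2, 1)
print(is_conflict_serializable(schedule_a))   # True
print(is_conflict_serializable(schedule_c))   # False
print(is_conflict_serializable(schedule_d))   # True
```

Schedule D yields the single edge T1 -> T2, the same as schedule A, which is why it is equivalent to that serial order.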
Serializability (cont…)
Testing for conflict serializability with precedence graphs:
Algorithm
For each transaction Ti participating in Schedule S, create a node
has no cycles.
Testing serializability with Precedence Graphs
[Figure: precedence graphs for schedule A (serial), schedule B (serial), schedule C (not serializable) and schedule D (serializable)]
Transaction Support in SQL
A single SQL statement is always considered to be atomic.
Either the statement completes execution without error or it fails and
leaves the database unchanged.
Every transaction has three characteristics: access mode, diagnostic area
size, and isolation level
i. Access mode:
READ ONLY or READ WRITE
If the access mode is READ ONLY, INSERT, DELETE,
UPDATE and CREATE commands cannot be executed on the
database
The default is READ WRITE unless the isolation level of
READ UNCOMMITTED is specified, in which case READ
ONLY is assumed.
ii. Diagnostic size n, specifies an integer value n, indicating the number
of error conditions that can be held simultaneously in the diagnostic
area.
iii. Isolation level can be
READ UNCOMMITTED,
READ COMMITTED,
REPEATABLE READ or
SERIALIZABLE. The default is SERIALIZABLE.
With SQL, there is no explicit Begin Transaction
statement.
Transaction initiation is done implicitly when
particular SQL statements are encountered.
Every transaction must have an explicit end
statement, which is either a COMMIT or
ROLLBACK.
Sample SQL transaction:
EXEC SQL WHENEVER SQLERROR GOTO UNDO;
EXEC SQL SET TRANSACTION
READ WRITE
DIAGNOSTICS SIZE 5
ISOLATION LEVEL SERIALIZABLE;
EXEC SQL INSERT
INTO EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY)
VALUES ('Robert','Smith','991004321',2,35000);
EXEC SQL UPDATE EMPLOYEE
SET SALARY = SALARY * 1.1
WHERE DNO = 2;
EXEC SQL COMMIT;
GOTO THE_END;
UNDO: EXEC SQL ROLLBACK;
THE_END: ...
iii. Overwriting Uncommitted Data: WW Conflicts
• A transaction T2 could overwrite the value of an object A,
which has already been modified by a transaction T1,
while T1 is still in progress.
T1: W(A), W(B), C
T2: W(A), W(B), C
iv. Phantoms:
New rows appear when the same conditional read is repeated.
A transaction T1 may read a set of rows from a table,
perhaps based on some condition specified in the SQL
WHERE clause.
Now suppose that a transaction T2 inserts a new row that
also satisfies the WHERE clause condition of T1, into the
table used by T1.
If T1 is repeated, then T1 will see a row that previously did
not exist, called a phantom.
Transaction Support in SQL
Possible violations of serializability:
Type of Violation
Summary