ADB Slides 6
[Figure: interleaved processing vs. parallel processing of transactions A and B on CPU1 and CPU2, plotted over time (t1, t2)]
What is a Transaction?
● A Transaction:
○ Logical unit of database processing that includes one or more
access operations (read: retrieval; write: insert, update, or delete).
● A transaction (set of operations) may be stand-alone,
specified in a high-level language like SQL and submitted
interactively, or may be embedded within a program.
● Transaction boundaries:
○ Begin and End transaction.
● An application program may contain several transactions
separated by the Begin and End transaction boundaries.
Why Do We Need Transactions?
● It’s all about fast query response time and
correctness
● A DBMS is a multi-user system
○ Many different requests
○ Some against same data items
● Figure out how to interleave requests to shorten
response time while guaranteeing correct result
○ How does DBMS know which actions belong together?
● Solution: Group database operations that must be
performed together into transactions
○ Either execute all operations or none
Simple Model of a Database
T1
t0 read_item(X);
read_item(Y);
X:=X-40;
Y:=Y+40;
write_item(X);
write_item(Y);
tk
Another Sample Transaction
● Reserving a seat for a flight
● With concurrent access to data in a DBMS, two users may
try to book the same seat simultaneously
Timeline (time increases downward):
1. Agent 1 finds seat 35G empty
2. Agent 2 finds seat 35G empty
3. Agent 1 sets seat 35G occupied
4. Agent 2 sets seat 35G occupied
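The timeline above can be replayed as a minimal Python sketch. The in-memory `seats` map and the helper names `find_empty`, `set_occupied`, and `interleaved_booking` are assumptions made for illustration, not part of any DBMS API.

```python
# Hypothetical in-memory seat map (illustrative only, not a real DBMS).
seats = {"35G": "empty"}

def find_empty(seat):
    """Step 1 of a booking: check whether the seat is still free."""
    return seats[seat] == "empty"

def set_occupied(seat, agent, bookings):
    """Step 2 of a booking: mark the seat occupied and record who sold it."""
    seats[seat] = "occupied"
    bookings.append(agent)

def interleaved_booking():
    """Replay the four steps of the timeline in order."""
    seats["35G"] = "empty"
    bookings = []
    a1_sees_empty = find_empty("35G")   # Agent 1 finds seat 35G empty
    a2_sees_empty = find_empty("35G")   # Agent 2 finds seat 35G empty
    if a1_sees_empty:
        set_occupied("35G", "Agent 1", bookings)   # Agent 1 sets it occupied
    if a2_sees_empty:
        set_occupied("35G", "Agent 2", bookings)   # Agent 2 does the same
    return bookings

print(interleaved_booking())   # both agents believe they sold seat 35G
```

Because each agent's check and update are not grouped into one unit, both bookings go through for the same seat.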
SQL Transaction Example
● Register credit sale of 100 units of product X to customer Y for $500
UPDATE PRODUCT
    SET PROD_QOH = PROD_QOH - 100
    WHERE PROD_CODE = 'X';
UPDATE ACCT_RECEIVABLE
    SET ACCT_BALANCE = ACCT_BALANCE + 500
    WHERE ACCT_NUM = 'Y';
COMMIT;
● Consistent state only if both UPDATE statements are fully completed
● DBMS doesn’t guarantee transaction represents real-world event
● Transaction begins when the first SQL statement is encountered, and ends
at COMMIT or at the end of the program
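A minimal sketch of the same credit sale, run through Python's built-in `sqlite3` module. The table definitions, the starting stock of 250 units, the starting balance of 0, and the function names `register_credit_sale` and `demo` are assumptions made for illustration only.

```python
import sqlite3

def register_credit_sale(conn, prod_code, units, acct_num, amount):
    """Run both UPDATEs as one transaction: commit together or roll back together."""
    try:
        conn.execute(
            "UPDATE PRODUCT SET PROD_QOH = PROD_QOH - ? WHERE PROD_CODE = ?",
            (units, prod_code))
        conn.execute(
            "UPDATE ACCT_RECEIVABLE SET ACCT_BALANCE = ACCT_BALANCE + ? "
            "WHERE ACCT_NUM = ?",
            (amount, acct_num))
        conn.commit()        # both changes become permanent together
    except sqlite3.Error:
        conn.rollback()      # neither change survives
        raise

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE PRODUCT (PROD_CODE TEXT PRIMARY KEY, PROD_QOH INTEGER)")
    conn.execute("CREATE TABLE ACCT_RECEIVABLE (ACCT_NUM TEXT PRIMARY KEY, ACCT_BALANCE INTEGER)")
    conn.execute("INSERT INTO PRODUCT VALUES ('X', 250)")         # assumed starting stock
    conn.execute("INSERT INTO ACCT_RECEIVABLE VALUES ('Y', 0)")   # assumed starting balance
    conn.commit()
    register_credit_sale(conn, "X", 100, "Y", 500)
    qoh = conn.execute("SELECT PROD_QOH FROM PRODUCT WHERE PROD_CODE = 'X'").fetchone()[0]
    bal = conn.execute("SELECT ACCT_BALANCE FROM ACCT_RECEIVABLE WHERE ACCT_NUM = 'Y'").fetchone()[0]
    return qoh, bal

print(demo())
```

With the assumed starting values, stock drops by 100 and the receivable rises by 500 in a single commit.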
Two sample transactions
Why Concurrency Control is needed
● The Lost Update Problem
○ This occurs when two transactions that access the same database items have
their operations interleaved in a way that makes the value of some database item
incorrect.
● The Temporary Update (or Dirty Read) Problem
○ This occurs when one transaction updates a database item and then the
transaction fails for some reason.
○ The updated item is accessed by another transaction before it is changed back to
its original value.
● The Incorrect Summary Problem
○ If one transaction is calculating an aggregate summary function on a number of
records while other transactions are updating some of these records, the
aggregate function may calculate some values before they are updated and
others after they are updated.
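The lost update problem can be reproduced with a small Python sketch. It assumes two transactions, T1 (X := X - N) and T2 (X := X + M), and illustrative values X = 80, N = 5, M = 4. The function names are hypothetical.

```python
def lost_update(x0, n, m):
    """Interleave T1 and T2 so that T1's update of X is lost."""
    db = {"X": x0}
    t1_x = db["X"]      # T1: read_item(X)
    t1_x = t1_x - n     # T1: X := X - N
    t2_x = db["X"]      # T2: read_item(X), still sees the old value
    t2_x = t2_x + m     # T2: X := X + M
    db["X"] = t1_x      # T1: write_item(X)
    db["X"] = t2_x      # T2: write_item(X) overwrites T1's update
    return db["X"]

def serial(x0, n, m):
    """Run T1 to completion, then T2."""
    db = {"X": x0}
    db["X"] = db["X"] - n   # T1
    db["X"] = db["X"] + m   # T2
    return db["X"]

print(lost_update(80, 5, 4), serial(80, 5, 4))
```

The interleaved run ends with X = 84, while the serial run gives X = 79: the subtraction by T1 has disappeared.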
(a) The lost update problem.
(b) The temporary update problem.
(c) The incorrect summary problem.
What Can Go Wrong?
● System may crash before data is written back to disk
● Some other transaction is modifying shared data
while our transaction is ongoing (or vice versa)
● System may not be able to obtain one or more of the
data items
● System may not be able to write one or more of the
data items
● The DBMS has a Concurrency Control subsystem to
ensure the database remains in a consistent state despite
concurrent execution of transactions
What causes a Transaction to fail?
1. A computer failure (system crash):
A hardware or software error occurs in the computer system during
transaction execution. If the hardware crashes, the contents of the
computer’s internal memory may be lost.
2. Disk failure:
Some disk blocks may lose their data because of a read or write
malfunction or because of a disk read/write head crash. This may
happen during a read or a write operation of the transaction.
Back P and Next P point to the previous and next log records of the
same transaction.
How is the Log File Used?
● All permanent changes to data are recorded
○ Possible to undo changes to data
● After a crash, search the log backwards until the last
commit point is found
○ Know that beyond this point, effects of transaction are
permanently recorded
● Need to either redo or undo everything that
happened since last commit point
○ Undo: When transaction only partially completed (before
crash)
○ Redo: Transaction completed but we are unsure whether
data was written to disk
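A simplified sketch of undo/redo recovery in Python. The log record layout (`"start"`, `"write"` with old and new values, `"commit"`) and the function name `recover` are assumptions for illustration; real DBMS logs carry more information (for example, the Back P and Next P pointers).

```python
def recover(disk, log):
    """Return the database state after undo/redo recovery.

    disk: dict of item -> value as found on disk after the crash.
    log:  list of records, oldest first (assumed format):
          ("start", T), ("write", T, item, old_value, new_value), ("commit", T)
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    db = dict(disk)

    # Undo: scan the log backwards and restore the old values
    # written by transactions that never reached their commit point.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, item, old, _new = rec
            db[item] = old

    # Redo: scan the log forwards and reapply the new values of committed
    # transactions, since they may not have reached the disk.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, _old, new = rec
            db[item] = new
    return db

log = [
    ("start", "T1"),
    ("write", "T1", "X", 100, 60),
    ("write", "T1", "Y", 50, 90),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "X", 60, 10),   # T2 never commits
]
# Assumed crash state: T2's write reached disk, T1's write to Y did not.
print(recover({"X": 10, "Y": 50}, log))
```

In this run T2's partial work on X is undone and T1's committed write to Y is redone.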
ACID Properties of Transactions
● Atomicity: Transaction is either performed in its
entirety or not performed at all
● Sample schedule:
S: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y); c1; c2
[Figure: the conflicting pairs of operations in S are marked]
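A short Python sketch that lists the conflicting pairs in a schedule written in this notation. It uses the usual definition: two operations conflict if they belong to different transactions, access the same item, and at least one of them is a write. The function names `parse` and `conflicts` are hypothetical.

```python
import re

def parse(schedule):
    """Turn 'r1(X); w2(X); c1' into [('r', 1, 'X'), ('w', 2, 'X')].

    Commit records such as c1 are skipped: they touch no data item.
    """
    ops = []
    for tok in schedule.split(";"):
        m = re.fullmatch(r"\s*([rw])(\d+)\((\w+)\)\s*", tok)
        if m:
            ops.append((m.group(1), int(m.group(2)), m.group(3)))
    return ops

def conflicts(schedule):
    """Return every conflicting pair, in the order it occurs in the schedule."""
    ops = parse(schedule)
    pairs = []
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            # different transactions, same item, at least one write
            if t1 != t2 and x1 == x2 and "w" in (k1, k2):
                pairs.append((f"{k1}{t1}({x1})", f"{k2}{t2}({x2})"))
    return pairs

S = "r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y); c1; c2"
for pair in conflicts(S):
    print(pair)
```

For S this reports three pairs: r1(X) with w2(X), r2(X) with w1(X), and w1(X) with w2(X).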
Why Do We Interleave Transactions?
Schedule S
T1                  T2
read_item(X);
X:=X-N;
write_item(X);
read_item(Y);
Y:=Y+N;
write_item(Y);
                    read_item(X);     (could be a long wait)
                    X:=X+M;
                    write_item(X);
S’ is a non-serial schedule
T2 will be done faster but is the result correct?
Concurrent Executions
S” is a non-serial schedule
Produces same result as serial schedule S
4 Serializability
● Serial schedule:
○ A schedule S is serial if, for every transaction T participating in
the schedule, all the operations of T are executed consecutively
in the schedule.
■ Otherwise, the schedule is called a nonserial schedule.
● Serializable schedule:
○ A schedule S is serializable if it is equivalent to some serial
schedule of the same n transactions.
Serializability Definitions
● Result equivalent:
○ Two schedules are called result equivalent if they produce the
same final state of the database.
● Conflict equivalent:
○ Two schedules are said to be conflict equivalent if the order of
any two conflicting operations is the same in both schedules.
● Conflict serializable:
○ A schedule S is said to be conflict serializable if it is conflict
equivalent to some serial schedule S’.
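The conflict-equivalence definition can be checked mechanically. The sketch below is a minimal Python illustration: schedules are lists of `(kind, transaction, item)` tuples, and the function names and example schedules are assumptions for illustration. It assumes no transaction repeats the same operation on the same item.

```python
def ordered_conflicts(schedule):
    """Set of (earlier_op, later_op) pairs that conflict in this schedule."""
    pairs = set()
    for i, a in enumerate(schedule):
        for b in schedule[i + 1:]:
            # different transactions, same item, at least one write
            if a[1] != b[1] and a[2] == b[2] and "w" in (a[0], b[0]):
                pairs.add((a, b))
    return pairs

def conflict_equivalent(s1, s2):
    """Same operations, and every conflicting pair in the same order."""
    return sorted(s1) == sorted(s2) and ordered_conflicts(s1) == ordered_conflicts(s2)

# T1 reads and writes X, then Y; T2 reads and writes X.
serial_t1_t2 = [("r", 1, "X"), ("w", 1, "X"), ("r", 1, "Y"), ("w", 1, "Y"),
                ("r", 2, "X"), ("w", 2, "X")]
serial_t2_t1 = [("r", 2, "X"), ("w", 2, "X"),
                ("r", 1, "X"), ("w", 1, "X"), ("r", 1, "Y"), ("w", 1, "Y")]
interleaved_ok = [("r", 1, "X"), ("w", 1, "X"), ("r", 2, "X"), ("w", 2, "X"),
                  ("r", 1, "Y"), ("w", 1, "Y")]
interleaved_bad = [("r", 1, "X"), ("r", 2, "X"), ("w", 1, "X"), ("r", 1, "Y"),
                   ("w", 2, "X"), ("w", 1, "Y")]

print(conflict_equivalent(interleaved_ok, serial_t1_t2))
print(conflict_equivalent(interleaved_bad, serial_t1_t2),
      conflict_equivalent(interleaved_bad, serial_t2_t1))
```

The first interleaving is conflict equivalent to the serial order T1, T2, so it is conflict serializable. The second matches neither serial order, so it is not.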
Result Equivalent Schedules
S1                  S2
read_item(X);       read_item(X);
X:=X+10;            X:=X*1.1;
write_item(X);      write_item(X);
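A quick arithmetic check in Python shows why result equivalence is a weak notion: S1 and S2 reach the same final state only for a particular initial value of X. The function names are hypothetical, and `Fraction` is used so that the factor 1.1 is exact.

```python
from fractions import Fraction

def run_s1(x):
    """S1: X := X + 10"""
    return x + 10

def run_s2(x):
    """S2: X := X * 1.1 (kept exact as 11/10)"""
    return x * Fraction(11, 10)

def result_equivalent_for(x):
    """True if S1 and S2 leave the same final value of X when starting from x."""
    return run_s1(x) == run_s2(x)

print(result_equivalent_for(100))   # 110 in both schedules
print(result_equivalent_for(200))   # 210 vs. 220
```

Starting from X = 100 both schedules give 110, but for any other starting value the results differ, so the two schedules are not equivalent in general.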
[Figure: schedules A and B compared operation by operation on items A and B; annotations: "same order as in S1", "different order as in S1"]
B is conflict equivalent to A
⇒ B is serializable
Serial Vs Serializable
write_item(Y);
write_item(Z);
read_item(Z);
read_item(Y);
write_item(Y);
read_item(Y);
write_item(Y);
read_item(X);
write_item(X);
Precedence Graph for S
T1 → T2 (X, Y)
T3 → T1 (Y)
T3 → T2 (Y, Z)
No cycles ⇒ S is serializable
Equivalent Serial Schedule:
T3 → T1 → T2
(precedence order)
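The serializability test on a precedence graph amounts to a topological sort: if every transaction can be placed in order, the graph has no cycle and that order is an equivalent serial schedule. Below is a minimal Python sketch using Kahn's algorithm. The function name `serial_order` and the edge-list representation are assumptions; the edges are those of the graph for S (T1 → T2, T3 → T1, T3 → T2).

```python
def serial_order(edges):
    """Return an equivalent serial order, or None if the graph has a cycle.

    edges: list of (Ti, Tj) pairs meaning Ti must precede Tj.
    """
    nodes = sorted({n for e in edges for n in e})
    indeg = {n: 0 for n in nodes}
    for _, dst in edges:
        indeg[dst] += 1

    order = []
    ready = [n for n in nodes if indeg[n] == 0]   # no incoming edges
    while ready:
        n = ready.pop(0)
        order.append(n)
        for src, dst in edges:
            if src == n:
                indeg[dst] -= 1
                if indeg[dst] == 0:
                    ready.append(dst)

    # If a cycle exists, its nodes never reach in-degree 0.
    return order if len(order) == len(nodes) else None

edges_S = [("T1", "T2"), ("T3", "T1"), ("T3", "T2")]
print(serial_order(edges_S))
print(serial_order([("T1", "T2"), ("T2", "T1")]))
```

For the three edges of S the sketch returns T3, T1, T2. A two-node cycle returns `None`, meaning the schedule is not conflict serializable.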
Examples of serial and nonserial schedules
Constructing the Precedence Graphs
● FIGURE 17.7 Constructing the precedence graphs for schedules A to D from
Figure 17.5 to test for conflict serializability.
○ (a) Precedence graph for serial schedule A.
○ (b) Precedence graph for serial schedule B.
○ (c) Precedence graph for schedule C (not serializable).
○ (d) Precedence graph for schedule D (serializable, equivalent to schedule A).
Serializability Testing Example
Serializability Testing Example – Schedule E
Serializability Testing Example – Schedule F
5 Transaction Support in SQL
● A single SQL statement is always considered to be
atomic.
○ Either the statement completes execution without error or it fails
and leaves the database unchanged.
● With SQL, there is no explicit Begin Transaction
statement.
○ Transaction initiation is done implicitly when particular SQL
statements are encountered.
● Every transaction must have an explicit end statement,
which is either a COMMIT or ROLLBACK.
Sample SQL Transaction
EXEC SQL WHENEVER SQLERROR GOTO UNDO;
EXEC SQL SET TRANSACTION
    READ WRITE
    DIAGNOSTICS SIZE 5
    ISOLATION LEVEL SERIALIZABLE;
EXEC SQL INSERT
    INTO EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY)
    VALUES ('Robert','Smith','991004321',2,35000);
EXEC SQL UPDATE EMPLOYEE
    SET SALARY = SALARY * 1.1
    WHERE DNO = 2;
EXEC SQL COMMIT;
GOTO THE_END;
UNDO: EXEC SQL ROLLBACK;
THE_END: ...
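The same control flow (commit on success, jump to ROLLBACK on any SQL error) can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions: the EMPLOYEE table definition, the primary key on SSN, and the function names are invented here, and the SET TRANSACTION statement is left out because SQLite does not accept it.

```python
import sqlite3

def hire_and_raise(conn):
    """INSERT + UPDATE as one transaction; any SQL error triggers ROLLBACK."""
    try:
        conn.execute(
            "INSERT INTO EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY) "
            "VALUES (?, ?, ?, ?, ?)",
            ("Robert", "Smith", "991004321", 2, 35000))
        conn.execute("UPDATE EMPLOYEE SET SALARY = SALARY * 1.1 WHERE DNO = 2")
        conn.commit()            # EXEC SQL COMMIT
        return True
    except sqlite3.Error:        # plays the role of WHENEVER SQLERROR GOTO UNDO
        conn.rollback()          # UNDO: EXEC SQL ROLLBACK
        return False

def demo():
    conn = sqlite3.connect(":memory:")
    # Assumed schema: SSN as primary key, so a repeated insert fails.
    conn.execute("CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, "
                 "SSN TEXT PRIMARY KEY, DNO INTEGER, SALARY REAL)")
    conn.commit()
    first = hire_and_raise(conn)    # succeeds and commits
    second = hire_and_raise(conn)   # duplicate SSN: error, rolled back
    rows = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
    return first, second, rows

print(demo())
```

The second call fails on the duplicate SSN, so its work is rolled back and the table still holds a single row.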
Summary