BCSE302L-Database Systems Module - 5
Database Systems
Dr.M.Revathi
Assistant Professor (Sr) / SCOPE
VIT Chennai
[email protected]
Transaction Processing
• Transactions: collections of operations that form a single logical unit
of work
• A database system must ensure proper execution of transactions
despite failures—either the entire transaction executes, or none of it
does.
• Transaction processing systems: systems with large databases and
hundreds of concurrent users executing database transactions
Transaction Concept
• A transaction is a unit of program execution that accesses and updates
various data items.
• Initiated by a user program written in a high-level data-manipulation language (typically
SQL), or programming language (for example, C++, or Java), with embedded database
accesses in JDBC or ODBC.
• Delimited by statements (or function calls) of the form begin transaction
and end transaction
• Consists of all operations executed between the begin transaction and end
transaction
• This collection of steps must appear to the user as a single, indivisible unit.
• Since a transaction is indivisible, it either executes in its entirety or not at all.
• A single application program may contain more than one transaction if it
contains several transaction boundaries.
• Read-only transaction
• Do not update the database but only retrieve data
• Read-write transaction
• Updates the database
• A database is basically represented as a collection of named data items.
• The size of a data item is called its granularity.
• A data item can be a database record, a whole disk block, or even an individual field
(attribute) value of some record in the database.
ACID Properties
Atomicity
• Either all operations of the transaction are reflected properly in the database, or none are
Consistency
• Execution of a transaction in isolation (that is, with no other transaction executing
concurrently) preserves the consistency of the database
Isolation
• Each transaction is unaware of other transactions executing concurrently in the system.
Durability
• After a transaction completes successfully, the changes it has made to the database
persist, even if there are system failures.
A Simple Transaction Model
Transactions access data using two operations:
read(X)
• Transfers the data item X from the database to a variable, also called X, in a buffer
in main memory belonging to the transaction that executed the read operation.
write(X)
• Transfers the value in the variable X in the main-memory buffer of the transaction
that executed the write to the data item X in the database.
Example:
Consider a simple bank application consisting of several accounts and a set of
transactions that access and update those accounts. Suppose Rs. 50 is to be
transferred from account A to account B.
Let Ti be a transaction that transfers Rs. 50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
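The read/write model and the six steps of Ti can be sketched in Python. This is only an illustration under stated assumptions, not how a real DBMS is built: the dictionary `database`, the class `Transaction`, the function `transfer`, and the starting balances are all hypothetical.

```python
# Hypothetical in-memory "database" of named data items (balances are assumed).
database = {"A": 1000, "B": 2000}


class Transaction:
    """Toy transaction with its own main-memory buffer."""

    def __init__(self, db):
        self.db = db
        self.buffer = {}  # local variables, one per data item read

    def read(self, item):
        # read(X): copy data item X from the database into the buffer
        self.buffer[item] = self.db[item]

    def write(self, item):
        # write(X): copy the buffered value of X back to the database
        self.db[item] = self.buffer[item]


def transfer(db, amount=50):
    t = Transaction(db)
    t.read("A")              # 1. read(A)
    t.buffer["A"] -= amount  # 2. A := A - 50
    t.write("A")             # 3. write(A)
    t.read("B")              # 4. read(B)
    t.buffer["B"] += amount  # 5. B := B + 50
    t.write("B")             # 6. write(B)


total_before = database["A"] + database["B"]
transfer(database)
total_after = database["A"] + database["B"]
print(database, total_before == total_after)
```

Note that the arithmetic in steps 2 and 5 touches only the buffer; the database changes only at the two write steps.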
Consistency:
• Consistency requirement:
• The sum of A and B is unchanged by the execution of the transaction
• Consistency requirements include
• Explicitly specified integrity constraints such as primary keys and foreign keys
• Implicit integrity constraints
• e.g. sum of balances of all accounts, minus sum of loan amounts must equal value of cash-in-hand
Atomicity:
• Atomicity requirement:
• If the transaction fails after step 3 and before step 6,
money will be “lost” leading to an inconsistent database
state.
• Failure could be due to software or hardware.
• The system should ensure that updates of a partially
executed transaction are not reflected in the database.
• Ensuring atomicity is the responsibility of the database
system; it is handled by the recovery system.
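A failure between the debit and the credit can be simulated with Python's standard `sqlite3` module. This is a sketch under assumptions: the table name `account`, the balances, and the `fail_midway` flag are invented for the example, and the failure is just a raised exception.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 2000)])
conn.commit()


def transfer(conn, amount, fail_midway=False):
    """Move `amount` from A to B; roll everything back if a failure occurs."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,)
        )
        if fail_midway:
            # Simulated failure after the debit and before the credit.
            raise RuntimeError("simulated failure")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,)
        )
        conn.commit()
        return True
    except RuntimeError:
        conn.rollback()  # the partial update to A is not reflected
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


failed_ok = transfer(conn, 50, fail_midway=True)
after_failure = balances(conn)
success_ok = transfer(conn, 50)
after_success = balances(conn)
print(after_failure, after_success)
```

After the simulated failure the balances are unchanged, so no money is "lost"; only the second, complete transfer is visible.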
Durability:
• Durability requirement:
• Once the user has been notified that the transaction has been
completed (i.e., the transfer of the Rs. 50 has taken place), the
updates to the database by the transaction must persist even if
there are software or hardware failures.
• The recovery system of the database is responsible for ensuring
durability.
Isolation:
• Isolation requirement — if between steps 3 and 6, another transaction T2 is allowed to access
the partially updated database, it will see an inconsistent database (the sum A + B will be less
than it should be).
   T1                      T2
   1. read(A)
   2. A := A – 50
   3. write(A)
                           read(A), read(B), print(A+B)
   4. read(B)
   5. B := B + 50
   6. write(B)
• Isolation can be ensured trivially by running transactions serially, that is, one after the other.
• Isolation is ensured by the concurrency-control system.
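The interleaving above can be replayed step by step in Python to see the inconsistent sum. The starting balances and variable names are assumptions made for the illustration.

```python
db = {"A": 1000, "B": 2000}  # assumed starting balances
expected_sum = db["A"] + db["B"]

# T1, steps 1-3: debit A and write it back
a = db["A"]
a = a - 50
db["A"] = a

# T2 runs here, between steps 3 and 4 of T1
observed_sum = db["A"] + db["B"]  # read(A), read(B), print(A+B)

# T1, steps 4-6: credit B and write it back
b = db["B"]
b = b + 50
db["B"] = b

final_sum = db["A"] + db["B"]
print(expected_sum, observed_sum, final_sum)
```

T2 observes a sum that is 50 lower than the true total, even though the database is consistent again once T1 finishes.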
Transaction States
• A transaction may not always complete its execution successfully; such a
transaction is termed aborted.
• Any changes that the aborted transaction made to the database must be
undone.
• Once the changes caused by an aborted transaction have been undone,
the transaction has been rolled back
• A transaction that completes its execution successfully is said to be
committed.
• A committed transaction that has performed updates transforms the database into a
new consistent state, which must persist even if there is a system failure.
• Once a transaction has committed, we cannot undo its effects by aborting it.
• The only way to undo the effects of a committed transaction is to execute a
compensating transaction.
• Active: the initial state; the transaction stays in this state while it is
executing.
• Partially committed: after the final statement has been executed.
• Failed: after the discovery that normal execution can no longer
proceed.
• Aborted: after the transaction has been rolled back and the database
has been restored to its state prior to the start of the transaction.
• Committed: after successful completion.
• A transaction is said to have terminated if it has either committed or
aborted.
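The states above can be modelled as a small state machine. The transition table below follows the standard textbook state diagram (active to partially committed or failed, partially committed to committed or failed, failed to aborted); treat it as an assumption, since the diagram itself is not reproduced in the text. The names `State`, `ALLOWED`, `can_move`, and `is_terminated` are hypothetical.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"


# Assumed transitions, taken from the standard transaction state diagram.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: set(),
    State.COMMITTED: set(),
}


def can_move(src, dst):
    """Return True if a transaction may go directly from src to dst."""
    return dst in ALLOWED[src]


def is_terminated(state):
    """A transaction has terminated once it has committed or aborted."""
    return state in (State.COMMITTED, State.ABORTED)


print(can_move(State.ACTIVE, State.PARTIALLY_COMMITTED))
print(can_move(State.COMMITTED, State.ABORTED))
```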
At the aborted state, the system has two options:
• It can restart the transaction, but only if the transaction was aborted as a
result of some hardware or software error that was not created through the
internal logic of the transaction. A restarted transaction is considered
to be a new transaction.
• It can kill the transaction. It usually does so because of some internal
logical error that can be corrected only by rewriting the application
program, or because the input was bad, or because the desired data were not
found in the database.
Schedules
• Schedule – A sequence of instructions that specify the chronological
order in which instructions of concurrent transactions are executed
• A schedule for a set of transactions must consist of all instructions of those
transactions
• Must preserve the order in which the instructions appear in each individual
transaction.
• A transaction that successfully completes its execution will have a
commit instruction as the last statement
• By default, a transaction is assumed to execute a commit instruction as its last step
• A transaction that fails to successfully complete its execution will have
an abort instruction as the last statement
• A schedule S is serial if, for every transaction T participating in the
schedule, all the operations of T are executed consecutively in the
schedule;
• Otherwise, the schedule is called nonserial.
• In a serial schedule, only one transaction at a time is active
• the commit (or abort) of the active transaction initiates execution of the next
transaction.
• No interleaving occurs in a serial schedule
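Whether a schedule is serial can be checked mechanically. In this sketch a schedule is assumed to be a list of `(transaction, operation, item)` tuples; that representation, the function name `is_serial`, and the two sample schedules are illustrative choices, not part of the course material.

```python
def is_serial(schedule):
    """True if every transaction's operations are consecutive in the schedule."""
    finished = set()  # transactions whose block of operations has ended
    current = None    # transaction whose operations we are currently inside
    for txn, _op, _item in schedule:
        if txn != current:
            if txn in finished:
                return False  # txn resumes after another one ran: interleaving
            if current is not None:
                finished.add(current)
            current = txn
    return True


serial = [
    ("T1", "read", "A"), ("T1", "write", "A"),
    ("T2", "read", "A"), ("T2", "write", "A"),
]
nonserial = [
    ("T1", "read", "A"), ("T2", "read", "A"),
    ("T1", "write", "A"), ("T2", "write", "A"),
]
print(is_serial(serial), is_serial(nonserial))
```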
• Let T1 transfer 50 from A to B, and T2 transfer 10% of the balance from A to B.
• A serial schedule in which T1 is followed by T2 :
Figure: Schedule 1
• A serial schedule in which T2 is followed by T1 :
Figure: Schedule 2
• The following schedule is not a serial
schedule, but it is equivalent to Schedule 1:
Figure: Schedule 3
• The following concurrent schedule does not
preserve the value of (A + B); it is a concurrent
schedule that results in an inconsistent state.
Figure: Schedule 4
Serializability
• Basic Assumption – Each transaction preserves database
consistency.
• Serial execution of transactions preserves database
consistency.
• A schedule is serializable if it is equivalent to a serial
schedule. Different forms of schedule equivalence:
1. conflict serializability
2. view serializability
• Simplified view of transactions
• Ignore operations other than read and write instructions
• Assume that transactions may perform arbitrary
computations on data in local buffers in between reads
and writes.
Figure: Schedule 3, showing only the read and write instructions
• Conflicting Instructions
• Instructions li and lj of transactions Ti and Tj respectively, conflict if and only if
there exists some item Q accessed by both li and lj, and at least one of these
instructions wrote Q.
1. li = read(Q), lj = read(Q). li and lj do not conflict.
2. li = read(Q), lj = write(Q). They conflict (the order of li and lj matters).
3. li = write(Q), lj = read(Q). They conflict (the order of li and lj matters).
4. li = write(Q), lj = write(Q). They conflict.
• A conflict between li and lj forces a temporal order between them.
• If li and lj are consecutive in a schedule and they do not conflict, their results
would remain the same even if they had been interchanged in the schedule.
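The four cases reduce to a single test: different transactions, same item, at least one write. A minimal sketch, assuming each instruction is a `(transaction, operation, item)` tuple (the representation and the name `conflicts` are hypothetical):

```python
def conflicts(op_i, op_j):
    """True if two instructions of different transactions conflict."""
    txn_i, kind_i, item_i = op_i
    txn_j, kind_j, item_j = op_j
    return (
        txn_i != txn_j                     # instructions of different transactions
        and item_i == item_j               # both access the same item Q
        and "write" in (kind_i, kind_j)    # at least one of them wrote Q
    )


cases = [
    conflicts(("Ti", "read", "Q"), ("Tj", "read", "Q")),    # case 1
    conflicts(("Ti", "read", "Q"), ("Tj", "write", "Q")),   # case 2
    conflicts(("Ti", "write", "Q"), ("Tj", "read", "Q")),   # case 3
    conflicts(("Ti", "write", "Q"), ("Tj", "write", "Q")),  # case 4
]
print(cases)
```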
Conflict Serializability
• If a schedule S can be transformed into a schedule S´ by a series of swaps of
non-conflicting instructions, then S and S´ are conflict equivalent.
• A schedule S is conflict serializable if it is conflict equivalent to a serial schedule.
• Continue to swap nonconflicting instructions
• Schedule 3 can be transformed into Schedule 6, a
serial schedule where T2 follows T1, by a series of
swaps of non-conflicting instructions.
• Schedule 3 is conflict serializable.
• Example of a schedule that is not conflict serializable:
Schedule 7.
• Determining conflict serializability of a schedule
• Consider a schedule S
• Construct a directed graph, called a precedence graph, from S.
• Consists of a pair G = (V, E), where V is a set of vertices and E is a set of
edges.
• The set of vertices consists of all the transactions participating in the
schedule.
• The set of edges consists of all edges Ti → Tj for which one of three
conditions holds:
1. Ti executes write(Q) before Tj executes read(Q).
2. Ti executes read(Q) before Tj executes write(Q).
3. Ti executes write(Q) before Tj executes write(Q).
• If an edge Ti → Tj exists in the precedence graph, then, in any serial schedule
S’ equivalent to S, Ti must appear before Tj.
• If the precedence graph for S has a cycle, then schedule S is not conflict
serializable.
• If the graph contains no cycles, then the schedule S is conflict serializable.
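The test can be sketched end to end: build the edge set from the three conditions, then look for a cycle with a depth-first search. Schedules are assumed to be lists of `(transaction, operation, item)` tuples, and the two sample schedules are illustrative; all function names are hypothetical.

```python
def precedence_graph(schedule):
    """Edges Ti -> Tj for every earlier instruction of Ti that conflicts
    with a later instruction of Tj."""
    edges = set()
    for i, (ti, kind_i, item_i) in enumerate(schedule):
        for tj, kind_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "write" in (kind_i, kind_j):
                edges.add((ti, tj))
    return edges


def has_cycle(nodes, edges):
    """Depth-first search with three colors; a gray-to-gray edge is a cycle."""
    adjacent = {n: [] for n in nodes}
    for u, v in edges:
        adjacent[u].append(v)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}

    def visit(u):
        color[u] = GRAY
        for v in adjacent[u]:
            if color[v] == GRAY or (color[v] == WHITE and visit(v)):
                return True
        color[u] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)


def is_conflict_serializable(schedule):
    nodes = {txn for txn, _, _ in schedule}
    return not has_cycle(nodes, precedence_graph(schedule))


interleaved_ok = [
    ("T1", "read", "A"), ("T1", "write", "A"),
    ("T2", "read", "A"), ("T2", "write", "A"),
    ("T1", "read", "B"), ("T1", "write", "B"),
    ("T2", "read", "B"), ("T2", "write", "B"),
]
interleaved_bad = [
    ("T1", "read", "Q"), ("T2", "write", "Q"), ("T1", "write", "Q"),
]
print(precedence_graph(interleaved_ok), is_conflict_serializable(interleaved_ok))
print(precedence_graph(interleaved_bad), is_conflict_serializable(interleaved_bad))
```

In the first schedule every conflict points from T1 to T2, so the graph is acyclic; in the second, the read-write and write-write conflicts on Q point in opposite directions and form a cycle.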
Figure: example precedence graph over transactions T1, T2, and T3, with edges
labeled by the data items (X, Y, Z) involved in each conflict.
• If the precedence graph is acyclic, the serializability order
can be obtained by a topological sorting of the graph.
• A topological sort gives a linear order consistent with the
partial order of the graph.
Figure: topological sorting
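Python's standard `graphlib` module (Python 3.9 and later) provides a topological sorter that can be used to obtain a serializability order. The edge sets and the helper name `serial_order` below are assumptions made for the illustration.

```python
from graphlib import CycleError, TopologicalSorter


def serial_order(transactions, edges):
    """Return one serial order consistent with the precedence edges,
    or None if the graph has a cycle."""
    sorter = TopologicalSorter({t: set() for t in transactions})
    for before, after in edges:
        sorter.add(after, before)  # 'after' must come later than 'before'
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


acyclic = {("T1", "T2"), ("T2", "T3"), ("T1", "T3")}
cyclic = {("T1", "T2"), ("T2", "T1")}

order = serial_order(["T1", "T2", "T3"], acyclic)
no_order = serial_order(["T1", "T2"], cyclic)
print(order, no_order)
```

When several linear orders are consistent with the graph, any of them is a valid serializability order; the sorter simply returns one.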
Consider the precedence graph in the figure. Is the corresponding schedule conflict
serializable?
Figure: precedence graph over transactions T1, T2, and T3, with edge labels
X,Y; Y,Z; and Y.
Figure: Schedule 8.
View Serializability
• Let S and S´ be two schedules with the same set of transactions.
• S and S´ are view equivalent if the following three conditions are met, for
each data item Q,
1. If in schedule S, transaction Ti reads the initial value of Q, then in
schedule S’ also transaction Ti must read the initial value of Q.
2. If in schedule S transaction Ti executes read(Q), and that value was
produced by transaction Tj (if any), then in schedule S’ also transaction Ti
must read the value of Q that was produced by the same write(Q)
operation of transaction Tj .
3. The transaction (if any) that performs the final write(Q) operation in
schedule S must also perform the final write(Q) operation in schedule S’.
• As with conflict equivalence, view equivalence is based purely on reads and writes.
• A schedule S is view serializable if it is view equivalent to a serial schedule.
• Every conflict serializable schedule is also view serializable.
• Every view serializable schedule that is not conflict serializable has blind
writes.
The schedule in the figure is view equivalent to the serial
schedule <T3, T4, T6>,
since the one read(Q) instruction reads the
initial value of Q in both schedules and T6
performs the final write of Q in both
schedules.
Check whether the following schedule is view serializable.
S : R2(B); R2(A); R1(A); R3(A); W1(B); W2(B); W3(B)

   T1          T2          T3
               R2(B)
               R2(A)
   R1(A)
                           R3(A)
   W1(B)
               W2(B)
                           W3(B)

Sol:
With 3 transactions, the total number of serial schedules possible = 3! = 6.
Since the final update on B is made by T3, T3 must be the last transaction
in any view-equivalent serial schedule.
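The check can also be done by brute force: compute, for each schedule, which write every read sees and which write is final for each item, then compare S against each of the 6 serial orders. This is a sketch under assumptions: schedules are lists of `(transaction, operation, item)` tuples and the function names are hypothetical. Trying all permutations is only practical for a handful of transactions.

```python
from itertools import permutations


def view(schedule):
    """Return (reads_from, final_writes) for a schedule."""
    last_write = {}   # item -> (writer, k): the most recent write of the item
    write_count = {}  # (txn, item) -> number of writes seen so far
    read_count = {}   # (txn, item) -> number of reads seen so far
    reads_from = {}   # (reader, item, k) -> write it reads, None = initial value
    for txn, kind, item in schedule:
        if kind == "read":
            k = read_count.get((txn, item), 0)
            read_count[(txn, item)] = k + 1
            reads_from[(txn, item, k)] = last_write.get(item)
        else:
            k = write_count.get((txn, item), 0)
            write_count[(txn, item)] = k + 1
            last_write[item] = (txn, k)
    return reads_from, last_write  # last_write now holds the final writes


def view_equivalent(s1, s2):
    return view(s1) == view(s2)


def view_equivalent_serial_orders(schedule):
    txns = sorted({txn for txn, _, _ in schedule})
    found = []
    for order in permutations(txns):
        serial = [op for t in order for op in schedule if op[0] == t]
        if view_equivalent(schedule, serial):
            found.append(order)
    return found


S = [
    ("T2", "read", "B"), ("T2", "read", "A"), ("T1", "read", "A"),
    ("T3", "read", "A"), ("T1", "write", "B"), ("T2", "write", "B"),
    ("T3", "write", "B"),
]
orders = view_equivalent_serial_orders(S)
print(orders)
```

Under these assumptions the only serial order found is <T2, T1, T3>: T2 must read the initial value of B, so it precedes both other writers of B, and T3 must perform the final write of B.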
Schedules based on recoverability
Recoverable schedule:
• A schedule is recoverable if, for each pair of transactions Ti and Tj such
that Tj reads a data item previously written by Ti, the commit operation of
Ti appears before the commit operation of Tj.
• The following schedule is not recoverable if T7 commits immediately
after the read, before T6 commits.
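The recoverability condition can be checked with a single pass over a schedule. This sketch assumes schedules are lists of `(transaction, operation, item)` tuples with `item` set to `None` for commits, and it ignores aborts for simplicity; the function name and the sample schedules are hypothetical.

```python
def is_recoverable(schedule):
    """True if every transaction commits only after all transactions
    it has read from have committed."""
    last_writer = {}  # item -> transaction that wrote it most recently
    reads_from = {}   # reader -> set of transactions whose writes it read
    committed = set()
    for txn, kind, item in schedule:
        if kind == "write":
            last_writer[item] = txn
        elif kind == "read":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif kind == "commit":
            if any(w not in committed for w in reads_from.get(txn, ())):
                return False  # txn commits before a transaction it depends on
            committed.add(txn)
    return True


not_recoverable = [
    ("T6", "read", "A"), ("T6", "write", "A"),
    ("T7", "read", "A"), ("T7", "commit", None),
    ("T6", "commit", None),
]
recoverable = [
    ("T6", "read", "A"), ("T6", "write", "A"),
    ("T7", "read", "A"), ("T6", "commit", None),
    ("T7", "commit", None),
]
print(is_recoverable(not_recoverable), is_recoverable(recoverable))
```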
• Cascading rollback
• Consider the following schedule, in which
none of the transactions has yet committed.
If T8 fails:
• T8 must be rolled back.
• Since T9 is dependent on T8, T9 must be rolled
back.
• Since T10 is dependent on T9, T10 must be
rolled back
Recovery Concepts
Why is recovery needed?
1. A computer failure (system crash)
2. A transaction or system error
3. Local errors or exception conditions
4. Concurrency control enforcement
5. Disk failure
6. Physical problems and catastrophes
Recovery Algorithms
• Recovery algorithms ensure database consistency and transaction
atomicity and durability despite failures
Recovery algorithms have two parts:
• Actions taken during normal transaction processing to ensure that
enough information exists to recover from failures
• Actions taken after a failure to recover the database contents to
a state that ensures atomicity, consistency and durability
Log-Based Recovery
• A log is kept on stable storage
• The log is a sequence of log records, and maintains a record of
update activities on the database.
• When transaction Ti starts, it registers itself by writing a
<Ti start> log record.
• Before Ti executes write(X), a log record <Ti, X, V1, V2> is written,
where V1 is the value of X before the write and V2 is the value to be
written to X.
• When Ti finishes its last statement, the log record <Ti commit> is
written.
• Assume that log records are written directly to stable storage
• Two approaches using logs
• Deferred database modification
• Immediate database modification
Deferred database modification
• The deferred database modification scheme records all modifications
in the log, but defers all writes to the database until after commit.
• Assume that transactions execute serially
• Transaction starts by writing <Ti start> record to log.
• A write(X) operation results in a log record <Ti, X, V> being written,
where V is the new value for X
• The write is not performed on X at this time, but is deferred.
• When Ti commits, <Ti commit> is written to the log
• The log records are then read and used to execute the previously deferred
writes.
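The scheme can be sketched with a list as the log and a dictionary as the database. Everything here is an assumption made for illustration: the tuple layout of the log records, the function names, the balances, and the pessimistic choice that none of the deferred writes reached the disk before the crash.

```python
def run_deferred(db, log, txn, writes, crash_before_commit=False):
    """Execute txn under deferred modification; `writes` is a list of
    (item, new_value) pairs."""
    log.append((txn, "start"))                 # <Ti start>
    for item, new_value in writes:
        log.append((txn, item, new_value))     # <Ti, X, V>; database untouched
    if crash_before_commit:
        return                                 # simulated crash: no commit record
    log.append((txn, "commit"))                # <Ti commit>
    for item, new_value in writes:
        db[item] = new_value                   # perform the deferred writes


def recover_deferred(db, log):
    """Redo every transaction that has both start and commit records."""
    started = {rec[0] for rec in log if rec[1:] == ("start",)}
    committed = {rec[0] for rec in log if rec[1:] == ("commit",)}
    for rec in log:
        if len(rec) == 3 and rec[0] in started and rec[0] in committed:
            _txn, item, new_value = rec
            db[item] = new_value


initial = {"A": 1000, "B": 2000, "C": 700}     # assumed starting values
db, log = dict(initial), []
run_deferred(db, log, "T0", [("A", 950), ("B", 2050)])
run_deferred(db, log, "T1", [("C", 600)], crash_before_commit=True)

disk = dict(initial)   # assume no deferred write reached the disk
recover_deferred(disk, log)
print(disk)
```

T0 is redone because its commit record is in the log; T1 is simply ignored, since it never touched the database.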
• During recovery after a crash, a transaction needs to be redone if and
only if both <Ti start> and <Ti commit> are present in the log.
• Redoing a transaction Ti (redo(Ti)) sets the value of all data items
updated by the transaction to the new values.
• Crashes can occur while
• the transaction is executing the original updates, or
• while recovery action is being taken
• Example: Consider transactions T0 and T1 (T0 executes before T1):
   T0: read(A)              T1: read(C)
       A := A - 50              C := C - 100
       write(A)                 write(C)
       read(B)
       B := B + 50
       write(B)
• After a system crash has occurred, the system consults the log to
determine which transactions need to be redone.
• Transaction Ti needs to be redone if and only if the log contains both
the record <Ti start> and the record <Ti commit>.
Immediate Database Modification
• The immediate database modification scheme allows database updates
of an uncommitted transaction to be made as the writes are issued
• Update log records must contain both the old value and the new value
• The update log record must be written before the database item is written
• Recovery procedure has two operations instead of one:
• undo(Ti) restores the value of all data items updated by Ti to their old
values, going backwards from the last log record for Ti
• redo(Ti) sets the value of all data items updated by Ti to the new values,
going forward from the first log record for Ti
• When recovering after failure:
• Transaction Ti needs to be undone if the log contains the record <Ti
start>, but does not contain the record <Ti commit>.
• Transaction Ti needs to be redone if the log contains both the
record <Ti start> and the record <Ti commit>.
• Undo operations are performed first, then redo operations
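The recovery rules above can be sketched as two passes over the log: a backward pass for undo, then a forward pass for redo. The record layout, the function name `recover_immediate`, and the values in the sample log and on disk are assumptions made for the example.

```python
def recover_immediate(db, log):
    """Undo transactions without a commit record, then redo committed ones."""
    started = {rec[1] for rec in log if rec[0] == "start"}
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    to_undo = started - committed
    to_redo = started & committed

    # undo(Ti): restore old values, going backwards from the last log record
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] in to_undo:
            _tag, _txn, item, old_value, _new_value = rec
            db[item] = old_value

    # redo(Ti): set new values, going forward from the first log record
    for rec in log:
        if rec[0] == "update" and rec[1] in to_redo:
            _tag, _txn, item, _old_value, new_value = rec
            db[item] = new_value

    return to_undo, to_redo


log = [
    ("start", "T0"),
    ("update", "T0", "A", 1000, 950),   # <T0, A, 1000, 950>
    ("update", "T0", "B", 2000, 2050),  # <T0, B, 2000, 2050>
    ("commit", "T0"),
    ("start", "T1"),
    ("update", "T1", "C", 700, 600),    # <T1, C, 700, 600>
    # crash happens here: no <T1 commit>
]

# Assumed disk state at the crash: A and C were written, B was not yet.
disk = {"A": 950, "B": 2000, "C": 600}
undone, redone = recover_immediate(disk, log)
print(disk, undone, redone)
```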
Checkpoints
• Problems in the recovery procedure using the log:
1. Searching the entire log is time-consuming.
2. We might unnecessarily redo transactions which have already output their
updates to the database.
• Streamline the recovery procedure by periodically performing
checkpointing
• Write a log record <checkpoint> onto stable storage
Shadow Paging Algorithm
• The old value of a data item before updating is called the before image (BFIM).
• The new value after updating is called the after image (AFIM).
• The AFIM does not overwrite its BFIM; it is recorded at another place on the
disk.
• At any time, a data item therefore has its AFIM and its BFIM (the shadow copy
of the data item) at two different places on the disk.
Figure: the database holds X and Y (shadow copies of data items) alongside
X' and Y' (current copies of data items).
• To manage access of data items by concurrent transactions, two directories
(current and shadow) are used.
• Shadow paging is a NO-UNDO/NO-REDO technique for recovery.
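A rough single-transaction sketch of the idea, with a list standing in for disk slots and two dictionaries standing in for the directories. The class `ShadowPagedDB`, its methods, and the values are hypothetical, and the way commit and abort swap directories is an assumption based on the usual description of shadow paging rather than on the text above.

```python
class ShadowPagedDB:
    def __init__(self, items):
        self.disk = list(items.values())  # physical slots holding values
        # shadow directory: item name -> slot of the last committed value
        self.shadow = {name: slot for slot, name in enumerate(items)}
        # current directory: starts as a copy of the shadow directory
        self.current = dict(self.shadow)

    def read(self, name):
        return self.disk[self.current[name]]

    def write(self, name, value):
        self.disk.append(value)                   # AFIM goes to a fresh slot
        self.current[name] = len(self.disk) - 1   # BFIM slot is left untouched

    def commit(self):
        self.shadow = dict(self.current)  # current directory becomes the shadow

    def abort(self):
        self.current = dict(self.shadow)  # discard changes; nothing to undo


db = ShadowPagedDB({"X": 10, "Y": 20})
db.write("X", 11)
x_during = db.read("X")
db.abort()
x_after_abort = db.read("X")

db.write("Y", 21)
db.commit()
db.abort()                # aborting after commit changes nothing
y_after_commit = db.read("Y")
print(x_during, x_after_abort, y_after_commit)
```

Because the before image is never overwritten, aborting needs no undo, and because committing only installs a directory, no redo is needed either.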