DBMS Transaction 25.10.18
Transaction
• What is a Transaction? A transaction is a set of logically related operations that accesses and possibly updates data items.
• Read(A): The read operation Read(A) or R(A) reads the value of A from the database into a buffer in main memory.
• Write(A): The write operation Write(A) or W(A) writes the value back to the database from the buffer.
Transaction
• E.g. transaction to transfer 50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
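The six steps can be sketched in Python; the `db` dict and the `read`/`write` helpers below are illustrative stand-ins for the database and its buffer, not part of the original slides:

```python
# Minimal sketch of the transfer transaction; `db` stands in for the
# database, and local variables act as the transaction's buffer.
db = {"A": 100, "B": 200}

def read(item):
    # read(X): copy the value from the database into a buffer variable
    return db[item]

def write(item, value):
    # write(X): copy the buffer value back into the database
    db[item] = value

def transfer_50():
    A = read("A")      # 1. read(A)
    A = A - 50         # 2. A := A - 50
    write("A", A)      # 3. write(A)
    B = read("B")      # 4. read(B)
    B = B + 50         # 5. B := B + 50
    write("B", B)      # 6. write(B)

transfer_50()
print(db)  # {'A': 50, 'B': 250}
```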
• Two main issues to deal with:
1. Failures of various kinds, such as hardware failures and system crashes
2. Concurrent execution of multiple transactions
Transaction
• Let us take a debit transaction from an account, which consists of the following
operations:
Operation 1: R(A);
Operation 2: A=A-1000;
Operation 3: W(A);
• Assume A’s value before the transaction starts is 5000.
• The first operation reads the value of A from the database and stores it in a
buffer.
• The second operation decreases the value by 1000, so the buffer will contain
4000.
• The third operation writes the value from the buffer to the database, so A’s final
value will be 4000.
Transaction
• But it may also happen that a transaction fails after executing some of its
operations. The failure can be caused by hardware, software, power, etc.
• For example, if the debit transaction discussed above fails after executing operation
2, the value of A will remain 5000 in the database, which is not acceptable to the
bank. To avoid this, the database has two important operations:
1. Commit: If all operations of a transaction execute successfully, all the changes
made by the transaction are made permanent in the database.
2. Rollback: If a transaction is not able to execute all operations successfully, all the
changes made by the transaction are undone.
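A minimal sketch of how commit and rollback protect a transfer-style transaction; the snapshot-based rollback here is an illustration only (real systems undo changes via a log, not a saved copy):

```python
db = {"A": 5000, "B": 2000}

def transfer(amount, fail_midway=False):
    saved = dict(db)              # snapshot used to undo changes (illustration only)
    try:
        a = db["A"]               # R(A)
        db["A"] = a - amount      # W(A): the debit is now in the database
        if fail_midway:
            raise RuntimeError("failure after W(A)")
        b = db["B"]               # R(B)
        db["B"] = b + amount      # W(B)
        return "commit"           # Commit: all changes become permanent
    except RuntimeError:
        db.clear(); db.update(saved)  # Rollback: undo every change made
        return "rollback"

print(transfer(1000, fail_midway=True), db)  # rollback {'A': 5000, 'B': 2000}
print(transfer(1000), db)                    # commit {'A': 4000, 'B': 3000}
```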
Transaction
1. Atomicity:
As a transaction is a set of logically related operations, either all of them
should be executed or none. The debit transaction discussed above
should either execute all three operations or none. If the debit transaction
fails after executing operations 1 and 2, then its new value 4000 will not
be updated in the database, which leads to inconsistency.
Properties of Transaction – ACID Properties
• A’s value is updated to 4000 in the database, and then T2 writes the value
from its buffer back to the database. A’s value is updated to 5500, which
shows that the effect of the debit transaction is lost and the database has
become inconsistent.
• To maintain consistency of the database, we need concurrency control
protocols. The operations of T1 and T2 with their buffers and the
database are shown in Table 1.
Properties of Transaction – ACID Properties
Properties of Transaction – ACID Properties
3. Isolation: The result of a transaction should not be visible to others
before the transaction is committed.
• For example, assume that A’s balance is Rs. 5000 and T1 debits
Rs. 1000 from A. A’s new balance will be 4000. If T2 credits Rs. 500 to
A’s new balance, A will become 4500, and after this T1 fails. Then we
have to roll back T2 as well, because it used a value produced by T1.
So a transaction’s results are not made visible to other transactions
before it commits.
Properties of Transaction – ACID Properties
• Isolation can be ensured trivially by running transactions serially, that
is, one after the other. However, executing multiple transactions
concurrently has significant benefits.
Properties of Transaction – ACID Properties
• Let X = 500, Y = 500.
• Changes occurring in a particular transaction will not be visible to any
other transaction until the change has been written to memory or has
been committed.
• Suppose T has been executed till Read(Y) and then T’’ starts. As a
result, interleaving of operations takes place, due to which T’’ reads
the correct value of X but an incorrect value of Y, and the sum computed by
• T’’: X + Y = 50,000 + 500 = 50,500
• is thus not consistent with the sum at the end of transaction
• T: X + Y = 50,000 + 450 = 50,450.
• This results in database inconsistency, due to a loss of 50 units.
Hence, transactions must take place in isolation, and changes should
be visible only after they have been made to main memory.
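The interleaving can be simulated directly. Assuming, consistently with the figures above, that T computes X := X * 100 and then Y := Y − 50, a sketch:

```python
# T'' reads between T's two writes, producing an inconsistent sum.
db = {"X": 500, "Y": 500}

# T runs up to and including Write(X), then Read(Y):
x = db["X"]; x = x * 100; db["X"] = x    # X is now 50,000 in the database
y = db["Y"]                               # T has read Y but not yet written it

# T'' starts and computes the sum from the current database state:
sum_t2 = db["X"] + db["Y"]                # 50,000 + 500 = 50,500 (inconsistent)

# T resumes and finishes:
y = y - 50; db["Y"] = y                   # Y is now 450
sum_final = db["X"] + db["Y"]             # 50,000 + 450 = 50,450

print(sum_t2, sum_final, sum_t2 - sum_final)  # 50500 50450 50
```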
Properties of Transaction – ACID Properties
4. Durability: This property ensures that once the transaction has
completed execution, the updates and modifications to the database
are stored on and written to disk, and they persist even if a system
failure occurs. These updates become permanent and are stored in
non-volatile memory. The effects of the transaction, thus, are never
lost.
Implementation of Atomicity and Durability
• The recovery management component of a database system
implements the support for atomicity and durability.
E.g. the shadow-database scheme:
All updates are made on a shadow copy of the database
• db_pointer is made to point to the updated shadow copy after
– The transaction reaches partial commit and
– All updated pages have been flushed to disk.
Another variant is known as Shadow Paging.
Concurrent Executions
• Multiple transactions are allowed to run concurrently in the system.
• Advantages are:
1. Increased processor and disk utilization, leading to better transaction
throughput
E.g. one transaction can be using the CPU while another is reading from or
writing to the disk
2. Reduced average response time for transactions: short transactions need
not wait behind long ones.
• Concurrency control schemes – mechanisms to achieve isolation
That is, to control the interaction among the concurrent transactions in order
to prevent them from destroying the consistency of the database
Schedule
Schedule – a sequence of instructions that specifies the chronological
order in which instructions of concurrent transactions are executed.
• A schedule for a set of transactions must consist of all instructions
of those transactions
• Must preserve the order in which the instructions appear in each
individual transaction.
• A transaction that successfully completes its execution will have a
commit instruction as its last statement.
• By default, a transaction is assumed to execute the commit instruction as its
last step.
• A transaction that fails to successfully complete its execution will have
an abort instruction as the last statement.
Schedule – Serial Schedule
• A schedule can be of two types: serial and concurrent.
Schedule – Serial Schedule
Let T1 transfer 50 from A to B, and T2 transfer 10% of the balance from
A to B.
• A serial schedule
in which T1 is followed by T2:
Schedule – Concurrent Schedule
Let T1 and T2 be the transactions defined previously. The
following schedule is not a serial schedule, but it is equivalent to
Schedule 1.
Serializability
• Basic Assumption – Each transaction preserves database consistency.
• Thus serial execution of a set of transactions preserves database
consistency.
• A (possibly concurrent) schedule is serializable if it is equivalent to a
serial schedule. Different forms of schedule equivalence give rise to
the notions of:
1. Conflict serializability
2. View serializability
Serializability
• Simplified view of transactions:
Conflict Serializability
• As discussed under concurrency control, serial schedules have lower
resource utilization and low throughput. To improve this, two or more
transactions are run concurrently. But concurrency of transactions
may lead to inconsistency in the database. To avoid this, we need to
check whether these concurrent schedules are serializable or not.
Conflict Serializability
• Conflicting operations: Two operations are said to be conflicting if all
of the following conditions are satisfied:
1. They belong to different transactions
2. They operate on the same data item
3. At least one of them is a write operation
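These three conditions translate directly into a small predicate; modeling each operation as a (transaction, action, item) tuple is an assumed encoding for illustration:

```python
# Two operations conflict iff: different transactions, same data item,
# and at least one of the two is a write.
def conflicts(op1, op2):
    t1, a1, i1 = op1
    t2, a2, i2 = op2
    return t1 != t2 and i1 == i2 and "W" in (a1, a2)

print(conflicts(("T1", "R", "A"), ("T2", "W", "A")))  # True
print(conflicts(("T1", "R", "A"), ("T2", "R", "A")))  # False (both reads)
print(conflicts(("T1", "W", "A"), ("T1", "R", "A")))  # False (same transaction)
```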
Conflict Serializability - Conflicting Instructions
• Instructions li and lj of transactions Ti and Tj respectively, conflict if and
only if there exists some item Q accessed by both li and lj, and at least
one of these instructions wrote Q.
Conflict Serializability
(Figure: Schedule 1 transformed into Schedule 2 through intermediate steps of swapping non-conflicting instructions.)
Conflict Equivalent
• If a schedule S can be transformed into a schedule S´ by a series of
swaps of non-conflicting instructions, we say that S and S´ are conflict
equivalent.
• We say that a schedule S is conflict serializable if it is conflict
equivalent to a serial schedule.
• A schedule that is conflict serializable is always conflict equivalent
to at least one serial schedule.
Testing for Conflict Serializability
• Precedence Graph or Serialization Graph is used commonly to
test Conflict Serializability of a schedule.
• The graph contains one node for each transaction Ti. An edge ei is
of the form Tj –> Tk, where Tj is the starting node and Tk the
ending node of ei. An edge from Tj to Tk is constructed if
one of the operations in Tj appears in the schedule before some
conflicting operation in Tk.
Testing for Conflict Serializability
• The algorithm can be written as:
• Create a node T in the graph for each participating transaction in the
schedule.
• For the conflicting pair write_item(X) and read_item(X) – if a
transaction Tj executes a read_item(X) after Ti executes a write_item(X),
draw an edge from Ti to Tj in the graph.
• For the conflicting pair read_item(X) and write_item(X) – if a
transaction Tj executes a write_item(X) after Ti executes a read_item(X),
draw an edge from Ti to Tj in the graph.
• For the conflicting pair write_item(X) and write_item(X) – if a
transaction Tj executes a write_item(X) after Ti executes a write_item(X),
draw an edge from Ti to Tj in the graph.
• The schedule S is serializable if there is no cycle in the precedence graph.
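The algorithm can be sketched as follows; the (transaction, action, item) encoding of a schedule is an assumption for illustration. The example schedule r1(x) r1(y) w2(x) w1(x) r2(y) analysed on the next slides is used as input:

```python
# Precedence-graph test for conflict serializability.
def precedence_edges(schedule):
    edges = set()
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            # conflicting pair: different txns, same item, at least one write
            if ti != tj and xi == xj and "W" in (ai, aj):
                edges.add((ti, tj))  # Ti's operation precedes Tj's
    return edges

def has_cycle(nodes, edges):
    # simple DFS cycle detection on the precedence graph
    adj = {n: [b for (a, b) in edges if a == n] for n in nodes}
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}
    def dfs(n):
        color[n] = GRAY
        for m in adj[n]:
            if color[m] == GRAY or (color[m] == WHITE and dfs(m)):
                return True
        color[n] = BLACK
        return False
    return any(color[n] == WHITE and dfs(n) for n in nodes)

def is_conflict_serializable(schedule):
    nodes = {t for t, _, _ in schedule}
    return not has_cycle(nodes, precedence_edges(schedule))

# S: r1(x) r1(y) w2(x) w1(x) r2(y)
S = [("T1", "R", "x"), ("T1", "R", "y"), ("T2", "W", "x"),
     ("T1", "W", "x"), ("T2", "R", "y")]
print(sorted(precedence_edges(S)))  # [('T1', 'T2'), ('T2', 'T1')] -- a cycle
print(is_conflict_serializable(S))  # False
```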
Testing for Conflict Serializability
• If there is no cycle in the precedence graph, it means we can
construct a serial schedule S’ which is conflict equivalent to the
schedule S.
• The serial schedule S’ can be found by topological sorting of the
acyclic precedence graph. There can be more than one such serial
schedule.
Testing for Conflict Serializability
• For example, Consider the schedule S :
S : r1(x) r1(y) w2(x) w1(x) r2(y)
• Creating Precedence graph:
1. Make two nodes corresponding to Transaction T1 and T2.
Testing for Conflict Serializability
• For example, Consider the schedule S :
S : r1(x) r1(y) w2(x) w1(x) r2(y)
• Creating Precedence graph:
2. For the conflicting pair r1(x) w2(x), where r1(x) happens before
w2(x), draw an edge from T1 to T2.
Testing for Conflict Serializability
• For example, Consider the schedule S :
S : r1(x) r1(y) w2(x) w1(x) r2(y)
• Creating Precedence graph:
3. For the conflicting pair w2(x) w1(x), where w2(x) happens before
w1(x), draw an edge from T2 to T1.
Testing for Conflict Serializability
• Since the graph is cyclic, we can conclude that the schedule is not
conflict equivalent to any serial schedule, i.e., it is not conflict serializable.
• Let us try to infer a serial schedule from this graph using topological
ordering.
• The edge T1–>T2 tells that T1 should come before T2 in the linear
ordering.
• The edge T2 –> T1 tells that T2 should come before T1 in the linear
ordering.
• So, we cannot determine any particular order (when the graph is cyclic).
Therefore, no serial schedule can be obtained from this graph.
Testing for Conflict Serializability
• S1: r1(x) r3(y) w1(x) w2(y) w3(x) w2(x)
Problems of Concurrency
• Several problems can occur when concurrent transactions execute in
an uncontrolled manner.
1. The Lost Update Problem
2. The Temporary Update (or Dirty Read) Problem
3. The Incorrect Summary Problem
Problems of Concurrency
1. The Lost Update Problem: This problem occurs when two transactions
that access the same database items have their operations interleaved in
a way that makes the value of some database items incorrect.
Assume X = 80, N = 5, M = 4.
The final result should be X = 79, but the interleaved execution
produces X = 84.
Problems of Concurrency
• Suppose that transactions T1 and T2 are submitted at approximately the
same time, and that their operations are interleaved as shown.
• Then the final value of item X is incorrect, because T2 reads the value of X
before T1 changes it in the database, and hence the updated value
resulting from T1 is lost.
• For example, if X = 80 at the start (originally there were 80 reservations on
the flight), N = 5 (T1 transfers 5 seat reservations from the flight
corresponding to X to the flight corresponding to Y), and M = 4 (T2 reserves
4 seats on X), the final result should be X = 79; but in the interleaving of
operations, it is X = 84, because the update in T1 that removed the five
seats from X was lost.
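The interleaving can be replayed step by step; the buffer variables x1 and x2 below stand in for T1's and T2's local copies of X:

```python
# Lost-update interleaving: T1 reads X, T2 reads X, T1 writes, T2 writes.
# X = 80 seats; T1 transfers N = 5 seats away; T2 reserves M = 4 seats.
db = {"X": 80}
N, M = 5, 4

x1 = db["X"]        # T1: read_item(X)  -> 80
x1 = x1 - N         # T1: X := X - N    -> 75 (in T1's buffer)
x2 = db["X"]        # T2: read_item(X)  -> still 80 (reads before T1's write)
db["X"] = x1        # T1: write_item(X) -> database X = 75
x2 = x2 + M         # T2: X := X + M    -> 84 (in T2's buffer)
db["X"] = x2        # T2: write_item(X) -> database X = 84; T1's update is lost

print(db["X"])      # 84, although a serial execution would give 79
```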
Problems of Concurrency
• Another Example of Lost Update Problem:
Problems of Concurrency
2. The Temporary Update (or Dirty Read) Problem: This problem occurs
when one transaction updates a database item and then the transaction fails
for some reason. The updated item is accessed by another transaction
before it is changed back to its original value.
Problems of Concurrency
2. The Temporary Update (or Dirty Read) Problem:
The value of item X that is read by T2 is called dirty data, because it has been
created by a transaction that has not completed and committed yet; hence,
this problem is also known as the dirty read problem.
Problems of Concurrency
3. The Incorrect Summary Problem:
If one transaction is calculating an aggregate summary function on a
number of records while other transactions are updating some of these
records, the aggregate function may calculate some values before they
are updated and others after they are updated.
Problems of Concurrency
• Another problem that may occur is called unrepeatable read, where
a transaction T reads an item twice and the item is changed by
another transaction T' between the two reads. Hence, T receives
different values for its two reads of the same item.
Types of Failures
1. A computer failure (system crash): A hardware, software, or network
error occurs in the computer system during transaction execution.
2. A transaction or system error: Some operation in the transaction may
cause it to fail, such as integer overflow or division by zero. Transaction
failure may also occur because of erroneous parameter values or
because of a logical programming error.
3. Local errors or exception conditions detected by the transaction: During
transaction execution, certain conditions may occur that necessitate
cancellation of the transaction. For example, data for the transaction
may not be found. This exception should be programmed in the
transaction itself, and hence would not be considered a failure.
Types of Failures
4. Concurrency control enforcement: The concurrency control method
may decide to abort the transaction, to be restarted later, because it
violates serializability or because several transactions are in a state of
deadlock.
5. Disk failure: Some disk blocks may lose their data because of a read
or write malfunction or because of a disk read/write head crash. This
may happen during a read or a write operation of the transaction.
6. Physical problems and catastrophes: This refers to an endless list of
problems that includes power or air-conditioning failure, fire, theft,
sabotage, overwriting disks or tapes by mistake, and mounting of a
wrong tape by the operator.
Types of Schedules
1. Serial Schedule
2. Complete Schedule
3. Recoverable Schedule
4. Cascadeless Schedule
5. Strict Schedule
Types of Schedules
1. Serial Schedule: the operations of each transaction run to completion
before the next transaction begins, with no interleaving.
(Table: T1 performs R(A), W(A); T2 runs only after T1 finishes.)
Types of Schedules
2. Complete Schedule
Schedules in which the last operation of each transaction is either abort (or)
commit.
(Table: an interleaving of T1, T2, T3 with operations R(A), W(A), R(B), W(B),
in which each transaction ends with commit or abort.)
Types of Schedules
3. Recoverable Schedule: a transaction commits only after every
transaction whose changes it has read has committed.
(Table: schedule of T1 and T2 over item A illustrating a recoverable schedule.)
Types of Schedules
4. Cascadeless Schedule: transactions read only values written by
transactions that have already committed, so no cascading rollbacks can occur.
(Table: schedule of two transactions over item A illustrating a cascadeless schedule.)
Types of Schedules
5. Strict Schedule: a data item written by a transaction may be neither read
nor overwritten by other transactions until the writing transaction commits
or aborts.
(Table: schedule of T1 and T2 over item A illustrating a strict schedule.)
View Serializability
Two schedules S1 and S2 are said to be view equivalent iff the following
conditions are satisfied:
1) Initial Read
If a transaction T2 reads data item A from the initial database in S1, then
in S2 also T2 should read A from the initial database.
View Serializability
2) Updated Read
If Ti reads A, which was updated by Tj, in S1, then in S2 also Ti should
read A as updated by Tj.
3) Final Write
The two schedules above are not view equivalent, as the final write operation
on A in S1 is done by T1 while in S2 it is done by T2.
View Serializability
• View Serializability: A Schedule is called view serializable if it is view
equal to a serial schedule (no overlapping transactions).
View Serializability
Let S and S´ be two schedules with the same set of transactions. S
and S´ are view equivalent if the following three conditions are met
for each data item Q:
1. If transaction Ti reads the initial value of Q in schedule S, then Ti
must also read the initial value of Q in schedule S´.
2. If transaction Ti reads a value of Q produced by a write(Q) of
transaction Tj in schedule S, then Ti must read the value of Q produced
by the same write(Q) of Tj in schedule S´.
3. The transaction (if any) that performs the final write(Q) operation
in schedule S must also perform the final write(Q) operation in schedule
S´.
Test for View Serializability
• The problem of checking if a schedule is view serializable falls in the
class of NP-complete problems.
Concurrency Control
• Concurrency control techniques are used to ensure that the Isolation (or
non-interference) property of concurrently executing transactions is
maintained.
• Concurrency-control protocols: allow concurrent schedules, but ensure
that the schedules are conflict/view serializable, and are recoverable and
maybe even cascadeless.
• These protocols do not examine the precedence graph as it is being
created; instead, a protocol imposes a discipline that avoids non-serializable
schedules.
• Different concurrency-control protocols provide different trade-offs
between the amount of concurrency they allow and the amount of
overhead that they impose.
Purpose of Concurrency Control
• To enforce ISOLATION.
• To preserve database consistency.
• To resolve READ-WRITE and WRITE-WRITE conflicts.
Concurrency Protocol - Different categories of
protocols: ASSIGNMENTS
Lock Based Protocol
Basic 2-PL
Conservative 2-PL
Strict 2-PL
Rigorous 2-PL
Graph Based Protocol
Time-Stamp Ordering Protocol
Multiple Granularity Protocol
Validation-Based Protocol
Lock Based Protocols
• A lock is a variable associated with a data item that describes the status
of the data item with respect to the possible operations that can be applied to
it. Locks synchronize access to the database items by concurrent
transactions. This protocol requires that all data items be accessed in a
mutually exclusive manner. Let me introduce you to two common locks
which are used and some terminology followed in this protocol.
Lock Based Protocols
1. Shared Lock (S): also known as a read-only lock. As the name
suggests, it can be shared between transactions, because while
holding this lock a transaction does not have permission to
update the data item. An S-lock is requested using the lock-S
instruction.
2. Exclusive Lock (X): with this lock, the data item can be both read
and written by the transaction holding it; no other transaction can
acquire any lock on the item. An X-lock is requested using the
lock-X instruction.
Lock Based Protocols
• Upgrade / Downgrade locks: a transaction that holds a lock on an
item A is allowed, under certain conditions, to change the lock
from one state to another.
• Upgrade: an S(A) can be upgraded to X(A) if Ti is the only transaction
holding the S-lock on element A.
• Downgrade: we may downgrade X(A) to S(A) when we no longer
want to write data item A. As we were already holding the X-lock on
A, no conditions need to be checked.
Lock Based Protocols
• Applying simple locking, we may not always produce serializable
results; it may lead to deadlock or inconsistency.
(Table: a partial schedule in which T1 executes lock-X(B) as step 1; the
interleaved lock requests of T1 and T2 lead to deadlock.)
Deadlock, Starvation in DBMS
• Deadlock: A system is in a deadlock state if there exists a set of
transactions such that every transaction in the set is waiting for
another transaction in the set. In a database, a deadlock is an
unwanted situation in which two or more transactions are waiting
indefinitely for one another to give up locks.
Deadlock in DBMS - Example
• Suppose Transaction T1 holds a lock on some rows in the Students
table and needs to update some rows in the Grades table.
Simultaneously, Transaction T2 holds locks on those very rows
(which T1 needs to update) in the Grades table but needs to update
the rows in the Students table held by Transaction T1.
Deadlock in DBMS – Deadlock Avoidance
• Rather than letting the database get stuck in a deadlock and then
aborting and restarting transactions, it is better to avoid the deadlock
in the first place.
• The deadlock avoidance method is suitable for smaller databases, whereas
the deadlock prevention method is suitable for larger databases.
• One method of avoiding deadlock is using application-consistent logic.
In the example given above, transactions that access Students and
Grades should always access the tables in the same order. In this way,
in the scenario described above, Transaction T1 simply waits for
Transaction T2 to release the lock on Grades before it begins. When
Transaction T2 releases the lock, Transaction T1 can proceed freely.
Deadlock in DBMS – Deadlock Avoidance
• Another method for avoiding deadlock is to apply both row-level
locking and the READ COMMITTED isolation level. However, this
does not guarantee the complete removal of deadlocks.
Read Committed – This isolation level guarantees that any data read
is committed at the moment it is read; thus it does not allow dirty
reads. The transaction holds a read or write lock on the current row,
and thus prevents other transactions from reading, updating or deleting it.
Deadlock in DBMS – Deadlock Detection
• When a transaction waits indefinitely to obtain a lock, the database
management system should detect whether the transaction is
involved in a deadlock or not.
Deadlock in DBMS – Deadlock Detection
Wait-for-graph:
• When a transaction Ti requests a lock on an item, say X, which is
held by some other transaction Tj, a directed edge is created from Ti
to Tj. If Tj releases item X, the edge between them is dropped and Ti
locks the data item.
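A sketch of edge construction for the wait-for graph; the `holders` and `requests` structures are assumed encodings, not a real DBMS API:

```python
# Wait-for graph: an edge Ti -> Tj means Ti waits for a lock held by Tj.
# `holders` maps each data item to the set of transactions holding a lock on it;
# `requests` is a list of (txn, item) lock requests that are currently blocked.
def wait_for_edges(holders, requests):
    edges = set()
    for txn, item in requests:
        for holder in holders.get(item, ()):
            if holder != txn:
                edges.add((txn, holder))  # txn waits for holder
    return edges

holders = {"X": {"T2"}}
print(wait_for_edges(holders, [("T1", "X")]))  # {('T1', 'T2')}: T1 waits for T2
```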
Deadlock in DBMS – Deadlock Detection
• For example:

Transaction | Data Item | Lock Mode
T1 | Q | Shared
T2 | P | Exclusive
T2 | Q | Exclusive
T3 | Q | Shared
T4 | P | Exclusive

(Figure: wait-for graph with edges T2 → T1 and T4 → T2; T3 has no outgoing edge.)
Deadlock in DBMS – Deadlock Detection
• Steps:
1. Initially, draw a vertex for T1 (Q), which holds Q in Shared mode.
2. Draw a vertex for T2 (P, Q) – P is requested in Exclusive mode, and Q is
also requested in Exclusive mode. But T1 has already locked Q in Shared mode;
checking the compatibility matrix for Shared–Exclusive, the lock cannot be
granted, so T2 waits for resource Q held by T1, and an edge is drawn
from T2 to T1.
3. Now T3 (Q) requests Q in Shared mode. In the compatibility matrix, a
Shared–Shared combination is allowed, so there is no dependency for T3.
4. As in step 2, T4 waits for resource P held by T2, so an edge is drawn
from T4 to T2.
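The Shared/Exclusive compatibility matrix used in the steps above can be written out explicitly; `can_grant` is an illustrative helper name:

```python
# Lock compatibility: a requested lock is granted only if it is compatible
# with every lock currently held on the item by other transactions.
COMPATIBLE = {
    ("S", "S"): True,   # two shared locks coexist
    ("S", "X"): False,  # shared held vs exclusive requested: conflict
    ("X", "S"): False,  # exclusive held vs shared requested: conflict
    ("X", "X"): False,  # two exclusive locks: conflict
}

def can_grant(requested, held_modes):
    return all(COMPATIBLE[(held, requested)] for held in held_modes)

print(can_grant("S", ["S"]))  # True : T3's S-lock on Q coexists with T1's
print(can_grant("X", ["S"]))  # False: T2's X-lock on Q must wait for T1
```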
Deadlock in DBMS – Deadlock Detection
• Suppose another request, T4, Q, Exclusive, occurs. Then there will be
an edge from T4 to T1, since T1 holds a lock on Q (Shared) and T4 wants
to lock Q (Exclusive), which is not allowed by the compatibility matrix;
hence T4 will depend on T1.
Deadlock in DBMS – Deadlock Prevention
Wait-Die Scheme –
In this scheme, if a transaction requests a resource that is locked by another
transaction, the DBMS checks the timestamps of both transactions and
allows the older transaction to wait until the resource is available.
Suppose there are two transactions T1 and T2, and let the timestamp of any
transaction T be TS(T). If T2 holds a lock on some resource and T1 is
requesting it, the DBMS performs the following check:
If TS(T1) < TS(T2), that is, T1 is the older transaction, then T1 is allowed to
wait until the resource is available. If instead T1 is the younger transaction
and the older T2 holds the resource, then T1 is killed and restarted later,
after a small delay, but with the same timestamp.
In other words, if the older transaction holds the resource, the younger
requesting transaction is killed and restarted with the same timestamp.
This scheme allows the older transaction to wait but kills the younger one.
Deadlock in DBMS – Deadlock Prevention
Wound-Wait Scheme –
In this scheme, if an older transaction requests a resource held by a
younger transaction, the older transaction forces the younger one to
abort ("wounds" it) and release the resource. The younger transaction is
restarted after a minute delay but with the same timestamp. If a
younger transaction requests a resource held by an older one, the
younger transaction is asked to wait till the older one releases it.
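The decision rules of the two schemes can be condensed into two small functions (smaller timestamp = older transaction); the function names are illustrative, not a real DBMS API:

```python
def wait_die(ts_requester, ts_holder):
    # older requester waits; younger requester dies (is restarted later)
    return "wait" if ts_requester < ts_holder else "die"

def wound_wait(ts_requester, ts_holder):
    # older requester wounds (aborts) the younger holder; younger requester waits
    return "wound holder" if ts_requester < ts_holder else "wait"

print(wait_die(1, 2))    # 'wait'         (older T1 requests from younger T2)
print(wait_die(2, 1))    # 'die'          (younger requester is killed)
print(wound_wait(1, 2))  # 'wound holder' (older aborts the younger holder)
print(wound_wait(2, 1))  # 'wait'         (younger requester waits)
```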
Deadlock in DBMS – Deadlock Prevention
Time-Out Based Scheme – (based on lock time-outs)
• Reasons for starvation –
1. An unfair waiting scheme for locked items (e.g., a priority queue).
2. Victim selection (the same transaction is repeatedly selected as the
victim).
Deadlock in DBMS – Starvation Example
• Suppose there are 3 transactions, namely T1, T2, and T3, in a database
that are trying to acquire a lock on data item ‘X’. Now suppose the
scheduler grants the lock to T1 (maybe due to some priority), and the
other two transactions wait for the lock. As soon as the execution
of T1 is over, another transaction T4 arrives and requests a lock on
data item X. This time the scheduler grants the lock to T4, and T2 and
T3 have to wait again. If new transactions keep requesting the lock
in this way, T2 and T3 may have to wait for an indefinite period of
time, which leads to starvation.
Deadlock in DBMS – Solutions To Starvation
• Increasing Priority – Starvation occurs when a transaction has to wait for an
indefinite time; in this situation we can increase the priority of that particular
transaction. But the drawback of this solution is that other transactions may
have to wait even longer whenever a higher-priority transaction arrives and
proceeds.
• First Come First Served approach – A fair scheduling approach, i.e., FCFS, can be
adopted, in which transactions acquire a lock on an item in the order in
which they requested the lock.
• Wait-die and wound-wait schemes – These are the schemes that use the
timestamp-ordering mechanism of transactions.
Deadlock in DBMS – Lock Based Protocols
• Implementing this lock system without any restrictions gives us the
simple lock-based protocol (or binary locking), but it has its own
disadvantage: it does not guarantee serializability. Schedules may
follow the preceding rules, yet a non-serializable schedule may result.
Deadlock in DBMS – Lock Based Protocols
• Two Phase Locking –
• A transaction is said to follow the Two-Phase Locking protocol if all
of its locking and unlocking operations are performed in two phases.
• Growing Phase: New locks on data items may be acquired but none
can be released.
• Shrinking Phase: Existing locks may be released but no new locks can
be acquired.
• Note – If lock conversion is allowed, then upgrading of lock( from S(a)
to X(a) ) is allowed in Growing Phase and downgrading of lock (from
X(a) to S(a)) must be done in shrinking phase.
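The two-phase rule can be checked mechanically for a single transaction's sequence of lock and unlock actions; this checker is a sketch for illustration, not part of any real lock manager:

```python
# A transaction obeys 2PL iff, once it releases any lock (shrinking phase
# begins), it never acquires another lock.
def follows_2pl(actions):
    # actions: list of ("lock", item) or ("unlock", item) for one transaction
    shrinking = False
    for kind, _item in actions:
        if kind == "unlock":
            shrinking = True       # shrinking phase has begun
        elif shrinking:            # a lock after an unlock violates 2PL
            return False
    return True

print(follows_2pl([("lock", "A"), ("lock", "B"),
                   ("unlock", "A"), ("unlock", "B")]))          # True
print(follows_2pl([("lock", "A"), ("unlock", "A"),
                   ("lock", "B")]))                             # False
```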
Crash Recovery – Failure Classification
• Transaction failure :
• Logical errors: transaction cannot complete due to some internal
error condition
• System errors: the database system must terminate an active
transaction due to an error condition (e.g., deadlock)
Crash Recovery – Failure Classification
• Disk failure: a head crash or similar disk failure destroys all or part of disk
storage
• Destruction is assumed to be detectable: disk drives use
checksums to detect failures
Crash Recovery – Storage Structure
• Volatile storage:
• Does not survive system crashes
• Examples: main memory, cache memory
• Non-volatile storage:
• Survives system crashes
• Examples: disk, tape, flash memory, non-volatile (battery backed up) RAM
• Stable storage:
• A mythical form of storage that survives all failures
• Information residing in stable storage is never lost
• Approximated by maintaining multiple copies on distinct non-volatile media
Crash Recovery – Data Access
Crash Recovery – Log-Based Recovery
• The most widely used structure for recording database modifications
is the log. The log is a sequence of log records, recording all the
update activities in the database.
• There are several types of log records. An update log record describes
a single database write. It has these fields:
Transaction identifier is the unique identifier of the transaction that
performed the write operation.
Data-item identifier is the unique identifier of the data item written.
Typically, it is the location on disk of the data item.
Old value is the value of the data item prior to the write.
New value is the value that the data item will have after the write.
Crash Recovery – Log-Based Recovery
• A log is kept on stable storage.
• The log is a sequence of log records, and maintains a record of update
activities on the database.
• When transaction Ti starts, it registers itself by writing a <Ti start> log
record
• Before Ti executes write(X), a log record <Ti, X, V1, V2> is written,
where V1 is the value of X before the write, and V2 is the value to be
written to X.
• The log record notes that Ti has performed a write on data item X; X
had value V1 before the write, and will have value V2 after the write.
Crash Recovery – Log-Based Recovery
• When Ti finishes its last statement, the log record <Ti commit> is
written.
• We assume for now that log records are written directly to stable
storage (that is, they are not buffered)
• Two approaches using logs
Deferred database modification
Immediate database modification
Crash Recovery – Deferred Database
Modification
• The deferred-modification technique ensures transaction atomicity by
recording all database modifications in the log, but deferring the
execution of all write operations of a transaction until the transaction
partially commits.
• Recall that a transaction is said to be partially committed once the
final action of the transaction has been executed. The version of the
deferred-modification technique that we describe in this section
assumes that transactions are executed serially.
Crash Recovery – Deferred Database
Modification
• The deferred database modification scheme records all
modifications to the log, but defers all the writes to after partial
commit.
• Assume that transactions execute serially
• Transaction starts by writing <Ti start> record to log.
• A write(X) operation results in a log record <Ti, X, V> being written,
where V is the new value for X
Crash Recovery – Deferred Database
Modification
• The write is not performed on X at this time, but is deferred.
• When Ti partially commits, <Ti commit> is written to the log.
• Finally, the log records are read and used to actually execute the
previously deferred writes.
• During recovery after a crash, a transaction needs to be redone if and
only if both <Ti start> and <Ti commit> are there in the log.
• Redoing a transaction Ti (redo Ti) sets the value of all data items
updated by the transaction to the new values.
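A sketch of deferred-modification recovery under these rules; the tuple encoding of log records and the specific updates by T0 and T1 are assumptions, chosen to be consistent with the A = 1000, B = 2000, C = 700 example that follows:

```python
# Deferred modification: redo Ti iff both <Ti start> and <Ti commit>
# appear in the log; log records here are simple tuples.
def recover_deferred(log):
    committed = {r[1] for r in log if r[0] == "commit"}
    db = {}
    for rec in log:
        # ("write", Ti, X, V): deferred write of new value V to item X
        if rec[0] == "write" and rec[1] in committed:
            db[rec[2]] = rec[3]        # redo: reapply the new value
    return db

log = [("start", "T0"), ("write", "T0", "A", 950), ("write", "T0", "B", 2050),
       ("commit", "T0"),
       ("start", "T1"), ("write", "T1", "C", 600)]   # T1 never committed
print(recover_deferred(log))  # {'A': 950, 'B': 2050}; T1's write is ignored
```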
Crash Recovery – Deferred Database
Modification
• Crashes can occur while
The transaction is executing the original updates, or
While recovery action is being taken
For example,
• Value of A = 1000, B = 2000, C = 700
• Transactions T0 and T1 (T0 executes before T1)
Crash Recovery – Deferred Database Modification
(Figures: the log for transactions T0 and T1, and the state of the log and
database at successive points.)
Crash Recovery – Immediate Database
Modification
• The immediate database modification scheme allows database
updates of an uncommitted transaction to be made as the writes are
issued
● since undoing may be needed, update logs must have both old
value and new value
• Update log record must be written before database item is written
● We assume that the log record is output directly to stable storage
• Output of updated blocks can take place at any time before or after
transaction commit
• Order in which blocks are output can be different from the order in
which they are written.
110
Crash Recovery – Immediate Database
Modification
● redo(Ti) sets the value of all data items updated by Ti to the new
values, going forward from the first log record for Ti
111
Crash Recovery – Immediate Database
Modification
• When recovering after failure:
• Transaction Ti needs to be undone if the log contains the record
<Ti start>, but does not contain the record <Ti commit>.
• Transaction Ti needs to be redone if the log contains both the record
<Ti start> and the record <Ti commit>.
• Undo operations are performed first, then redo operations.
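The undo-then-redo rule above can be sketched as follows. This is an illustrative toy, assuming log records of the form `("update", Ti, item, old, new)`; the function name `recover` and all data are invented for the example.

```python
# Immediate database modification: log records carry both old and new
# values, so uncommitted transactions can be undone after a crash.
db = {"A": 1000, "B": 2000}

# Log as found after a crash: T0 committed, T1 did not.
log = [
    ("start", "T0"),
    ("update", "T0", "A", 1000, 950),   # <T0, A, old=1000, new=950>
    ("commit", "T0"),
    ("start", "T1"),
    ("update", "T1", "B", 2000, 2050),  # T1 never reached <T1 commit>
]

# Suppose T1's write had already reached the database before the crash:
db["A"], db["B"] = 950, 2050

def recover(db, log):
    committed = {r[1] for r in log if r[0] == "commit"}
    started = {r[1] for r in log if r[0] == "start"}
    # Undo first: scan backwards, restoring OLD values written by
    # transactions with <Ti start> but no <Ti commit>.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] in started - committed:
            db[rec[2]] = rec[3]          # old value
    # Then redo: scan forwards, reapplying NEW values of committed ones.
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            db[rec[2]] = rec[4]          # new value

recover(db, log)
assert db == {"A": 950, "B": 2000}  # T0 redone, T1 undone
```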
112
Crash Recovery – Immediate Database
Modification
115
Crash Recovery – Checkpoints
• During recovery we need to consider only the most recent transaction Ti
that started before the checkpoint, and transactions that started after Ti.
1. Scan backwards from end of log to find the most recent
<checkpoint> record
2. Continue scanning backwards till a record <Ti start> is found.
3. Need only consider the part of log following above start record. Earlier
part of log can be ignored during recovery, and can be erased whenever
desired.
4. For all transactions (starting from Ti or later) with no <Ti commit>,
execute undo(Ti). (Done only in case of immediate modification.)
5. Scanning forward in the log, for all transactions starting
from Ti or later with a <Ti commit>, execute redo(Ti).
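Steps 1–3 of the checkpoint scan can be sketched as below. This is a simplified illustration (immediate modification assumed); the record shapes and the helper name `relevant_suffix` are invented for the example.

```python
# Checkpoint recovery: only the log from the start of the most recent
# transaction that began before <checkpoint> needs to be considered.
def relevant_suffix(log):
    # Step 1: scan backwards for the most recent <checkpoint> record.
    cp = max(i for i, r in enumerate(log) if r[0] == "checkpoint")
    # Step 2: continue backwards to the nearest <Ti start> record.
    for i in range(cp, -1, -1):
        if log[i][0] == "start":
            return log[i:]    # Step 3: earlier log can be ignored/erased
    return log

log = [
    ("start", "T0"), ("update", "T0", "A", 100, 90), ("commit", "T0"),
    ("start", "T1"), ("update", "T1", "B", 200, 210),
    ("checkpoint",),
    ("commit", "T1"),
    ("start", "T2"), ("update", "T2", "C", 300, 310),  # no <T2 commit>
]

suffix = relevant_suffix(log)
# Steps 4-5: undo transactions without a commit record, redo the rest.
committed = {r[1] for r in suffix if r[0] == "commit"}
undo_set = {r[1] for r in suffix if r[0] == "start"} - committed
assert suffix[0] == ("start", "T1")   # T0's records can be ignored
assert committed == {"T1"} and undo_set == {"T2"}
```

Here T0 committed before the checkpoint, so its log records never need to be scanned again; T1 is redone and T2 is undone.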
116
Crash Recovery – Shadow Paging
• Requires fewer disk accesses than do log-based methods.
• Maintain two page tables during the life cycle of the transaction.
• When the transaction starts, both page tables are identical.
• The shadow page table is never changed over the duration of the transaction.
• The current page table may be changed during write operations.
• All input and output operations use the current page table to locate
database pages on disk.
• Store shadow page table in non-volatile storage.
117
Crash Recovery – Shadow Paging
• Note: When the transaction commits, the system writes the current page
table to non-volatile storage. The current page table then becomes the
new shadow page table.
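The copy-on-write behaviour of shadow paging can be sketched as a toy model. Page granularity, the dictionaries, and the names `disk`, `shadow`, `current`, `write`, and `commit` are all invented for the illustration.

```python
# Shadow paging sketch: the shadow page table is fixed for the whole
# transaction; a write copies the page and updates only the current table.
disk = {0: "A=1000", 1: "B=2000"}    # page_id -> page contents
next_page = 2

shadow = {"A": 0, "B": 1}            # shadow page table (never modified)
current = dict(shadow)               # current page table (starts identical)

def write(item, contents):
    # Copy-on-write: put the new version on a fresh page and point the
    # current page table at it; the shadow table still sees the old page.
    global next_page
    disk[next_page] = contents
    current[item] = next_page
    next_page += 1

write("A", "A=950")
assert disk[shadow["A"]] == "A=1000"   # shadow view unchanged (crash-safe)
assert disk[current["A"]] == "A=950"   # current view sees the new value

def commit():
    # On commit the current page table is forced to non-volatile storage
    # and becomes the new shadow page table.
    global shadow
    shadow = dict(current)

commit()
assert disk[shadow["A"]] == "A=950"
```

A crash before `commit()` simply discards the current table: the shadow table still points at the old, consistent pages, which is why no undo/redo is needed. The drawbacks listed next are also visible here: page 0 becomes unreachable garbage after commit, and the new version of A lives on a different page (fragmentation).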
118
Crash Recovery – Shadow Paging
• Advantages of Shadow paging over log-based techniques:
• Log record overhead is removed.
• Faster recovery (No UNDO-REDO operations)
• Drawbacks of Shadow paging:
• Commit overhead - (Actual data blocks, current page table, disk
address of current page table).
• Data fragmentation – Locality property is lost. Shadow paging
causes database pages to change location.
• Garbage collection – When a transaction commits, database pages
containing the old versions of data changed by the transaction become
inaccessible and must be garbage-collected.
119
Challenges of database security
• 1. Data quality –
• The database community needs techniques and organizational solutions
to assess the quality of data. These may include simple mechanisms such
as quality stamps posted on websites. We also need techniques that
provide more effective integrity-semantics verification tools for
assessing data quality, based on techniques such as record linkage.
• We also need application-level recovery techniques to automatically repair
incorrect data.
• ETL (Extract, Transform, Load) tools, widely used for loading data into
data warehouses, are presently grappling with these issues.
120
Challenges of database security
• 2. Intellectual property rights –
• As the use of the Internet and intranets increases day by day, the legal
and informational aspects of data are becoming major concerns for many
organizations. To address these concerns, watermarking techniques are
used, which help protect content from unauthorized duplication and
distribution by giving provable ownership of the content.
• Traditionally, these techniques depend on the availability of a large
domain within which the objects can be altered while retaining their
essential properties.
• However, research is needed to assess the robustness of such techniques
and to investigate different approaches aimed at preventing violations of
intellectual property rights.
121
Challenges of database security
• 3. Database survivability –
• Database systems need to operate and continue their functions, even with
reduced capabilities, despite disruptive events such as information
warfare attacks.
• In addition to making every effort to prevent an attack, and to detect
one in the event of its occurrence, a DBMS should be able to do the
following:
• Confinement:
• Take immediate action to eliminate the attacker’s access to the
system and to isolate or contain the problem to prevent further spread.
• Damage assessment:
• Determine the extent of the problem, including failed function and
corrupted data.
122
Challenges of database security
• 3. Database survivability –
• Recover:
• Recover corrupted or lost data and repair or reinstall failed function
to re-establish a normal level of operation.
• Reconfiguration:
• Reconfigure to allow the operation to continue in a degraded mode
while recovery proceeds.
• Fault treatment:
• To the extent possible, identify the weaknesses exploited in the attack
and take steps to prevent a recurrence.
123