DBMS Transaction 25.10.18

Uploaded by Upakul Kalita

DBMS (Module 4)

Himanish Shekhar Das


Assistant Professor
Department of Computer Science and Engineering
Email: [email protected]
Contact: 9435427610

Transaction
• What is a Transaction?

• A set of logically related operations is known as a transaction. The main operations of a transaction are:

• Read(A): The read operation Read(A) or R(A) reads the value of A from the database and stores it in a buffer in main memory.

• Write(A): The write operation Write(A) or W(A) writes the value back to the database from the buffer.
• E.g. transaction to transfer 50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Two main issues to deal with:
1. Failures of various kinds, such as hardware failures and system crashes
2. Concurrent execution of multiple transactions
• Let us take a debit transaction from an account which consists of following
operations:
Operation 1: R(A);
Operation 2: A=A-1000;
Operation 3: W(A);
• Assume A’s value before the start of the transaction is 5000.
• The first operation reads the value of A from the database and stores it in a buffer.
• The second operation decreases its value by 1000, so the buffer will contain 4000.
• The third operation writes the value from the buffer to the database, so A’s final value will be 4000.
• But it may also be possible that a transaction fails after executing some of its operations. The failure can be due to hardware, software, or power problems.

• For example, if the debit transaction discussed above fails after executing operation 2, the value of A will remain 5000 in the database, which is not acceptable to the bank. To avoid this, the database has two important operations:

1. Commit: After all instructions of a transaction are successfully executed, the changes made by the transaction are made permanent in the database.

2. Rollback: If a transaction is not able to execute all operations successfully, all the changes made by the transaction are undone.
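The debit transaction with commit/rollback behaviour can be sketched in a few lines. This is a minimal illustrative sketch, not code from the slides: the `debit` helper, the `db` dictionary, and the insufficient-funds check are all assumptions.

```python
# Hypothetical sketch of a debit transaction with commit/rollback semantics.
# `db`, `debit`, and the insufficient-funds check are illustrative assumptions.

class TransactionError(Exception):
    pass

def debit(db, account, amount):
    buffer = {}                                # transaction's main-memory buffer
    try:
        buffer[account] = db[account]          # R(A): read A into the buffer
        if buffer[account] < amount:
            raise TransactionError("insufficient funds")
        buffer[account] -= amount              # A = A - amount (buffer only)
        db[account] = buffer[account]          # W(A): write the buffer back
        return True                            # commit: changes are permanent
    except TransactionError:
        return False                           # rollback: database untouched

db = {"A": 5000}
debit(db, "A", 1000)
print(db["A"])   # 4000
```

Note that a failure before the write leaves the database untouched, which is exactly the rollback behaviour described above.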
Fig.: State transition diagram illustrating the states for transaction execution
Properties of Transaction – ACID Properties
• To ensure integrity of the data, we require that the database system
maintain the following properties of the transactions:

1. Atomicity:
As a transaction is a set of logically related operations, either all of them should be executed or none. The debit transaction discussed above should either execute all three operations or none. If the debit transaction fails after executing operations 1 and 2, its new value 4000 will not be updated in the database, which leads to inconsistency.
• Consider the transaction T consisting of T1 and T2: transfer of 100 from account X to account Y.
• If the transaction fails after completion of T1 but before completion of T2 (say, after write(X) but before write(Y)), then the amount has been deducted from X but not added to Y. This results in an inconsistent database state. Therefore, the transaction must be executed in its entirety in order to ensure correctness of the database state.
2. Consistency: If operations of debit and credit transactions on the same account are executed concurrently, they may leave the database in an inconsistent state.

Table 1: Concurrent debit (T1) and credit (T2) on account A

T1         | T1’s buffer | T2         | T2’s buffer | Database
           |             |            |             | A=5000
R(A);      | A=5000      |            |             | A=5000
           | A=5000      | R(A);      | A=5000      | A=5000
A=A-1000;  | A=4000      |            | A=5000      | A=5000
           | A=4000      | A=A+500;   | A=5500      | A=5000
W(A);      | A=4000      |            | A=5500      | A=4000
           |             | W(A);      |             | A=5500
• For example, with T1 (debit of Rs. 1000 from A) and T2 (credit of Rs. 500 to A) executing concurrently, the database reaches an inconsistent state.
• Let us assume Account balance of A is Rs. 5000. T1 reads A(5000)
and stores the value in its local buffer space. Then T2 reads
A(5000) and also stores the value in its local buffer space.
• T1 performs A=A-1000 (5000-1000=4000) and 4000 is stored in T1
buffer space. Then T2 performs A=A+500 (5000+500=5500) and
5500 is stored in T2 buffer space. T1 writes the value from its
buffer back to database.

• A’s value is updated to 4000 in the database, and then T2 writes the value from its buffer back to the database. A’s value is updated to 5500, which shows that the effect of the debit transaction is lost and the database has become inconsistent.
• To maintain consistency of the database, we need concurrency control protocols. The operations of T1 and T2 with their buffers and the database are shown in Table 1.
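The interleaving in Table 1 can be replayed in a few lines. A sketch under the assumption that each transaction's buffer is just a local variable:

```python
# Replaying Table 1: T1 (debit 1000) and T2 (credit 500) each work on a
# private buffer, and T2's late write overwrites T1's update.

db = {"A": 5000}

t1_buf = db["A"]        # T1: R(A)  -> 5000
t2_buf = db["A"]        # T2: R(A)  -> 5000 (T1 has not written yet)
t1_buf -= 1000          # T1: A = A - 1000 -> 4000 in T1's buffer
t2_buf += 500           # T2: A = A + 500  -> 5500 in T2's buffer
db["A"] = t1_buf        # T1: W(A)  -> database holds 4000
db["A"] = t2_buf        # T2: W(A)  -> database holds 5500; the debit is lost

print(db["A"])          # 5500, not the correct serial result 4500
```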

• Consider the following transaction T consisting of T1 and T2: transfer of 100 from account X to account Y.
• The total amount before and after the transaction must be maintained.
• Total before T occurs = 500 + 200 = 700.
• Total after T occurs = 400 + 300 = 700.
• Therefore, the database is consistent. Inconsistency occurs in case T1 completes but T2 fails, as a result of which T is incomplete.
3. Isolation: The result of a transaction should not be visible to others before the transaction is committed.

• For example, let us assume that A’s balance is Rs. 5000 and T1 debits Rs. 1000 from A. A’s new balance will be 4000. If T2 credits Rs. 500 to A’s new balance, A will become 4500, and after this T1 fails. Then we have to roll back T2 as well, because it used a value produced by T1. So a transaction’s results are not made visible to other transactions before it commits.
• Isolation can be ensured trivially by running transactions serially, that is, one after the other. However, executing multiple transactions concurrently has significant benefits.
• Let X = 500, Y = 500.
• Changes occurring in a particular transaction will not be visible to any other transaction until the change has been written to memory or has been committed.
• Suppose T has been executed till read(Y) and then T’’ starts. As a result, interleaving of operations takes place, due to which T’’ reads the correct value of X but an incorrect value of Y, and the sum computed by
• T’’: (X+Y = 50,000 + 500 = 50,500)
• is thus not consistent with the sum at the end of transaction
• T: (X+Y = 50,000 + 450 = 50,450).
• This results in database inconsistency, due to a loss of 50 units. Hence, transactions must take place in isolation, and changes should be visible only after they have been made to main memory.
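The operations of T and T’’ themselves are lost from this extract; the figures are consistent with T computing X = X * 100 and then Y = Y - 50, which is the assumption behind this sketch:

```python
# Assumed (from the arithmetic above): T does X = X*100 then Y = Y - 50,
# and T'' sums X + Y in between T's two writes.

X, Y = 500, 500

X = X * 100            # T: write(X) -> 50,000
# T'' runs here, after T's write(X) but before T's write(Y):
dirty_sum = X + Y      # T'': 50,000 + 500 = 50,500 (incorrect)
Y = Y - 50             # T: write(Y) -> 450
true_sum = X + Y       # after T completes: 50,000 + 450 = 50,450

print(dirty_sum, true_sum, dirty_sum - true_sum)   # 50500 50450 50
```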
4. Durability: This property ensures that once the transaction has completed execution, the updates and modifications to the database are written to disk and persist even if a system failure occurs. These updates become permanent and are stored in non-volatile memory. The effects of the transaction, thus, are never lost.
Implementation of Atomicity and Durability
• The recovery-management component of a database system implements the support for atomicity and durability.
E.g. the shadow-database scheme:
All updates are made on a shadow copy of the database.
• db_pointer is made to point to the updated shadow copy after
– the transaction reaches partial commit, and
– all updated pages have been flushed to disk.
Another variant is known as shadow paging.
Concurrent Executions
• Multiple transactions are allowed to run concurrently in the system.
• Advantages are:
1. Increased processor and disk utilization, leading to better transaction
throughput
E.g. one transaction can be using the CPU while another is reading from or
writing to the disk
2. Reduced average response time for transactions: short transactions need not wait behind long ones.
• Concurrency control schemes – mechanisms to achieve isolation,
that is, to control the interaction among the concurrent transactions in order to prevent them from destroying the consistency of the database.
Schedule
Schedule – a sequence of instructions that specifies the chronological order in which instructions of concurrent transactions are executed.
• A schedule for a set of transactions must consist of all instructions of those transactions.
• It must preserve the order in which the instructions appear in each individual transaction.
• A transaction that successfully completes its execution will have a commit instruction as the last statement.
• By default, a transaction is assumed to execute a commit instruction as its last step.
• A transaction that fails to successfully complete its execution will have an abort instruction as the last statement.
Schedule – Serial Schedule
• A schedule can be of two types:

• Serial Schedule: When one transaction completely executes before another transaction starts, the schedule is called a serial schedule. A serial schedule is always consistent.
• Example: if a schedule S has debit transaction T1 and credit transaction T2, the possible serial schedules are T1 followed by T2 (T1->T2) or T2 followed by T1 (T2->T1). A serial schedule has low throughput and poor resource utilization.
Let T1 transfer 50 from A to B, and T2 transfer 10% of the balance from A to B.

• Schedule 1 – a serial schedule in which T1 is followed by T2.
Schedule – Concurrent Schedule

• Concurrent Schedule: When operations of a transaction are interleaved with operations of other transactions in a schedule, the schedule is called a concurrent schedule.

• Example: The schedule of debit and credit transactions shown in Table 1 is concurrent in nature. But concurrency can lead to inconsistency in the database, and that concurrent schedule is indeed inconsistent.
Let T1 and T2 be the transactions defined previously. The following schedule is not a serial schedule, but it is equivalent to Schedule 1.

• Schedule 2 – a concurrent schedule equivalent to Schedule 1.
Serializability
• Basic Assumption – Each transaction preserves database consistency.
• Thus serial execution of a set of transactions preserves database
consistency.
• A (possibly concurrent) schedule is serializable if it is equivalent to a
serial schedule. Different forms of schedule equivalence give rise to
the notions of:
1. Conflict serializability
2. View serializability

• Simplified view of transactions:

1. We ignore operations other than read and write instructions.
2. We assume that transactions may perform arbitrary computations on data in local buffers between reads and writes.
3. Our simplified schedules consist of only read and write instructions.
Conflict Serializability
• As discussed under concurrency control, serial schedules have poor resource utilization and low throughput. To improve this, two or more transactions are run concurrently. But concurrency of transactions may lead to inconsistency in the database. To avoid this, we need to check whether these concurrent schedules are serializable or not.

• Conflict Serializable: A schedule is called conflict serializable if it can be transformed into a serial schedule by swapping non-conflicting operations.
• Conflicting operations: Two operations are said to be conflicting if all of the following conditions are satisfied:
1. They belong to different transactions.
2. They operate on the same data item.
3. At least one of them is a write operation.
Conflict Serializability - Conflicting Instructions
• Instructions li and lj of transactions Ti and Tj respectively, conflict if and
only if there exists some item Q accessed by both li and lj, and at least
one of these instructions wrote Q.

1. li = read(Q), lj = read(Q). li and lj don’t conflict.


2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict.
4. li = write(Q), lj = write(Q). They conflict.

• Intuitively, a conflict between li and lj forces a (logical) temporal order


between them.

• If li and lj are consecutive in a schedule and they do not conflict, their results would remain the same even if they had been interchanged in the schedule.
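The three conditions above can be encoded directly. The (txn, action, item) tuple encoding, e.g. (1, "R", "Q") for r1(Q), is an assumption of this sketch:

```python
# A direct encoding of the conflict test: different transactions, same data
# item, and at least one write.

def conflicts(op1, op2):
    t1, a1, x1 = op1
    t2, a2, x2 = op2
    return (t1 != t2                      # different transactions
            and x1 == x2                  # same data item
            and "W" in (a1, a2))          # at least one is a write

print(conflicts((1, "R", "Q"), (2, "R", "Q")))   # False: read-read
print(conflicts((1, "R", "Q"), (2, "W", "Q")))   # True
print(conflicts((1, "W", "Q"), (2, "W", "Q")))   # True
print(conflicts((1, "W", "Q"), (2, "W", "B")))   # False: different items
```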
• The pair (R1(A), W2(A)) is conflicting because the operations belong to two different transactions, act on the same data item A, and one of them is a write operation.
• Similarly, (W1(A), W2(A)) and (W1(A), R2(A)) are also conflicting pairs.
• On the other hand, (R1(A), W2(B)) is non-conflicting because the operations work on different data items.
• Similarly, (W1(A), W2(B)) is non-conflicting.
Fig.: Schedule 1, intermediate steps, and Schedule 2 (transformation by swaps of non-conflicting operations)
Conflict Equivalent
• If a schedule S can be transformed into a schedule S´ by a series of swaps of non-conflicting instructions, we say that S and S´ are conflict equivalent.
• We say that a schedule S is conflict serializable if it is conflict equivalent to a serial schedule.
• A schedule which is conflict serializable is always conflict equivalent to one of the serial schedules.
Testing for Conflict Serializability
• Precedence Graph or Serialization Graph is used commonly to
test Conflict Serializability of a schedule.

• It is a directed Graph (V, E) consisting of a set of nodes


V = {T1, T2, T3……….Tn} and a set of directed edges
E = {e1, e2, e3………………em}.

• The graph contains one node for each transaction Ti. An edge ei is of the form Tj –> Tk, where Tj is the starting node and Tk is the ending node of ei. An edge ei is constructed between nodes Tj and Tk if one of the operations in Tj appears in the schedule before some conflicting operation in Tk.
• The algorithm can be written as:
• Create a node T in the graph for each participating transaction in the schedule.
• For the conflicting operations read_item(X) and write_item(X) – if a transaction Tj executes read_item(X) after Ti executes write_item(X), draw an edge from Ti to Tj in the graph.
• For the conflicting operations write_item(X) and read_item(X) – if a transaction Tj executes write_item(X) after Ti executes read_item(X), draw an edge from Ti to Tj in the graph.
• For the conflicting operations write_item(X) and write_item(X) – if a transaction Tj executes write_item(X) after Ti executes write_item(X), draw an edge from Ti to Tj in the graph.
• The schedule S is serializable if there is no cycle in the precedence graph.
• If there is no cycle in the precedence graph, it means we can construct a serial schedule S’ which is conflict equivalent to the schedule S.
• The serial schedule S’ can be found by a topological sort of the acyclic precedence graph. There can be more than one such schedule.
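The whole test can be sketched in code: build the precedence graph from a schedule of (txn, action, item) operations, then topologically sort it; if the sort cannot place every node, the graph has a cycle and the schedule is not conflict serializable. The encoding and function names are assumptions of this sketch:

```python
# Precedence-graph test for conflict serializability (sketch).

from collections import defaultdict

def precedence_graph(schedule):
    txns = {t for t, _, _ in schedule}
    edges = defaultdict(set)
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and "W" in (ai, aj):
                edges[ti].add(tj)          # conflicting pair: edge Ti -> Tj
    return txns, edges

def serial_order(schedule):
    """Topological sort of the precedence graph; None means a cycle."""
    txns, edges = precedence_graph(schedule)
    indeg = {t: 0 for t in txns}
    for t in edges:
        for u in edges[t]:
            indeg[u] += 1
    order, ready = [], [t for t in txns if indeg[t] == 0]
    while ready:
        t = ready.pop()
        order.append(t)
        for u in edges[t]:
            indeg[u] -= 1
            if indeg[u] == 0:
                ready.append(u)
    return order if len(order) == len(txns) else None

# S : r1(x) r1(y) w2(x) w1(x) r2(y) -- cyclic, not conflict serializable
S = [(1, "R", "x"), (1, "R", "y"), (2, "W", "x"), (1, "W", "x"), (2, "R", "y")]
print(serial_order(S))      # None
```

Running the same function on the acyclic schedule S1 discussed below yields the serial order T1, T3, T2.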
• For example, consider the schedule S:
S : r1(x) r1(y) w2(x) w1(x) r2(y)
• Creating the precedence graph:
1. Make two nodes corresponding to transactions T1 and T2.
2. For the conflicting pair r1(x) w2(x), where r1(x) happens before w2(x), draw an edge from T1 to T2.
3. For the conflicting pair w2(x) w1(x), where w2(x) happens before w1(x), draw an edge from T2 to T1.
• Since the graph is cyclic, we can conclude that the schedule is not conflict equivalent to any serial schedule.
• Let us try to infer a serial schedule from this graph using topological ordering.
• The edge T1 –> T2 tells us that T1 should come before T2 in the linear ordering.
• The edge T2 –> T1 tells us that T2 should come before T1 in the linear ordering.
• So we cannot fix any particular order (the graph is cyclic). Therefore, no serial schedule can be obtained from this graph.
• S1: r1(x) r3(y) w1(x) w2(y) w3(x) w2(x)

• Since the graph is acyclic, the schedule is conflict serializable. Performing a topological sort on this graph gives a possible serial schedule which is conflict equivalent to schedule S1.
• In the topological sort, we first select the node with in-degree 0, which is T1. This is followed by T3 and T2.
• So S1 is conflict serializable, since it is conflict equivalent to the serial schedule T1 T3 T2.
Problems of Concurrency
• Several problems can occur when concurrent transactions execute in
an uncontrolled manner.
1. The Lost Update Problem
2. The Temporary Update (or Dirty Read) Problem
3. The Incorrect Summary Problem

1. The Lost Update Problem: This problem occurs when two transactions that access the same database items have their operations interleaved in a way that makes the value of some database items incorrect.
Assume X = 80, N = 5, M = 4.
The final result should be 79, but the final result is 84.
• Suppose that transactions T1 and T2 are submitted at approximately the same time, and suppose that their operations are interleaved as shown. Then the final value of item X is incorrect, because T2 reads the value of X before T1 changes it in the database, and hence the updated value resulting from T1 is lost.
• For example, if X = 80 at the start (originally there were 80 reservations on the flight), N = 5 (T1 transfers 5 seat reservations from the flight corresponding to X to the flight corresponding to Y), and M = 4 (T2 reserves 4 seats on X), the final result should be X = 79; but in the interleaving of operations, it is X = 84 because the update in T1 that removed the five seats from X was lost.
• Another example of the lost update problem:
2. The Temporary Update (or Dirty Read) Problem: This problem occurs when one transaction updates a database item and then the transaction fails for some reason. The updated item is accessed by another transaction before it is changed back to its original value.
The value of item X that is read by T2 is called dirty data, because it has been created by a transaction that has not completed and committed yet; hence, this problem is also known as the dirty read problem.
3. The Incorrect Summary Problem: If one transaction is calculating an aggregate summary function on a number of records while other transactions are updating some of these records, the aggregate function may calculate some values before they are updated and others after they are updated.
• Another problem that may occur is called unrepeatable read, where a transaction T reads an item twice and the item is changed by another transaction T' between the two reads. Hence, T receives different values for its two reads of the same item.
Types of Failures
1. A computer failure (system crash): A hardware, software, or network
error occurs in the computer system during transaction execution.
2. A transaction or system error: Some operation in the transaction may
cause it to fail, such as integer overflow or division by zero. Transaction
failure may also occur because of erroneous parameter values or
because of a logical programming error.
3. Local errors or exception conditions detected by the transaction: During
transaction execution, certain conditions may occur that necessitate
cancellation of the transaction. For example, data for the transaction
may not be found. This exception should be programmed in the
transaction itself, and hence would not be considered a failure.
4. Concurrency control enforcement: The concurrency control method
may decide to abort the transaction, to be restarted later, because it
violates serializability or because several transactions are in a state of
deadlock.
5. Disk failure: Some disk blocks may lose their data because of a read
or write malfunction or because of a disk read/write head crash. This
may happen during a read or a write operation of the transaction.
6. Physical problems and catastrophes: This refers to an endless list of
problems that includes power or air-conditioning failure, fire, theft,
sabotage, overwriting disks or tapes by mistake, and mounting of a
wrong tape by the operator.
Types of Schedules
1. Serial Schedule
2. Complete Schedule
3. Recoverable Schedule
4. Cascadeless Schedule
5. Strict Schedule

1. Serial Schedule

Schedules in which the transactions are executed non-interleaved, i.e., in which no transaction starts until the running transaction has ended, are called serial schedules.

T1     | T2
R(A)   |
W(A)   |
R(B)   |
W(B)   |
       | R(A)
       | R(B)

This is a serial schedule, since the transactions perform serially in the order T1 —> T2.
2. Complete Schedule

Schedules in which the last operation of each transaction is either abort or commit are called complete schedules.

T1      | T2      | T3
R(A)    |         |
W(A)    |         |
        | R(B)    |
        | W(B)    |
commit  |         |
        | commit  |
        |         | abort
3. Recoverable Schedule

If some transaction Tj is reading a value updated or written by some other transaction Ti, then the commit of Tj must occur after the commit of Ti.

T1      | T2
R(A)    |
W(A)    |
        | W(A)
        | R(A)
commit  |
        | commit
4. Cascadeless Schedule

• Schedules in which transactions read values only after all transactions whose changes they are going to read have committed are called cascadeless schedules. This avoids a single transaction abort leading to a series of transaction rollbacks. The strategy to prevent cascading aborts is to disallow a transaction from reading uncommitted changes of another transaction in the same schedule.
• In other words, if some transaction Tj wants to read a value updated or written by some other transaction Ti, then Tj must read it only after Ti commits.
T1      | T2
R(A)    |
W(A)    |
        | W(A)
commit  |
        | R(A)
        | commit

This schedule is cascadeless, since the updated value of A is read by T2 only after the updating transaction, i.e. T1, commits.

Every cascadeless schedule is also a recoverable schedule.
5. Strict Schedule

• Tj can read or write a value updated or written by Ti only after Ti commits or aborts.
• Strict schedules are both recoverable and cascadeless.

T1      | T2
R(A)    |
        | R(A)
W(A)    |
commit  |
        | W(A)
        | R(A)
        | commit
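The definitions of recoverable, cascadeless, and strict schedules can be checked mechanically. A simplified sketch (it tracks only the most recent writer of each item, which is enough for these small examples; the encoding and function name are assumptions):

```python
def classify(schedule):
    """Return (recoverable, cascadeless, strict) for a schedule of
    (txn, action, item) tuples, action in {"R", "W", "commit", "abort"}."""
    end = {t: i for i, (t, a, _) in enumerate(schedule)
           if a in ("commit", "abort")}
    n = len(schedule)
    recoverable = cascadeless = strict = True
    last_writer = {}                      # item -> (txn, index of its write)
    for i, (t, a, x) in enumerate(schedule):
        if a in ("R", "W") and x in last_writer:
            wt, _ = last_writer[x]
            if wt != t:
                wt_end = end.get(wt, n)   # when the writer commits/aborts
                if i < wt_end:            # touching an unfinished txn's data
                    strict = False
                    if a == "R":
                        cascadeless = False
                if a == "R" and end.get(t, n) < wt_end:
                    recoverable = False   # reader commits before writer
        if a == "W":
            last_writer[x] = (t, i)
    return recoverable, cascadeless, strict

# The strict example above: R1(A) R2(A) W1(A) C1 W2(A) R2(A) C2
S = [(1, "R", "A"), (2, "R", "A"), (1, "W", "A"), (1, "commit", "A"),
     (2, "W", "A"), (2, "R", "A"), (2, "commit", "A")]
print(classify(S))    # (True, True, True)
```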
View Serializability
Two schedules S1 and S2 are said to be view equal iff the following conditions are satisfied:
1) Initial Read
If a transaction T2 reads data item A from the initial database in S1, then in S2 also T2 should read A from the initial database.
2) Updated Read
If Ti reads A which was updated by Tj in S1, then in S2 also Ti should read A as updated by Tj.

• The two schedules above are not view equal, as in S1 T3 reads A updated by T2, while in S2 T3 reads A updated by T1.
3) Final Write
If a transaction T1 performed the final update of A in S1, then in S2 also T1 should perform the final write operation.

• The two schedules above are not view equal, as the final write operation in S1 is done by T1 while in S2 it is done by T2.
• View Serializability: A schedule is called view serializable if it is view equal to a serial schedule (no overlapping transactions).

• Every conflict serializable schedule is also view serializable, but not vice versa.
Let S and S´ be two schedules with the same set of transactions. S and S´ are view equivalent if the following three conditions are met for each data item Q:

1. If in schedule S transaction Ti reads the initial value of Q, then in schedule S’ transaction Ti must also read the initial value of Q.

2. If in schedule S transaction Ti executes read(Q), and that value was produced by transaction Tj (if any), then in schedule S’ transaction Ti must also read the value of Q that was produced by the same write(Q) operation of transaction Tj.

3. The transaction (if any) that performs the final write(Q) operation in schedule S must also perform the final write(Q) operation in schedule S’.
Test for View Serializability
• The problem of checking whether a schedule is view serializable falls in the class of NP-complete problems.

• Thus, the existence of an efficient algorithm is extremely unlikely.

• NP-complete: NP-complete problems are problems whose status is unknown. No polynomial-time algorithm has yet been discovered for any NP-complete problem, nor has anybody been able to prove that no polynomial-time algorithm exists for any of them.
Concurrency Control
• Concurrency control techniques are used to ensure that the isolation (or non-interference) property of concurrently executing transactions is maintained.
• Concurrency-control protocols allow concurrent schedules, but ensure that the schedules are conflict/view serializable, and are recoverable and perhaps even cascadeless.
• These protocols do not examine the precedence graph as it is being created; instead, a protocol imposes a discipline that avoids non-serializable schedules.
• Different concurrency control protocols offer different trade-offs between the amount of concurrency they allow and the amount of overhead they impose.
Purpose of Concurrency Control
• To enforce ISOLATION.
• To preserve database consistency.
• To resolve READ-WRITE and WRITE-WRITE conflicts.

Concurrency Protocols - Different categories of protocols (ASSIGNMENTS):
Lock Based Protocol
  Basic 2-PL
  Conservative 2-PL
  Strict 2-PL
  Rigorous 2-PL
Graph Based Protocol
Time-Stamp Ordering Protocol
Multiple Granularity Protocol
Validation-Based Protocol
Lock Based Protocols
• A lock is a variable associated with a data item that describes the status of the data item with respect to the possible operations that can be applied to it. Locks synchronize access to database items by concurrent transactions. This protocol requires that all data items be accessed in a mutually exclusive manner. Let me introduce you to two common locks which are used, and some terminology followed in this protocol.
1. Shared Lock (S): also known as a Read-only lock. As the name suggests, it can be shared between transactions, because while holding this lock a transaction does not have permission to update the data item. An S-lock is requested using the lock-S instruction.

2. Exclusive Lock (X): The data item can be both read and written. This lock is exclusive, and multiple transactions cannot hold it simultaneously on the same data item. An X-lock is requested using the lock-X instruction.
Lock Compatibility Matrix

          | S held | X held
S request | yes    | no
X request | no     | no

• A transaction may be granted a lock on an item if the requested lock is compatible with locks already held on the item by other transactions.
• Any number of transactions can hold shared locks on an item, but if any transaction holds an exclusive (X) lock on the item, no other transaction may hold any lock on the item.
• If a lock cannot be granted, the requesting transaction is made to wait until all incompatible locks held by other transactions have been released. Then the lock is granted.
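The matrix can be expressed as a small lookup table; `can_grant` is an illustrative name, not part of any real lock manager:

```python
# Lock compatibility sketch: any number of S-locks may coexist; an X-lock
# excludes everything else.

COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

def can_grant(requested, held):
    """held: lock modes other transactions already hold on the item."""
    return all(COMPATIBLE[(h, requested)] for h in held)

print(can_grant("S", ["S", "S"]))   # True: shared locks coexist
print(can_grant("X", ["S"]))        # False: must wait for the S-lock
print(can_grant("S", ["X"]))        # False: X excludes all other locks
print(can_grant("X", []))           # True: no locks held
```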
• Upgrade / Downgrade locks: A transaction that holds a lock on an item A is allowed under certain conditions to change the lock state from one state to another.
• Upgrade: S(A) can be upgraded to X(A) if Ti is the only transaction holding an S-lock on element A.
• Downgrade: We may downgrade X(A) to S(A) when we no longer want to write data item A. As we were holding an X-lock on A, we need not check any conditions.
• Applying simple locking, we may not always produce serializable results; it may also lead to deadlock and inconsistency.
• Consider the partial schedule:

    T1          | T2
1   lock-X(B)   |
2   read(B)     |
3   B := B - 50 |
4   write(B)    |
5               | lock-S(A)
6               | read(A)
7               | lock-S(B)
8   lock-X(A)   |
9   ……          | ……

• Deadlock – consider the above execution phase. Now T1 holds an exclusive lock on B, and T2 holds a shared lock on A. At statement 7, T2 requests a lock on B, while at statement 8 T1 requests a lock on A. As you may notice, this imposes a deadlock, as neither can proceed with its execution.
• Starvation is also possible if the concurrency-control manager is badly designed.
• For example, a transaction may be waiting for an X-lock on an item while a sequence of other transactions request and are granted an S-lock on the same item. This may be avoided if the concurrency-control manager is properly designed.
Deadlock, Starvation in DBMS
• Deadlock: A system is in a deadlock state if there exists a set of
transactions such that every transaction in the set is waiting for
another transaction in the set. In a database, a deadlock is an
unwanted situation in which two or more transactions are waiting
indefinitely for one another to give up locks.

• Starvation: A transaction is starved if it cannot proceed for an


indefinite period of time while other transactions in the system
continue normally.

Deadlock in DBMS - Example
• Suppose transaction T1 holds a lock on some rows in the Students table and needs to update some rows in the Grades table. Simultaneously, transaction T2 holds locks on those very rows (which T1 needs to update) in the Grades table but needs to update the rows in the Students table held by transaction T1.

• Now the main problem arises. Transaction T1 will wait for transaction T2 to give up its lock, and similarly transaction T2 will wait for transaction T1 to give up its lock. As a consequence, all activity comes to a halt and remains at a standstill forever unless the DBMS detects the deadlock and aborts one of the transactions.
Deadlock in DBMS – Deadlock Avoidance
• When a database is stuck in a deadlock, it is always better to avoid the deadlock rather than restart or abort the database.
• The deadlock avoidance method is suitable for a smaller database, whereas the deadlock prevention method is suitable for a larger database.
• One method of avoiding deadlock is using application-consistent logic. In the example given above, transactions that access Students and Grades should always access the tables in the same order. In this way, in the scenario described above, transaction T1 simply waits for transaction T2 to release the lock on Grades before it begins. When transaction T2 releases the lock, transaction T1 can proceed freely.
• Another method for avoiding deadlock is to apply both a row-level locking mechanism and the READ COMMITTED isolation level. However, this does not guarantee the complete removal of deadlocks.

Read Committed – This isolation level guarantees that any data read is committed at the moment it is read; thus it does not allow dirty reads. The transaction holds a read or write lock on the current row, and thus prevents other transactions from reading, updating or deleting it.
Deadlock in DBMS – Deadlock Detection
• When a transaction waits indefinitely to obtain a lock, the database management system should detect whether the transaction is involved in a deadlock or not.

• The wait-for graph is one of the methods for detecting a deadlock situation. This method is suitable for a smaller database. In this method, a graph is drawn based on the transactions and their locks on resources. If the graph has a closed loop or cycle, then there is a deadlock.
Deadlock in DBMS – Deadlock Detection

80
Deadlock in DBMS – Deadlock Detection
Wait-for-graph:
• When a transaction Ti requests for a lock on an item, say X, which is
held by some other transaction Tj, a directed edge is created from Ti
to Tj. If Tj releases item X, the edge between them is dropped and Ti
locks the data item.

81
Deadlock in DBMS – Deadlock Detection
• For example:

Transaction   Data Item   Lock Mode
T1            Q           Shared
T2            P           Exclusive
T2            Q           Exclusive
T3            Q           Shared
T4            P           Exclusive

Wait-for graph: T2 → T1 and T4 → T2 (no cycle, so no deadlock).
82
Deadlock in DBMS – Deadlock Detection
• Steps:
1. Initially draw a vertex for T1, which holds Q in Shared mode.
2. Draw a vertex for T2 (P, Q) – T2 holds P in Exclusive mode and requests Q in Exclusive mode. But T1 has already locked Q in Shared mode, and the Shared–Exclusive entry of the compatibility matrix shows the lock cannot be granted, so T2 must wait for resource Q held by T1: draw an edge from T2 to T1.
3. T3 requests Q in Shared mode. The Shared–Shared entry of the compatibility matrix allows the lock, so T3 does not wait and has no dependency edge.
4. As in Step 2, T4 waits for resource P held by T2: draw an edge from T4 to T2.
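The edges derived in the steps above can be checked for a cycle mechanically. A minimal illustrative Python sketch (not part of any real DBMS; the graph is a plain adjacency dict):

```python
# Wait-for graph for the example above: an edge u -> v means
# "transaction u waits for transaction v".
edges = {"T2": ["T1"], "T4": ["T2"], "T1": [], "T3": []}

def has_cycle(graph):
    """Detect a cycle with depth-first search (white/grey/black colouring)."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def dfs(node):
        colour[node] = GREY
        for nxt in graph.get(node, []):
            if colour[nxt] == GREY:        # back edge => cycle => deadlock
                return True
            if colour[nxt] == WHITE and dfs(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and dfs(n) for n in graph)

print(has_cycle(edges))      # False: no deadlock yet
edges["T1"].append("T4")     # suppose T1 now also waits for T4
print(has_cycle(edges))      # True: cycle T1 -> T4 -> T2 -> T1
```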

83
Deadlock in DBMS – Deadlock Detection
• Suppose there is another request
T4, Q, Exclusive
Then an edge is drawn from T4 to T1, since T1 holds a Shared lock on Q and T4 wants an Exclusive lock on Q, which the compatibility matrix does not allow; hence T4 becomes dependent on T1.

84
Deadlock in DBMS – Deadlock Prevention
Wait-Die Scheme –
In this scheme, if a transaction requests a resource that is locked by another transaction, the DBMS simply checks the timestamps of both transactions and allows the older transaction to wait until the resource is available. Suppose there are two transactions T1 and T2, and let the timestamp of any transaction T be TS(T). If T2 holds a lock on some resource and T1 requests that resource, the DBMS performs the following check:
If TS(T1) < TS(T2) – that is, T1 is the older transaction – then T1 is allowed to wait until the resource is available. Conversely, if the older transaction holds a resource and a younger transaction requests it, the younger transaction is killed and restarted later with a minute delay but with the same timestamp.
This scheme allows the older transaction to wait but kills the younger one.
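The decision rule fits in a few lines. An illustrative Python sketch (timestamps are plain integers here, with a smaller value meaning an older transaction):

```python
def wait_die(requester_ts, holder_ts):
    """Wait-Die: the older requester waits; the younger requester dies.
    Returns the action taken by the requesting transaction."""
    if requester_ts < holder_ts:      # requester is older: it may wait
        return "wait"
    else:                             # requester is younger: it dies
        return "abort_and_restart"    # restarted later with the SAME timestamp

# T1 (ts=5, older) asks for a resource held by T2 (ts=9): T1 waits.
print(wait_die(5, 9))   # wait
# T2 (ts=9, younger) asks for a resource held by T1 (ts=5): T2 dies.
print(wait_die(9, 5))   # abort_and_restart
```

Restarting the victim with its original timestamp is what eventually makes it the oldest waiter, so it cannot starve forever.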

85
Deadlock in DBMS – Deadlock Prevention
Wound-Wait Scheme –
In this scheme, if an older transaction requests a resource held by a younger transaction, the older transaction forces the younger one to abort and release the resource; the younger transaction is then restarted with a minute delay but with the same timestamp. If a younger transaction requests a resource held by an older one, the younger transaction is asked to wait until the older one releases it.
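Wound-Wait is the mirror image of Wait-Die, which a short illustrative sketch makes clear (same integer-timestamp convention as before):

```python
def wound_wait(requester_ts, holder_ts):
    """Wound-Wait: an older requester wounds (aborts) the younger holder;
    a younger requester simply waits."""
    if requester_ts < holder_ts:      # requester older: wound the holder
        return "holder_aborted"       # holder restarts with the same timestamp
    else:                             # requester younger: it waits
        return "wait"

# Older T1 (ts=5) requests a resource held by younger T2 (ts=9).
print(wound_wait(5, 9))   # holder_aborted
# Younger T2 (ts=9) requests a resource held by older T1 (ts=5).
print(wound_wait(9, 5))   # wait
```

In both schemes the older transaction is never aborted, so every transaction eventually becomes old enough to finish.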

86
Deadlock in DBMS – Deadlock Prevention
Time-Out Based Scheme – (based on lock timeouts)

A transaction that has requested a lock waits for at most a specified amount of time. If the lock is not granted within that time, the transaction is said to time out; it rolls itself back and restarts.

For example, OTP submission while performing an online transaction.

• Deadlocks are not possible.

• Simple to implement, but starvation is possible. It is also difficult to determine a good value for the timeout interval.
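Python's threading.Lock supports exactly this pattern through the timeout argument of acquire(). The sketch below simulates a transaction that gives up and rolls itself back when the lock is not granted in time (illustrative only; the 0.1 s timeout is arbitrary):

```python
import threading

lock = threading.Lock()
lock.acquire()                     # another "transaction" already holds the lock

# Our transaction waits at most 0.1 s for the lock; on timeout it
# rolls itself back and restarts instead of waiting forever.
granted = lock.acquire(timeout=0.1)
if not granted:
    action = "rollback_and_restart"
else:
    action = "proceed"
print(action)   # rollback_and_restart
```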
87
Deadlock in DBMS – Deadlock Recovery
When a deadlock is detected:

• Some transaction will have to be rolled back (made a victim) to break the deadlock. Select as victim the transaction that will incur the minimum cost.

• Rollback – determine how far to roll the transaction back:
• Total rollback: abort the transaction and then restart it.
• It is more effective to roll back the transaction only as far as necessary to break the deadlock.

• Starvation happens if the same transaction is always chosen as the victim. Include the number of rollbacks in the cost factor to avoid starvation.
88
Deadlock in DBMS – Starvation
• Starvation is the situation in which a transaction has to wait for an indefinite period of time to acquire a lock.

• Reasons for starvation –
1. The waiting scheme for locked items is unfair (e.g. a priority queue).
2. Victim selection (the same transaction is selected as the victim repeatedly).

89
Deadlock in DBMS – Starvation Example
• Suppose there are 3 transactions, namely T1, T2, and T3, in a database that are trying to acquire a lock on data item ‘X’. Suppose the scheduler grants the lock to T1 (perhaps due to some priority), and the other two transactions wait for the lock. As soon as the execution of T1 is over, another transaction T4 comes along and requests a lock on data item X. This time the scheduler grants the lock to T4, and T2 and T3 have to wait again. If new transactions keep requesting the lock in this way, T2 and T3 may have to wait for an indefinite period of time, which leads to starvation.

90
Deadlock in DBMS – Solutions To Starvation
• Increasing priority – Starvation occurs when a transaction has to wait for an indefinite time; in this situation we can increase the priority of that particular transaction. The drawback of this solution is that other transactions may then have to wait even longer while the highest-priority transaction proceeds.

• Modification of the victim-selection algorithm – If a transaction has been a victim of repeated selections, the algorithm can be modified to lower its priority as a victim relative to other transactions.

• First Come First Served approach – A fair scheduling approach, i.e. FCFS, can be adopted, in which transactions acquire a lock on an item in the order in which they requested the lock.

• Wait-die and wound-wait schemes – These schemes use the timestamp-ordering mechanism of transactions.
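The FCFS idea can be sketched as a tiny lock-table entry whose requests are granted strictly in arrival order (an illustrative Python sketch; the class and method names are invented for this example):

```python
from collections import deque

class FCFSLock:
    """Minimal FCFS lock: requests are granted strictly in arrival order,
    so no waiting transaction can be overtaken indefinitely."""
    def __init__(self):
        self.queue = deque()
        self.holder = None

    def request(self, txn):
        self.queue.append(txn)     # newcomers always join the BACK
        self._grant()

    def release(self):
        self.holder = None
        self._grant()

    def _grant(self):
        if self.holder is None and self.queue:
            self.holder = self.queue.popleft()

lock = FCFSLock()
for t in ("T1", "T2", "T3"):
    lock.request(t)
print(lock.holder)        # T1 holds the lock; T2, T3 are queued
lock.request("T4")        # the late arrival T4 cannot jump the queue
lock.release()
print(lock.holder)        # T2, not T4: no starvation
```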
91
Deadlock in DBMS – Lock Based Protocols
• Implementing this lock system without any restrictions gives us the simple lock-based protocol (or binary locking), but it has its own disadvantage: it does not guarantee serializability. Schedules may follow the preceding rules and yet a non-serializable schedule may result.

• Transactions must therefore follow some additional protocol concerning the positioning of locking and unlocking operations. This is where the concept of Two-Phase Locking (2PL) comes into the picture; 2PL ensures serializability.

92
Deadlock in DBMS – Lock Based Protocols
• Two Phase Locking –
• A transaction is said to follow Two Phase Locking protocol if Locking
and Unlocking can be done in two phases.
• Growing Phase: New locks on data items may be acquired but none
can be released.
• Shrinking Phase: Existing locks may be released but no new locks can
be acquired.
• Note – If lock conversion is allowed, then upgrading a lock (from S(a) to X(a)) is allowed only in the growing phase, and downgrading a lock (from X(a) to S(a)) must be done in the shrinking phase.
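Whether a transaction's schedule obeys the two-phase rule can be checked mechanically. A minimal illustrative sketch (the ("lock"/"unlock", item) operation format is invented for this example):

```python
def check_two_phase(ops):
    """ops is a sequence of ("lock", item) / ("unlock", item) actions.
    Returns True iff no lock is acquired after the first unlock,
    i.e. the growing phase strictly precedes the shrinking phase."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True              # shrinking phase has begun
        elif action == "lock" and shrinking:
            return False                  # a lock after an unlock violates 2PL
    return True

legal   = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
illegal = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
print(check_two_phase(legal))     # True
print(check_two_phase(illegal))   # False
```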

93
Crash Recovery – Failure Classification
• Transaction failure :
• Logical errors: transaction cannot complete due to some internal
error condition
• System errors: the database system must terminate an active
transaction due to an error condition (e.g., deadlock)

• System crash: a power failure or other hardware or software failure


causes the system to crash.
• Fail-stop assumption: non-volatile storage contents are assumed
to not be corrupted by system crash
• Database systems have numerous integrity checks to prevent
corruption of disk data

94
Crash Recovery – Failure Classification
• Disk failure: a head crash or similar disk failure destroys all or part of disk
storage
• Destruction is assumed to be detectable: disk drives use
checksums to detect failures

95
Crash Recovery – Storage Structure
• Volatile storage:
• Does not survive system crashes
• Examples: main memory, cache memory

• Non-volatile storage:
• Survives system crashes
• Examples: disk, tape, flash memory, non-volatile (battery backed up) RAM

• Stable storage:
• A mythical form of storage that survives all failures
• Information residing in stable storage is never lost
• Approximated by maintaining multiple copies on distinct non-volatile media
96
Crash Recovery – Data Access

• Physical blocks are those blocks residing on the disk.


• Buffer blocks are the blocks residing temporarily in main memory.
• Block movements between disk and main memory are initiated
through the following two operations:
• input(A) transfers the physical block A to main memory.
• output(B) transfers the buffer block B to the disk, and replaces the
appropriate physical block there.
97
Crash Recovery – Recovery
• An integral part of a database system is a recovery scheme that can
restore the database to the consistent state that existed before the
failure.

98
Crash Recovery – Log-Based Recovery
• The most widely used structure for recording database modifications
is the log. The log is a sequence of log records, recording all the
update activities in the database.
• There are several types of log records. An update log record describes
a single database write. It has these fields:
Transaction identifier is the unique identifier of the transaction that
performed the write operation.
Data-item identifier is the unique identifier of the data item written.
Typically, it is the location on disk of the data item.
Old value is the value of the data item prior to the write.
New value is the value that the data item will have after the write.
99
Crash Recovery – Log-Based Recovery
• A log is kept on stable storage.
• The log is a sequence of log records, and maintains a record of update
activities on the database.
• When transaction Ti starts, it registers itself by writing a <Ti start> log
record
• Before Ti executes write(X), a log record <Ti, X, V1, V2> is written,
where V1 is the value of X before the write, and V2 is the value to be
written to X.
• Log record notes that Ti has performed a write on data item Xj, Xj
had value V1 before the write, and will have value V2 after the write

100
Crash Recovery – Log-Based Recovery
• When Ti finishes its last statement, the log record <Ti commit> is written.
• We assume for now that log records are written directly to stable
storage (that is, they are not buffered)
• Two approaches using logs
Deferred database modification
Immediate database modification

101
Crash Recovery – Deferred Database
Modification
• The deferred-modification technique ensures transaction atomicity by
recording all database modifications in the log, but deferring the
execution of all write operations of a transaction until the transaction
partially commits.
• Recall that a transaction is said to be partially committed once the
final action of the transaction has been executed. The version of the
deferred-modification technique that we describe in this section
assumes that transactions are executed serially.

103
Crash Recovery – Deferred Database
Modification
• The deferred database modification scheme records all
modifications to the log, but defers all the writes to after partial
commit.
• Assume that transactions execute serially
• Transaction starts by writing <Ti start> record to log.
• A write(X) operation results in a log record <Ti, X, V> being written,
where V is the new value for X

• Note: old value is not needed for this scheme

104
Crash Recovery – Deferred Database
Modification
• The write is not performed on X at this time, but is deferred.
• When Ti partially commits, <Ti commit> is written to the log
• Finally, the log records are read and used to actually execute the
previously deferred writes.
• During recovery after a crash, a transaction needs to be redone if and only if both <Ti start> and <Ti commit> are in the log.
• Redoing a transaction Ti ( redo Ti) sets the value of all data items
updated by the transaction to the new values.
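The redo-only recovery rule can be sketched as follows (illustrative Python; the log-record tuples mirror the <Ti start>, <Ti, X, V>, <Ti commit> formats above, and the values come from the T0/T1 running example):

```python
# Deferred modification: during recovery, redo Ti iff both <Ti start>
# and <Ti commit> appear in the log; otherwise its records are ignored.
def recover_deferred(log, db):
    committed = {t for (t, *rest) in log
                 if rest and rest[0] == "commit"}
    for rec in log:
        if len(rec) == 3:                  # <Ti, X, V>: a deferred write record
            txn, item, new_value = rec
            if txn in committed:
                db[item] = new_value       # redo: set item to its new value
    return db

# T0 committed; T1 did not (the crash occurred before <T1 commit>).
log = [("T0", "start"), ("T0", "A", 950), ("T0", "B", 2050), ("T0", "commit"),
       ("T1", "start"), ("T1", "C", 600)]
db = {"A": 1000, "B": 2000, "C": 700}
print(recover_deferred(log, db))
# {'A': 950, 'B': 2050, 'C': 700}  -- only T0 is redone; T1's write is ignored
```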

105
Crash Recovery – Deferred Database
Modification
• Crashes can occur while
The transaction is executing the original updates, or
While recovery action is being taken

For example,
• Value of A = 1000, B = 2000, C = 700
• Transactions T0 and T1 (T0 executes before T1)

106
Crash Recovery – Deferred Database
Modification

107
Crash Recovery – Deferred Database
Modification

• If log on stable storage at time of crash is as in case:


(a) No redo actions need to be taken
(b) redo(T0) must be performed since <T0 commit> is present
(c) redo(T0) must be performed followed by redo(T1) since
<T0 commit> and <T1 commit> are present
108
Crash Recovery – Immediate Database
Modification
• The immediate-modification technique allows database modifications
to be output to the database while the transaction is still in the active
state. Data modifications written by active transactions are called
uncommitted modifications.
• In the event of a crash or a transaction failure, the system must use
the old-value field of the log records to restore the modified data
items to the value they had prior to the start of the transaction. The
undo operation accomplishes this restoration.

109
Crash Recovery – Immediate Database
Modification
• The immediate database modification scheme allows database
updates of an uncommitted transaction to be made as the writes are
issued
● since undoing may be needed, update logs must have both old
value and new value
• Update log record must be written before database item is written
● We assume that the log record is output directly to stable storage

• Output of updated blocks can take place at any time before or after
transaction commit
• Order in which blocks are output can be different from the order in
which they are written.
110
Crash Recovery – Immediate Database
Modification

• Recovery procedure has two operations instead of one:

● undo(Ti) restores the value of all data items updated by Ti to their


old values, going backwards from the last log record for Ti

● redo(Ti) sets the value of all data items updated by Ti to the new
values, going forward from the first log record for Ti

111
Crash Recovery – Immediate Database
Modification
• When recovering after failure:
• Transaction Ti needs to be undone if the log contains the record
<Ti start>, but does not contain the record <Ti commit>.
• Transaction Ti needs to be redone if the log contains both the record
<Ti start> and the record <Ti commit>.
• Undo operations are performed first, then redo operations.
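The undo-then-redo procedure can be sketched the same way, now with <Ti, X, V1, V2> records carrying both old and new values (illustrative Python using the running T0/T1 example):

```python
# Immediate modification recovery: undo uncommitted transactions first
# (backwards, restoring old values), then redo committed ones (forwards).
def recover_immediate(log, db):
    started   = {r[0] for r in log if r[1] == "start"}
    committed = {r[0] for r in log if r[1] == "commit"}
    # undo(Ti): scan backwards, restoring OLD values of uncommitted txns
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] in started - committed:
            _txn, item, old, _new = rec
            db[item] = old
    # redo(Ti): scan forwards, reapplying NEW values of committed txns
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _txn, item, _old, new = rec
            db[item] = new
    return db

# T0 committed; T1 crashed mid-flight after writing C.
log = [("T0", "start"), ("T0", "A", 1000, 950), ("T0", "B", 2000, 2050),
       ("T0", "commit"), ("T1", "start"), ("T1", "C", 700, 600)]
db = {"A": 950, "B": 2050, "C": 600}   # state on disk at crash time
print(recover_immediate(log, db))
# {'A': 950, 'B': 2050, 'C': 700}  -- undo(T1) restores C, redo(T0) keeps A, B
```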

112
Crash Recovery – Immediate Database
Modification

Recovery actions in each case above are:


(a) undo (T0): B is restored to 2000 and A to 1000.
(b) undo (T1) and redo (T0): C is restored to 700, and then A and B are
set to 950 and 2050 respectively.
(c) redo (T0) and redo (T1): A and B are set to 950 and 2050
respectively. Then C is set to 600
113
Crash Recovery – Checkpoints
• Problems in the recovery procedure discussed earlier:
1. Searching the entire log is time-consuming.
2. We might unnecessarily redo transactions that have already output their updates to the database.

• Streamline recovery procedure by periodically performing


checkpointing
1. Output all log records currently residing in main memory onto
stable storage.
2. Output all modified buffer blocks to the disk.
3. Write a log record < checkpoint> onto stable storage
114
Crash Recovery – Checkpoints
• Transactions are not allowed to perform any update actions, such as
writing to a buffer block or writing a log record, while a checkpoint is
in progress.

115
Crash Recovery – Checkpoints
• During recovery we need to consider only the most recent transaction Ti that started before the checkpoint, and transactions that started after Ti.
1. Scan backwards from end of log to find the most recent
<checkpoint> record
2. Continue scanning backwards till a record <Ti start> is found.
3. Need only consider the part of log following above start record. Earlier
part of log can be ignored during recovery, and can be erased whenever
desired.
4. For all transactions (starting from Ti or later) with no <Ti commit>,
execute undo(Ti). (Done only in case of immediate modification.)
5. Scanning forward in the log, for all transactions starting
from Ti or later with a <Ti commit>, execute redo(Ti).
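A simplified sketch of the checkpoint-based scan (illustrative Python; for brevity it assumes no transaction was active across the checkpoint, so only the log after the <checkpoint> record needs to be examined):

```python
# Checkpoint-aware recovery sketch: only the log after the most recent
# <checkpoint> record matters; earlier records can be ignored entirely.
def recover_with_checkpoint(log):
    # Steps 1-2: scan backwards to the most recent checkpoint record.
    cp = max(i for i, rec in enumerate(log) if rec == ("checkpoint",))
    tail = log[cp:]
    # Steps 4-5: classify transactions in the tail of the log.
    started   = {r[0] for r in tail if len(r) == 2 and r[1] == "start"}
    committed = {r[0] for r in tail if len(r) == 2 and r[1] == "commit"}
    return {"redo": sorted(started & committed),   # have <Ti commit>
            "undo": sorted(started - committed)}   # no <Ti commit>

log = [("T0", "start"), ("T0", "commit"),
       ("checkpoint",),
       ("T1", "start"), ("T1", "commit"),
       ("T2", "start")]                     # crash here
print(recover_with_checkpoint(log))
# {'redo': ['T1'], 'undo': ['T2']}  -- T0 precedes the checkpoint: ignored
```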
116
Crash Recovery – Shadow Paging
• Requires fewer disk accesses than do log-based methods.
• Maintain two page tables during the lifetime of a transaction.
• When the transaction starts, both page tables are identical.
• The shadow page table is never changed over the duration of the transaction.
• The current page table may change during write operations.
• All input and output operations use the current page table to locate database pages on disk.
• The shadow page table is stored in non-volatile storage.

117
Crash Recovery – Shadow Paging
• Note: When the transaction commits, the system writes the current page table to non-volatile storage. The current page table then becomes the new shadow page table.

• Note: When the system comes back up for further transactions, it copies the shadow page table into main memory and uses it as the current page table.
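The copy-on-write behaviour can be sketched in a few lines (illustrative Python; the page numbering and the commit step are greatly simplified):

```python
# Shadow paging sketch: the shadow page table is frozen for the whole
# transaction; writes allocate fresh pages via the current page table;
# commit is the atomic swap of the two tables.
pages = {0: "A=1000", 1: "B=2000"}          # "disk" pages
shadow_table = {"A": 0, "B": 1}             # never changes during the txn
current_table = dict(shadow_table)          # starts identical to the shadow

def write(item, value):
    new_page = max(pages) + 1               # write to a brand-new page
    pages[new_page] = f"{item}={value}"
    current_table[item] = new_page          # only the current table moves

write("A", 950)
print(shadow_table)    # {'A': 0, 'B': 1}  -- old data still reachable
print(current_table)   # {'A': 2, 'B': 1}

shadow_table = dict(current_table)          # commit: current becomes shadow
print(pages[shadow_table["A"]])             # A=950
```

If a crash occurs before the commit step, the untouched shadow table still points at the old pages, which is why no undo/redo pass is needed.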

118
Crash Recovery – Shadow Paging
• Advantages of shadow paging over log-based techniques:
• Log-record overhead is removed.
• Faster recovery (no UNDO/REDO operations).
• Drawbacks of shadow paging:
• Commit overhead – (actual data blocks, the current page table, and the disk address of the current page table must all be written).
• Data fragmentation – the locality property is lost; shadow paging causes database pages to change location.
• Garbage collection – when a transaction commits, database pages containing the old versions of data changed by the transaction become inaccessible.
119
Challenges of database security
• 1. Data quality –
• The database community needs techniques and organizational solutions to assess the quality of data. These may include simple mechanisms such as quality stamps posted on websites. We also need techniques that provide more effective integrity-semantics verification tools for assessing data quality, based on techniques such as record linkage.
• We also need application-level recovery techniques to automatically repair incorrect data.
• The ETL (Extract, Transform, Load) tools widely used for loading data into data warehouses are presently grappling with these issues.
120
Challenges of database security
• 2. Intellectual property rights –
• As the use of the Internet and intranets increases day by day, the legal and informational aspects of data are becoming major concerns for many organizations. To address these concerns, watermarking techniques are used, which help protect content from unauthorized duplication and distribution by giving provable ownership of the content.
• Traditionally they depend on the availability of a large domain within which the objects can be altered while retaining their essential properties.
• However, research is needed to assess the robustness of such techniques and to investigate different approaches aimed at preventing intellectual-property-rights violations.
121
Challenges of database security
• 3. Database survivability –
• Database systems need to operate and continue their functions, even with reduced capabilities, despite disruptive events such as information-warfare attacks.
• A DBMS, in addition to making every effort to prevent an attack and to detect one in the event of occurrence, should be able to do the following:
• Confinement:
• Take immediate action to eliminate the attacker’s access to the system and to isolate or contain the problem to prevent further spread.
• Damage assessment:
• Determine the extent of the problem, including failed functions and corrupted data.

122
Challenges of database security
• 3. Database survivability –
• Recover:
• Recover corrupted or lost data and repair or reinstall failed functions to re-establish a normal level of operation.
• Reconfiguration:
• Reconfigure to allow operation to continue in a degraded mode while recovery proceeds.
• Fault treatment:
• To the extent possible, identify the weaknesses exploited in the attack and take steps to prevent a recurrence.
123
