Unit 4
TRANSACTIONS AND CONCURRENCY MANAGEMENT
Transactions - Concurrent Transactions - Locking Protocol - Serialisable Schedules - Two Phase Locking
Database Recovery and Security: Database Recovery meaning - Kinds of failures - Failure controlling
methods - Database errors - Backup & Recovery Techniques - Security & Integrity - Database Security -
Authorization.
Transactions
A transaction is an execution of a program and is seen by the DBMS as a series or list of actions. It differs
from an ordinary program in that it results from the execution of a program written in a high-level data
manipulation language or programming language. A transaction is delimited by the statements
“begin transaction” and “end transaction”.
In a transaction, access to the database is accomplished by two operations.
i Read(x).
ii Write(x).
The first performs the reading of data item x from the database, whereas the second
performs the writing of data item x to the database. Consider a transaction Ti which transfers
100/- from account “A” to account “B”. This transaction is as follows:
Ti :
Read(A);
A := A - 100;
Write(A);
Read(B);
B := B + 100;
Write(B);
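As a rough sketch, the transfer performed by Ti could be written against SQLite using Python's standard sqlite3 module. The table name account, its columns and the helper functions transfer and balances are assumptions made for illustration; they do not come from these notes.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to account `dst` as one transaction."""
    # `with conn` commits if the block succeeds and rolls back if it raises,
    # so either both updates take effect or neither does.
    with conn:
        # Corresponds to: Read(A); A := A - amount; Write(A)
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        # Corresponds to: Read(B); B := B + amount; Write(B)
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )


def balances(conn):
    """Return {account name: balance} for every row (illustrative helper)."""
    return dict(conn.execute("SELECT name, balance FROM account"))


# Hypothetical schema and starting data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 1000)])
conn.commit()

transfer(conn, "A", "B", 100)
print(balances(conn))
```

The total A+B is the same before and after the call, which is the consistency requirement a transfer transaction is expected to preserve.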
hold the modified data. If a transaction commits but the system fails before the data could be written on to
the disk, then that data will be updated once the system springs back into action.
Transaction States
1. Active State This is the first state in the life cycle of a transaction. A transaction is said to be in the
active state as long as its instructions are being executed. All the changes made by the transaction at this
point are stored in the buffer in main memory.
2. Partially Committed State After the last instruction of the transaction has executed, it enters the
partially committed state. It is not considered fully committed because all the changes made by the
transaction are still stored in the buffer in main memory.
3. Committed State After all the changes made by the transaction have been successfully stored into the
database, it enters into a committed state. Now, the transaction is considered to be fully committed.
4. Failed State When a transaction is getting executed in the active state or partially committed state and
some failure occurs due to which it becomes impossible to continue the execution, it enters into a failed
state.
5. Aborted State After the transaction has failed and entered into a failed state, all the changes made by
it have to be undone. To undo the changes made by the transaction, it becomes necessary to roll back the
transaction. After the transaction has rolled back completely, it enters into an aborted state.
6. Terminated State This is the last state in the life cycle of a transaction. After entering the committed
state or aborted state, the transaction finally enters into a terminated state where its life cycle finally comes
to an end.
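The six states and the moves between them can be sketched as a small state machine. The class and variable names below (State, ALLOWED, Transaction, move_to) are illustrative assumptions; the transition table simply encodes the descriptions given above.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"
    TERMINATED = "terminated"


# Transitions as described in the notes: each state maps to the states
# that may follow it.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.COMMITTED: {State.TERMINATED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: {State.TERMINATED},
    State.TERMINATED: set(),
}


class Transaction:
    """Tracks only the life-cycle state, not the data (illustrative)."""

    def __init__(self):
        self.state = State.ACTIVE
        self.history = [State.ACTIVE]

    def move_to(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition: {self.state.value} -> {new_state.value}"
            )
        self.state = new_state
        self.history.append(new_state)


# A transaction that completes normally.
ok = Transaction()
for s in (State.PARTIALLY_COMMITTED, State.COMMITTED, State.TERMINATED):
    ok.move_to(s)

# A transaction that fails while active and is rolled back.
bad = Transaction()
for s in (State.FAILED, State.ABORTED, State.TERMINATED):
    bad.move_to(s)
```

Trying to move a committed transaction to the failed state raises ValueError in this sketch, reflecting that only the active and partially committed states can lead to failure.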
Concurrency Control
The coordination of the simultaneous execution of transactions in a multi user database system is
known as concurrency control.
Need for Concurrency Control
The objective of concurrency control is to ensure the serializability of transactions in a multi-user database
environment. Concurrency control is important because the simultaneous execution of transactions over a
shared database can create several data integrity and consistency problems. The three main problems are
a) Lost updates b) Uncommitted data c) Inconsistent retrievals
a) Lost updates:- The lost update problem occurs when two concurrent transactions T1 and T2 are
updating the same data element and one of the updates is lost (overwritten by the other transaction).
b) Uncommitted data:- Uncommitted data occurs when two transactions T1 and T2 are executed
concurrently and the first transaction (T1) is rolled back after the second transaction (T2) has already
accessed the uncommitted data. This violates the isolation property of transactions.
c) Inconsistent retrievals:- Inconsistent retrievals occur when a transaction accesses data before and after
another transaction finishes working with that data.
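The uncommitted data problem can be illustrated with a few lines of Python. This is only a sketch: the dictionary committed stands in for the database, t1_buffer stands in for T1's uncommitted changes, and the amounts are made up for the example.

```python
# Committed value of data item A, plus a buffer for T1's uncommitted change.
# Both structures are illustrative stand-ins, not a real DBMS.
committed = {"A": 1000}
t1_buffer = {}

# T1 writes A = 900 but has not committed yet.
t1_buffer["A"] = committed["A"] - 100

# T2 reads T1's uncommitted value and computes with it.
t2_seen = t1_buffer.get("A", committed["A"])
t2_result = t2_seen - 50

# T1 is rolled back: its buffered change is discarded.
t1_buffer.clear()

print("value T2 read:", t2_seen)
print("committed value:", committed["A"])
```

T2 has worked with a value of A that, after the rollback, never officially existed in the database.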
Serializability
A serial schedule executes all transactions one after another, in a non-preemptive manner. Serializability
ensures that a concurrent schedule produces the same result as some serial schedule.
In a serial schedule, there is no question of sharing a single data item among many transactions,
because not more than a single transaction is executing at any point of time. However, a serial schedule is
inefficient in the sense that the transactions suffer from longer waiting and response times, as
well as low resource utilization. Let us consider two transactions T1 and T2 and
two accounts A and B, each containing Rs 1000/-.
T1: a transaction that transfers Rs 100/- from account A to account B.
T2: a new transaction that transfers Rs 50/- from account A to account B.
If we prepare a serial schedule, then either T1 will completely finish before T2 can begin, or T2 will
completely finish before T1 can begin. Suppose that in schedule-1 the transactions are executed serially in
one order and in schedule-2 the transactions are executed serially in the other order.
In schedules 1 and 2 above the transactions are executed serially; the final amount in account A is Rs. 850
and in account B is Rs. 1150. After the execution, the total amount (A+B) is calculated to check consistency.
The sum A+B is Rs. 2000, the same as before execution, so both schedules produce a consistent result.
In schedules 3 and 4 above the transactions are executed concurrently; the final amount in account A is Rs.
850 and in account B is Rs. 1150. The sum A+B is Rs. 2000, so they also produce a consistent result. But let
us take another example where a wrong concurrent schedule can bring about disaster. Consider the
following example involving the same T1 and T2.
Schedule S5 results in an inconsistent state, because the switch to T2 is made at the second instruction
of T1, before T1 has written A. T2's update to A is therefore overwritten when T1 later writes its own value.
If accounts A and B both contain Rs 1000/- at the start, this schedule leaves Rs 900/- in A and Rs 1150/- in B.
The sum A+B is Rs. 2050. This is an inconsistent state.
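The figures quoted for the serial schedules and for schedule S5 can be checked with a small schedule interpreter. This is a sketch under assumptions: the step encoding (transaction, operation, item, delta) and the function names run and transfer are invented for the example, and S5 is reconstructed from the description as "run the first two instructions of T1, then all of T2, then the rest of T1".

```python
def run(schedule, db):
    """Execute a schedule and return the resulting database state.

    `schedule` is a list of (transaction, operation, item, delta) steps.
    Each transaction works on its own private copy of an item between
    read and write, as in the read(x)/write(x) model.
    """
    db = dict(db)  # do not modify the caller's starting state
    local = {}     # (transaction, item) -> private copy of the item
    for txn, op, item, delta in schedule:
        if op == "read":
            local[(txn, item)] = db[item]
        elif op == "add":
            local[(txn, item)] += delta
        elif op == "write":
            db[item] = local[(txn, item)]
    return db


def transfer(txn, amount):
    """Steps of a transaction moving `amount` from A to B."""
    return [
        (txn, "read", "A", 0), (txn, "add", "A", -amount), (txn, "write", "A", 0),
        (txn, "read", "B", 0), (txn, "add", "B", amount), (txn, "write", "B", 0),
    ]


start = {"A": 1000, "B": 1000}
t1, t2 = transfer("T1", 100), transfer("T2", 50)

serial_12 = run(t1 + t2, start)  # T1 then T2
serial_21 = run(t2 + t1, start)  # T2 then T1
# Assumed shape of S5: switch to T2 after the second instruction of T1.
s5 = run(t1[:2] + t2 + t1[2:], start)

print(serial_12, serial_21, s5)
```

Both serial orders give A = 850 and B = 1150, while the interleaved schedule gives A = 900 and B = 1150, so Rs 50 appears from nowhere.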
table may be locked. Table level locks are less restrictive than database level locks. Table level locks are not
suitable for multi-user DBMS. The drawback of a table level lock is that transactions T1 and T2 cannot
access the same table even when they try to use different rows; T2 must wait until T1 unlocks the table.
c) Page level:- In a page level lock, the DBMS locks an entire disk page. A disk page, or page, is also
referred to as a disk block, which is described as a section of a disk. A page has a fixed size such as 4K, 8K or
16K. A table can span several pages, and a page can contain several rows of one or more tables. Page level
locks are currently the most frequently used multi-user DBMS locking method. Page level lock is shown in the
following fig.
In the above fig. T1 and T2 access the same table while locking different disk pages. If T2 requires the use
of a row located on a page that is locked by T1, T2 must wait until the page is unlocked.
d) Row level:- A row level lock is much less restrictive than the other locks. The DBMS allows
concurrent transactions to access different rows of the same table even when the rows are located on the same
page. The row level locking approach improves the availability of data. But row level locking management
requires high overhead because a lock exists for each row in a table of the database that is involved in a
conflicting transaction.
e) Field level:- The field level lock allows concurrent transactions to access the same row as long as they
require the use of different fields (attributes) within that row.
Locking Methods
A lock is a mechanism to control concurrent access to a data item.
Lock : If an object is locked by a transaction, no other transaction can use that object.
The object may be a database, table, page or row.
Unlock : If an object is unlocked, any transaction can lock the object for its use.
Data items can be locked in two modes:
1. Exclusive (X) mode. Data item can be both read as well as written. X-lock is requested using lock-X
instruction.
2. Shared (S) mode. Data item can only be read. S-lock is requested using lock-S instruction.
Lock requests are made to the concurrency-control manager by the programmer. Transaction can proceed
only after request is granted.
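The two lock modes can be sketched as a tiny lock manager in which shared locks are compatible only with other shared locks. The class LockManager and its methods are assumptions made for illustration; a real concurrency-control manager would also queue waiting transactions and handle lock upgrades, which this sketch does not.

```python
class LockManager:
    """Grants shared (S) and exclusive (X) locks on data items (sketch only)."""

    def __init__(self):
        self.locks = {}  # item -> (mode, set of transactions holding it)

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        if item not in self.locks:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = self.locks[item]
        # S is compatible only with S; X conflicts with every other lock.
        if mode == "S" and held_mode == "S":
            holders.add(txn)
            return True
        return False

    def unlock(self, txn, item):
        """Release txn's lock on item; drop the entry when nobody holds it."""
        _, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]


lm = LockManager()
first = lm.request("T1", "x", "S")    # granted: item is free
second = lm.request("T2", "x", "S")   # granted: S is compatible with S
third = lm.request("T3", "x", "X")    # refused: X conflicts with held S locks
lm.unlock("T1", "x")
lm.unlock("T2", "x")
fourth = lm.request("T3", "x", "X")   # granted: item is free again
```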
Deadlock
A deadlock is a condition where two or more transactions are waiting indefinitely for one another to
give up locks. Deadlock is one of the most feared complications in a DBMS, because none of the transactions
involved can ever finish; each remains in a waiting state forever.
Or
A deadlock occurs when two transactions wait indefinitely for each other to unlock data. For
example, a deadlock occurs when two transactions, T1 and T2, exist in the following mode.
T1: access data items X and Y.
T2: access data items Y and X.
➢ T1 and T2 are executing simultaneously, so T1 has locked data item X and T2 has locked
data item Y. Now T1 is waiting to lock data item Y, but it is already locked by T2. Simultaneously,
T2 is waiting to lock X, but it is already locked by T1. So each transaction is waiting for an item held by the
other. This condition is referred to as “Deadlock”.
➢ In a real-world DBMS, many transactions can be executed simultaneously, thereby increasing the
probability of generating deadlocks.
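One common way to detect the situation described above is to record which transaction is waiting for which and look for a cycle. The notes do not prescribe this technique; the sketch below uses a wait-for graph as an assumption, and the function name find_deadlock and the dictionary layout are invented for the example.

```python
def find_deadlock(waits_for):
    """Return a list of transactions that form a waiting cycle, or None.

    `waits_for` maps each transaction to the set of transactions whose
    locks it is currently waiting for (illustrative representation).
    """
    visiting = []  # transactions on the current search path
    done = set()   # transactions already shown not to lead to a cycle

    def visit(txn):
        if txn in visiting:
            # We came back to a transaction on the current path: a cycle.
            return visiting[visiting.index(txn):]
        if txn in done:
            return None
        visiting.append(txn)
        for other in waits_for.get(txn, ()):
            cycle = visit(other)
            if cycle:
                return cycle
        visiting.pop()
        done.add(txn)
        return None

    for txn in waits_for:
        cycle = visit(txn)
        if cycle:
            return cycle
    return None


# T1 holds X and waits for Y (held by T2); T2 holds Y and waits for X.
deadlocked = find_deadlock({"T1": {"T2"}, "T2": {"T1"}})
# T1 waits for T2, but T2 is not waiting for anyone.
no_deadlock = find_deadlock({"T1": {"T2"}, "T2": set()})
```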
The Two Phase Locking Protocol ensures conflict-serializable schedules.
A transaction in the Two Phase Locking Protocol is in one of 2 phases:
Phase 1: Growing Phase
Transaction may obtain locks, but may not release locks.
Phase 2: Shrinking Phase
Transaction may release locks, but may not obtain locks.
The protocol assures serializability. It can be proved that the transactions can be serialized in the order of
their lock points (i.e., the point where a transaction acquired its final lock).
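The two-phase rule can be checked mechanically for a single transaction: once the first lock has been released, no further lock may be acquired. The sketch below assumes a transaction is given as a list of ("lock" or "unlock", item) pairs; that encoding and the function names are illustrative.

```python
def is_two_phase(operations):
    """Return True if no lock is acquired after the first unlock.

    `operations` is the lock/unlock sequence of ONE transaction, given as
    (action, item) pairs where action is "lock" or "unlock".
    """
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True        # the shrinking phase has begun
        elif action == "lock" and shrinking:
            return False            # a lock in the shrinking phase breaks 2PL
    return True


def lock_point(operations):
    """Index of the final lock acquisition (the lock point), or None."""
    positions = [i for i, (action, _) in enumerate(operations) if action == "lock"]
    return positions[-1] if positions else None


follows_2pl = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
breaks_2pl = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]

print(is_two_phase(follows_2pl), lock_point(follows_2pl))
print(is_two_phase(breaks_2pl))
```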
Let’s discuss an example now. See how the schedule below follows Conservative 2-PL but does not follow
Strict and Rigorous 2-PL.
The schedule completely follows Conservative 2-PL, but fails to meet the requirements of Strict
and Rigorous 2-PL, because A and B are unlocked before the transaction commits.
2. Timestamp based Concurrency Control
A timestamp is a tag that can be attached to any transaction or any data item, which denotes a specific time
at which the transaction or the data item was used in any way. A timestamp can be implemented in 2
ways.
One is to directly assign the current value of the clock to the transaction or data item.
The other is to attach the value of a logical counter that is incremented whenever a new timestamp is
required.
The timestamp of a data item can be of 2 types:
(i) W-timestamp(X): This means the latest time when the data item X has been written into.
(ii) R-timestamp(X): This means the latest time when the data item X has been read from.
These 2 timestamps are updated each time a successful read/write operation is performed on the data item X.
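The logical-counter approach and the bookkeeping of R-timestamp(X) and W-timestamp(X) can be sketched as follows. Two things here are assumptions rather than content of these notes: the names new_timestamp and DataItem, and the convention that the value stored on the item is the timestamp of the transaction that performed the operation. The sketch only records timestamps; it does not implement any rule for rejecting operations.

```python
import itertools

# Logical counter: each call to new_timestamp() yields the next integer.
_counter = itertools.count(1)


def new_timestamp():
    """Hand out a fresh, strictly increasing timestamp (illustrative)."""
    return next(_counter)


class DataItem:
    """A data item X carrying its R-timestamp and W-timestamp."""

    def __init__(self, value):
        self.value = value
        self.r_ts = 0  # R-timestamp(X): latest timestamp that read X
        self.w_ts = 0  # W-timestamp(X): latest timestamp that wrote X

    def read(self, txn_ts):
        # Update R-timestamp(X) after a successful read.
        self.r_ts = max(self.r_ts, txn_ts)
        return self.value

    def write(self, txn_ts, value):
        # Update W-timestamp(X) after a successful write.
        self.value = value
        self.w_ts = max(self.w_ts, txn_ts)


t1 = new_timestamp()  # timestamp assigned to transaction T1
t2 = new_timestamp()  # timestamp assigned to transaction T2

x = DataItem(10)
x.read(t1)
x.write(t2, 20)
print(x.r_ts, x.w_ts, x.value)
```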
Database Errors
A database represents an essential corporate resource that should be properly secured using appropriate
controls. We consider database security in relation to the following situations:
Loss of availability.
Loss of integrity.
Loss of confidentiality (secrecy).
Theft and fraud.
Any such situation or event, whether intentional or accidental, can cause damage that adversely affects
the database structure and, consequently, the organization.
Availability loss − Availability loss refers to non-availability of database objects to legitimate users.
Integrity loss − Integrity loss occurs when unacceptable operations are performed upon the database
either accidentally or maliciously. This may happen while creating, inserting, updating or deleting
data. It results in corrupted data leading to incorrect decisions.
Confidentiality loss − Confidentiality loss occurs due to unauthorized or unintentional disclosure of
confidential information. It may result in illegal actions, security threats and loss in public
confidence.
Theft and fraud − A threat may arise from a situation or event involving a person, action or
circumstance that is likely to bring harm to an organization and its database. Some common types of
threats include the following.
o People : An attempt is made, either intentionally or unintentionally, to cause harm or
damage to the database environment fully or partly. For instance, applications, operating
systems, DBMS and networks can be damaged by employees, government authorities,
consultants, visitors, hackers, terrorists, intruders etc.
o Natural disasters : Disasters such as earthquakes and floods can cause harm to all the database
environment components (fully) or to a single component (partially).
o Malicious code : A piece of software becomes malicious if it contains corrupted code which is
included in order to harm or damage database management components. Malicious code can
take the form of Bugs, Macro code, Boot sector Viruses, Worms, Trojan Horses,
Denial of Service, spoofing, Email Spams.
o Technological disaster : This type of disaster occurs due to the malfunction of equipment
and devices, which causes damage to operating systems, networks and data files. Some
technological disasters are power failure, media failure, network failure and hardware failure.
Database Security
Database security is the protection of the Database against intentional or unintentional threats using
computer-based or non-computer-based controls.
Authentication
Authentication is about validating credentials such as user name/user ID and password to verify the user’s
identity. The system uses these credentials to determine whether you are who you say you are. It includes
Username and Password
Plastic ID Card
Encryption
Data Encryption − Data encryption refers to coding data when sensitive data is to be communicated
over public channels. Even if an unauthorized agent gains access to the data, the agent cannot understand it
since it is in an incomprehensible format.
Cryptography is the science of encoding information before sending via unreliable communication
paths so that only an authorized receiver can decode and use it.
The coded message is called cipher text and the original message is called plain text.
Encryption The process of converting plain text to cipher text by the sender is called encoding or
encryption.
Decryption The process of converting cipher text to plain text by the receiver is called decoding or
decryption.
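The terms plain text, cipher text, encryption and decryption can be made concrete with a toy letter-shifting cipher. This is purely an illustration of the vocabulary: the shift cipher is trivially breakable and is not something a real database would use, and the function names and the key value 3 are assumptions made for the example.

```python
import string

ALPHABET = string.ascii_uppercase


def encrypt(plain_text, key):
    """Shift each letter `key` places along the alphabet (toy cipher only).

    Letters are upper-cased first; digits, spaces and other characters
    pass through unchanged.
    """
    out = []
    for ch in plain_text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)
    return "".join(out)


def decrypt(cipher_text, key):
    """Undo encrypt() by shifting the same number of places backwards."""
    return encrypt(cipher_text, -key)


plain = "TRANSFER 100"
cipher = encrypt(plain, 3)       # done by the sender
recovered = decrypt(cipher, 3)   # done by the receiver, who knows the key
print(cipher, recovered)
```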
Authorization
Authorization is the culmination of the administrative policies of the organization. As the name suggests,
authorization is a set of rules that can be used to determine which user has what type of access to which
portion of the database. The person who writes access rules is called an authorizer.
An authorizer may set several forms of authorization on parts of the database. Among them are the
following:
1. Read Authorization: allows reading, but not modification of data.
2. Insert Authorization: allows insertion of new data, but not the modification of existing data, e.g.
insertion of tuple in a relation.
3. Update Authorization: allows modification of data, but not its deletion. Data items such as primary-key
attributes may not be modified.
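A set of access rules of this kind can be sketched as a lookup table from (user, relation) to the authorizations granted. The user names, the relation name account and the function is_authorized are hypothetical and serve only to show the idea.

```python
# Hypothetical access rules written by the authorizer:
# (user, relation) -> set of authorizations granted on that relation.
RULES = {
    ("clerk", "account"): {"read", "insert"},
    ("manager", "account"): {"read", "insert", "update"},
}


def is_authorized(user, relation, action):
    """Return True if `user` holds `action` authorization on `relation`."""
    return action in RULES.get((user, relation), set())


print(is_authorized("clerk", "account", "read"))     # clerk may read
print(is_authorized("clerk", "account", "update"))   # clerk may not update
print(is_authorized("visitor", "account", "read"))   # no rule: denied
```

Denying access when no rule exists is a design choice in this sketch: a user gets only what the authorizer has explicitly granted.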