Unit 4


UNIT-IV

TRANSACTIONS
AND
CONCURRENCY MANAGEMENT

Transactions - Concurrent Transactions - Locking Protocol - Serialisable Schedules - Locks - Two Phase Locking (2PL) - Deadlock and its Prevention - Optimistic Concurrency Control.

Database Recovery and Security: Database Recovery meaning - Kinds of failures - Failure controlling methods - Database errors - Backup & Recovery Techniques - Security & Integrity - Database Security - Authorization.

Transactions
A transaction is an execution of a program and is seen by the DBMS as a series, or list, of actions. It differs from an ordinary program in that it results from the execution of a program written in a high-level data manipulation language or programming language. A transaction is delimited by the statements “begin transaction” and “end transaction”.
Within a transaction, access to the database is accomplished by two operations:
i Read(x).
ii Write(x).
The first reads data item x from the database, whereas the second writes data item x to the database. Consider a transaction Ti which transfers 100/- from account A to account B. This transaction is as follows:
Ti :
Read(A);
A := A - 100;
Write(A);
Read(B);
B := B + 100;
Write(B);
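
The transfer transaction Ti can be sketched in Python as follows. This is only an illustration: the in-memory dictionary `db`, the helper names `read`, `write` and `transfer`, and the starting balances of 1000 are assumptions made for the example, not part of any real DBMS interface.

```python
# Hypothetical in-memory "database"; db, read, write and transfer are
# illustrative names, not a real DBMS API. Starting balances are assumed.
db = {"A": 1000, "B": 1000}

def read(item):
    """Read(x): fetch the value of data item x from the database."""
    return db[item]

def write(item, value):
    """Write(x): store a new value for data item x in the database."""
    db[item] = value

def transfer(amount):
    """Transaction Ti: move `amount` from account A to account B."""
    a = read("A")
    a = a - amount
    write("A", a)
    b = read("B")
    b = b + amount
    write("B", b)

transfer(100)
print(db)  # {'A': 900, 'B': 1100}
```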

ACID Properties or Transaction Properties


A transaction is a very small unit of a program and it may contain several low-level tasks. A transaction in a database system must maintain Atomicity, Consistency, Isolation, and Durability − commonly known as the ACID properties − in order to ensure accuracy, completeness, and data integrity.
1. Atomicity − This property states that a transaction must be treated as an atomic unit, that is, either all of
its operations are executed or none. There must be no state in a database where a transaction is left partially
completed. States should be defined either before the execution of the transaction or after the
execution/abortion/failure of the transaction.
2. Consistency − The database must remain in a consistent state after any transaction. No transaction
should have any adverse effect on the data residing in the database. If the database was in a consistent state
before the execution of a transaction, it must remain consistent after the execution of the transaction as well.
3. Isolation − In a database system where more than one transaction is executed simultaneously and in parallel, the property of isolation states that each transaction is carried out and executed as if it were the only transaction in the system. No transaction will affect the existence of any other transaction.
4. Durability − The database should be durable enough to hold all its latest updates even if the system fails or restarts. If a transaction updates a chunk of data in a database and commits, then the database will hold the modified data. If a transaction commits but the system fails before the data could be written to the disk, then that data will be updated once the system is back in operation.
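
Atomicity can be observed with Python's built-in `sqlite3` module. The sketch below is a minimal illustration, not a prescribed method: the table name `account`, the helper names and the "insufficient funds" rule are assumptions made for the example. The `with conn:` block commits if the body finishes normally and rolls back if an exception escapes it.

```python
import sqlite3

def make_db():
    """Create a throwaway in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 1000)])
    conn.commit()
    return conn

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
            (left,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)).fetchone()
            if left < 0:
                raise ValueError("insufficient funds")  # assumed business rule
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
        return True
    except ValueError:
        return False  # the debit made above has been rolled back
```

A failed transfer leaves both balances exactly as they were, so no partially completed state is ever visible.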

Transaction States

1. Active State − This is the first state in the life cycle of a transaction. A transaction is in the active state as long as its instructions are being executed. All the changes made by the transaction at this point are stored in the buffer in main memory.
2. Partially Committed State − After the last instruction of the transaction has executed, it enters the partially committed state. It is not considered fully committed because all the changes made by the transaction are still stored in the buffer in main memory.
3. Committed State − After all the changes made by the transaction have been successfully stored in the database, it enters the committed state. The transaction is now considered to be fully committed.
4. Failed State − When a transaction is executing in the active state or partially committed state and some failure occurs that makes it impossible to continue the execution, it enters the failed state.
5. Aborted State − After the transaction has failed and entered the failed state, all the changes made by it have to be undone. To undo these changes, the transaction must be rolled back. After the transaction has been rolled back completely, it enters the aborted state.
6. Terminated State − This is the last state in the life cycle of a transaction. After entering the committed state or aborted state, the transaction finally enters the terminated state, where its life cycle comes to an end.
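
The six states and the moves between them can be written down as a small state machine. The sketch below is illustrative only; the names `State`, `TRANSITIONS` and `is_valid_life_cycle` are assumptions, and the transition table is read directly off the state descriptions above.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"
    TERMINATED = "terminated"

# Allowed moves, taken from the six state descriptions.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.COMMITTED: {State.TERMINATED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: {State.TERMINATED},
    State.TERMINATED: set(),
}

def can_move(src, dst):
    """True if a transaction may go directly from state src to state dst."""
    return dst in TRANSITIONS[src]

def is_valid_life_cycle(path):
    """True if path starts ACTIVE, ends TERMINATED and uses only allowed moves."""
    if not path or path[0] is not State.ACTIVE or path[-1] is not State.TERMINATED:
        return False
    return all(can_move(a, b) for a, b in zip(path, path[1:]))
```

For example, active, partially committed, committed, terminated is a valid life cycle, whereas jumping from failed straight to committed is not.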

Concurrency Control
The coordination of the simultaneous execution of transactions in a multi user database system is
known as concurrency control.
Need for Concurrency Control
The objective of concurrency control is to ensure the serializability of transactions in a multi-user database environment. Concurrency control is important because the simultaneous execution of transactions over a shared database can create several data integrity and consistency problems. The three main problems are
a) Lost updates b) Uncommitted data c) Inconsistent retrievals
a) Lost updates:- The lost update problem occurs when two concurrent transactions T1 and T2 are updating the same data element and one of the updates is lost (overwritten by the other transaction).
b) Uncommitted data:- The uncommitted data problem occurs when two transactions T1 and T2 are executed concurrently and the first transaction (T1) is rolled back after the second transaction (T2) has already accessed the uncommitted data. This violates the isolation property of transactions.
c) Inconsistent retrievals:- Inconsistent retrievals occur when a transaction accesses data before and after another transaction finishes working with that data.
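
The uncommitted data problem can be traced step by step. The following is a hand-written simulation under assumed values (a data item X starting at 100, T1 adding 50, T2 subtracting 30); it does not use a real DBMS, and the variable names are illustrative.

```python
# Hypothetical data item X; `committed` holds the last committed value,
# `working` holds what running transactions can see (no isolation at all).
committed = {"X": 100}
working = dict(committed)

# T1 adds 50 to X but has not committed yet.
working["X"] = working["X"] + 50      # X = 150, uncommitted

# T2 reads the uncommitted value.
t2_seen = working["X"]                # 150 (a dirty read)

# T1 fails and is rolled back: its change is undone.
working["X"] = committed["X"]         # X = 100 again

# T2 writes a result computed from a value that was never committed.
working["X"] = t2_seen - 30           # 120

correct = committed["X"] - 30         # 70, what T2 alone would have produced
print(working["X"], correct)          # 120 70
```

T2 ends up storing 120 although the only committed change should have produced 70.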
Serializability
A serial schedule executes all transactions one after another, in a non-preemptive manner. A schedule is serializable if it produces the same result as some serial schedule.
In a serial schedule, there is no question of sharing a single data item among many transactions, because no more than one transaction is executing at any point in time. However, a serial schedule is inefficient in the sense that transactions suffer longer waiting and response times, and resource utilization is low. Consider two transactions T1 and T2 and two accounts A and B, each containing Rs 1000/-.
• T1: a transaction that transfers Rs 100/- from account A to account B.
• T2: a new transaction that transfers Rs 50/- from account A to account B.
If we prepare a serial schedule, then either T1 will completely finish before T2 can begin, or T2 will completely finish before T1 can begin. Suppose that in schedule-1 the transactions are executed serially in one order, and in schedule-2 in the other order.

In schedules 1 and 2 above the transactions are executed serially; the final amount in account A is Rs. 850 and in account B is Rs. 1150. After the execution, the total amount (A+B) is calculated to check consistency. The sum A+B is Rs. 2000, the same as before the transactions ran, so both schedules produce a consistent result.
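
The arithmetic for the two serial orders can be checked with a few lines of Python. This is a minimal sketch; the function names `transfer` and `run_serial` are assumptions made for the example.

```python
def transfer(db, amount):
    """Move `amount` from account A to account B in one uninterrupted step."""
    db["A"] -= amount
    db["B"] += amount

def run_serial(amounts):
    """Run one transfer per amount, strictly one after another."""
    db = {"A": 1000, "B": 1000}
    for amount in amounts:
        transfer(db, amount)
    return db

t1_then_t2 = run_serial([100, 50])   # T1 (Rs 100) first, then T2 (Rs 50)
t2_then_t1 = run_serial([50, 100])   # T2 first, then T1
print(t1_then_t2, t2_then_t1)
# {'A': 850, 'B': 1150} {'A': 850, 'B': 1150}
```

Either serial order leaves A + B = 2000.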

Need for Concurrent Execution


In a concurrent schedule, CPU time is shared among two or more transactions in order to run them concurrently. However, this creates the possibility that more than one transaction may need to access a single data item for reading or writing, and the database could contain an inconsistent value if such accesses are not handled properly. Let us explain with the help of an example. Suppose the transactions are executed concurrently as shown below.

In schedules 3 and 4 above the transactions are executed concurrently; the final amount in account A is Rs. 850 and in account B is Rs. 1150. The sum A+B is Rs. 2000, so both schedules produce a consistent result. But let us take another example where a wrong concurrent schedule can bring about disaster. Consider the following example involving the same T1 and T2.

Schedule S5 results in an inconsistent state, because the switch to T2 is made at the second instruction of T1, after T1 has read A but before it has written the new value back. With accounts A and B both containing Rs 1000/- initially, this schedule leaves Rs 900/- in A and Rs 1150/- in B.
The sum A+B is Rs. 2050 instead of Rs. 2000. This is an inconsistent state.
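
The effect of switching between T1 and T2 at different points can be reproduced with Python generators, where each `yield` marks the end of one instruction. This is a sketch under assumptions: the names `transfer` and `run`, and the particular "good" interleaving, are illustrative; only the S5 switching point (after the second instruction of T1) comes from the text.

```python
def transfer(db, amount):
    """One transfer from A to B; each yield ends one instruction."""
    a = db["A"]         # read(A)
    yield
    a = a - amount      # A := A - amount
    yield
    db["A"] = a         # write(A)
    yield
    b = db["B"]         # read(B)
    yield
    b = b + amount      # B := B + amount
    yield
    db["B"] = b         # write(B)
    yield

def run(schedule):
    """Execute one instruction of the named transaction per schedule entry."""
    db = {"A": 1000, "B": 1000}
    txns = {"T1": transfer(db, 100), "T2": transfer(db, 50)}
    for name in schedule:
        next(txns[name])
    return db

# Switch only after each transaction has written the item it was working on.
good = run(["T1"] * 3 + ["T2"] * 3 + ["T1"] * 3 + ["T2"] * 3)

# S5: switch after the second instruction of T1, then let T2 run completely.
s5 = run(["T1"] * 2 + ["T2"] * 6 + ["T1"] * 4)

print(good)  # {'A': 850, 'B': 1150}
print(s5)    # {'A': 900, 'B': 1150}
```

In S5, T1 overwrites the value of A that T2 wrote, so T2's deduction of Rs 50 is lost while its credit to B is kept.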

Lock Granularity (or) Locking Level


Lock granularity indicates the level of lock use. Locking can take place at the following levels.
a) Database level b) Table level c) Page level d) Row level e) Field (attribute) level
a) Database level:- In a database level lock the entire database is locked. So if transaction T1 is accessing the database, transaction T2 cannot access it. This level of locking is good for batch processes but is unsuitable for a multi-user DBMS, because thousands of transactions would have to wait for the previous transaction to complete before the next one could reserve the entire database. Data access would therefore be slow.
b) Table level:- In a table level lock the entire table is locked; that is, if transaction T1 is accessing a table, transaction T2 cannot access the same table. If a transaction requires access to several tables, each table may be locked. Table level locks are less restrictive than database level locks, but they are still not suitable for a multi-user DBMS. The drawback is that transactions T1 and T2 cannot access the same table even when they try to use different rows; T2 must wait until T1 unlocks the table.
c) Page level:- In a page level lock, the DBMS locks an entire disk page. A disk page, or page, is also referred to as a disk block, which is described as a section of a disk. A page has a fixed size such as 4K, 8K or 16K. A table can span several pages, and a page can contain several rows of one or more tables. Page level locks are currently the most frequently used locking method in multi-user DBMSs. Page level lock is shown in the following fig.

In the above fig. T1 and T2 access the same table while locking different disk pages. If T2 requires the use
of a row located on a page that is locked by T1, T2 must wait until the page is unlocked.
d) Row level:- A row level lock is much less restrictive than the locks discussed above. The DBMS allows concurrent transactions to access different rows of the same table even when the rows are located on the same page. The row level locking approach improves the availability of data, but its management requires high overhead because a lock exists for each row in a table of the database involved in a conflicting transaction.
e) Field level:- The field level lock allows concurrent transactions to access the same row as long as they
require the use of different fields (attributes) within that row.
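
The five levels differ only in how much of the database one lock covers. The sketch below makes that concrete; it is an illustration under assumptions (an access is described by a hypothetical tuple of table, page, row and field, and the function names are invented for the example).

```python
# An access is (table, page, row, field); all values are hypothetical.
LEVELS = ["database", "table", "page", "row", "field"]

def locked_unit(level, access):
    """Return the unit a lock at `level` would cover for this access."""
    depth = LEVELS.index(level)   # 0 for database ... 4 for field
    return access[:depth]         # database level covers everything: ()

def must_wait(level, held, requested):
    """True if `requested` falls inside the unit already locked for `held`."""
    return locked_unit(level, held) == locked_unit(level, requested)

t1 = ("customer", 7, 12, "name")
t2 = ("customer", 7, 13, "name")    # different row, same page
t3 = ("customer", 7, 12, "phone")   # same row, different field

print(must_wait("page", t1, t2))    # True: both rows sit on page 7
print(must_wait("row", t1, t2))     # False: different rows
print(must_wait("row", t1, t3))     # True: same row
print(must_wait("field", t1, t3))   # False: different fields
```

The finer the level, the fewer requests have to wait, at the cost of managing more locks.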

Locking Methods
A lock is a mechanism to control concurrent access to a data item.
• Lock : If an object is locked by a transaction, no other transaction can use that object. The object may be a database, table, page or row.
• Unlock : If an object is unlocked, any transaction can lock the object for its use.
Data items can be locked in two modes:
1. Exclusive (X) mode. Data item can be both read as well as written. X-lock is requested using lock-X
instruction.
2. Shared (S) mode. Data item can only be read. S-lock is requested using lock-S instruction.
Lock requests are made to the concurrency-control manager by the programmer. Transaction can proceed
only after request is granted.
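
The rule for granting shared and exclusive locks can be sketched as a tiny lock manager. This is an assumption-laden illustration, not a real concurrency-control manager: the class name `LockManager` and its methods are invented, and a refused request simply returns False instead of making the transaction wait.

```python
class LockManager:
    """Grants S and X locks: many readers may share, a writer needs sole access."""

    def __init__(self):
        self.locks = {}  # item -> {transaction: mode}

    def request(self, txn, item, mode):
        """Return True if `txn` is granted a lock of `mode` ('S' or 'X') on item."""
        holders = self.locks.setdefault(item, {})
        other_modes = [m for t, m in holders.items() if t != txn]
        if mode == "S":
            granted = all(m == "S" for m in other_modes)   # S is compatible with S
        else:
            granted = not other_modes                       # X is compatible with nothing
        if granted and holders.get(txn) != "X":
            holders[txn] = mode                             # never downgrade a held X lock
        return granted

    def unlock(self, txn, item):
        """Release whatever lock `txn` holds on `item`."""
        self.locks.get(item, {}).pop(txn, None)
```

Two transactions can hold S-locks on the same item at once, but an X-lock request succeeds only when no other transaction holds any lock on it.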


Shared/Exclusive locks can lead to two major problems:
• The resulting transaction schedule might not be serializable.
• The schedule might create deadlock.

Deadlock
A deadlock is a condition where two or more transactions wait indefinitely for one another to give up locks. Deadlock is one of the most feared complications in a DBMS, because none of the tasks involved ever finishes; each remains in a waiting state forever.

Or
A deadlock occurs when two transactions wait indefinitely for each other to unlock data. For example, a deadlock occurs when two transactions, T1 and T2, exist in the following mode.
T1: access data items X and Y.
T2: access data items Y and X.
➢ T1 and T2 are executing simultaneously, so T1 has locked data item X and T2 has locked data item Y. Now T1 is waiting to lock data item Y, but it is already locked by T2. Simultaneously, T2 is waiting to lock X, but it is already locked by T1. So each transaction is waiting for an item held by the other. This condition is referred to as “deadlock”.
➢ In a real-world DBMS, many transactions can be executed simultaneously, thereby increasing the probability of generating deadlocks.

➢ The three basic techniques to control deadlocks are:
a) Deadlock Prevention:-
➢ A transaction requesting a new lock is aborted when there is the possibility that a deadlock can occur. If the transaction is aborted, all changes made by this transaction are rolled back, and all locks obtained by the transaction are released.
➢ This method is used when there is a high probability of deadlock.
b) Deadlock Detection:-
➢ The DBMS tests the database for deadlocks. If a deadlock is found, one of the transactions is aborted and the other transaction continues.
➢ This method is used when the probability of deadlocks is low.
c) Deadlock Avoidance:-
➢ The transaction must obtain all of the locks it needs before it can be executed. This technique avoids the rollback of conflicting transactions.
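
The text does not say how a DBMS tests for deadlocks. One common approach, used here purely as an assumption, is to record which transaction is waiting for which and look for a cycle in that "wait-for" graph. The function name `has_deadlock` and the dictionary format are illustrative.

```python
def has_deadlock(waits_for):
    """Return True if the wait-for graph contains a cycle.

    waits_for maps each transaction to the set of transactions it waits on.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    colour = {}

    def visit(txn):
        colour[txn] = GREY
        for other in waits_for.get(txn, ()):
            state = colour.get(other, WHITE)
            if state == GREY:                 # back to a transaction on this path
                return True
            if state == WHITE and visit(other):
                return True
        colour[txn] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t) for t in waits_for)

# T1 holds X and waits for Y (held by T2); T2 holds Y and waits for X (held by T1).
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))   # True
print(has_deadlock({"T1": {"T2"}, "T2": set()}))    # False
```

Once a cycle is found, the DBMS would abort one transaction on the cycle so that the others can proceed.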

Concurrency Control Techniques


Concurrency control is provided in a database to:
(i) Enforce isolation among transactions.
(ii) Preserve database consistency through consistency preserving execution of transactions.
(iii) Resolve read-write and write-read conflicts.
Various concurrency control techniques are:
1. Two-phase locking Protocol
2. Time stamp ordering Protocol

1. Two-Phase Locking Protocol: Locking is an operation which secures permission to read, or permission to write, a data item. Two-phase locking is a process used to gain ownership of shared resources. The three activities taking place in the two-phase update algorithm are:
(i). Lock Gaining (ii). Modification of Data (iii). Release Lock
Basic two-phase locking does not by itself rule out deadlock. In distributed systems, deadlock can be prevented by a variant in which a process releases all the resources it has acquired if it is not possible to acquire all the resources required without waiting for another process to finish using a lock. This means that no process is ever in a state where it is holding some shared resources and waiting for another process to release a shared resource which it requires, so deadlock cannot occur due to resource contention.

The Two Phase Locking Protocol ensures conflict-serializable schedules. A transaction under this protocol passes through two phases:
Phase 1: Growing Phase
• Transaction may obtain locks, but may not release locks
Phase 2: Shrinking Phase
• Transaction may release locks, but may not obtain locks
The protocol assures serializability. It can be proved that the transactions can be serialized in the order of their lock points (i.e., the point where a transaction acquired its final lock).
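Let’s discuss an example now. See how the schedule below follows Conservative 2-PL but does not follow Strict and Rigorous 2-PL. (Conservative 2-PL acquires all locks before the transaction starts executing; Strict 2-PL holds exclusive locks until the transaction commits; Rigorous 2-PL holds all locks until the transaction commits.)

The schedule fully follows Conservative 2-PL, but it fails to meet the requirements of Strict and Rigorous 2-PL because A and B are unlocked before the transaction commits.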
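
The growing/shrinking rule is easy to check mechanically: once a transaction has released any lock, it may not acquire another. The sketch below assumes a transaction is given as a list of hypothetical ("lock", item) and ("unlock", item) pairs; the function names are illustrative.

```python
def follows_2pl(ops):
    """True if no lock is obtained after the first unlock (basic 2PL)."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True           # the shrinking phase has begun
        elif action == "lock" and shrinking:
            return False               # tried to grow again: violates 2PL
    return True

def lock_point(ops):
    """Index of the operation that acquires the transaction's final lock."""
    positions = [i for i, (action, _item) in enumerate(ops) if action == "lock"]
    return positions[-1] if positions else None

two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]

print(follows_2pl(two_phase), lock_point(two_phase))   # True 1
print(follows_2pl(not_two_phase))                       # False
```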
2. Timestamp based Concurrency Control
A timestamp is a tag that can be attached to any transaction or any data item, denoting a specific time at which the transaction or data item was used in some way. A timestamp can be implemented in two ways.
• One is to directly assign the current value of the clock to the transaction or data item.
• The other is to attach the value of a logical counter that is incremented each time a new timestamp is required.
The timestamp of a data item can be of two types:
(i) W-timestamp(X): the latest time at which the data item X was written.
(ii) R-timestamp(X): the latest time at which the data item X was read.
These two timestamps are updated each time a successful read/write operation is performed on the data item X.
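
The sketch below shows a logical counter and the two per-item timestamps in use. The accept/reject rules in `read` and `write` are those of the standard textbook timestamp-ordering protocol and are an addition, since the notes above describe only the timestamps themselves; the names `new_timestamp`, `Item`, `read` and `write` are assumptions.

```python
import itertools

_counter = itertools.count(1)   # logical counter used as the timestamp source

def new_timestamp():
    """Hand out the next, strictly larger, timestamp."""
    return next(_counter)

class Item:
    """A data item X with its R-timestamp and W-timestamp."""
    def __init__(self, value):
        self.value = value
        self.r_ts = 0   # R-timestamp(X): timestamp of the latest reader
        self.w_ts = 0   # W-timestamp(X): timestamp of the latest writer

def read(item, ts):
    """Return (True, value), or (False, None) if the read arrives too late."""
    if ts < item.w_ts:                  # a younger transaction already wrote X
        return (False, None)
    item.r_ts = max(item.r_ts, ts)
    return (True, item.value)

def write(item, ts, value):
    """Return True if the write is accepted, False if it arrives too late."""
    if ts < item.r_ts or ts < item.w_ts:
        return False                    # a younger transaction already used X
    item.value = value
    item.w_ts = ts
    return True
```

A rejected operation means the issuing transaction would be rolled back and restarted with a new timestamp.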

Source and types of Database failures


1. Crash Recovery
A DBMS is a highly complex system with hundreds of transactions being executed every second. The durability and robustness of a DBMS depend on its complex architecture and its underlying hardware and system software. If it fails or crashes in the middle of transactions, the system is expected to follow some algorithm or technique to recover lost data.
2. Failure Classification
To see where the problem has occurred, we generalize a failure into various categories, as follows −
3. Transaction failure
A transaction has to abort when it fails to execute or when it reaches a point from which it cannot go any further. This is called transaction failure, where only a few transactions or processes are hurt.
Reasons for a transaction failure could be −
• Logical errors − Where a transaction cannot complete because it has some code error or any internal error condition.
• System errors − Where the database system itself terminates an active transaction because the DBMS is not able to execute it, or it has to stop because of some system condition. For example, in case of deadlock or resource unavailability, the system aborts an active transaction.
4. System Crash
There are problems − external to the system − that may cause the system to stop abruptly and crash. For example, an interruption in the power supply may cause the underlying hardware or software to fail. Other examples include operating system errors.
5. Disk Failure
In the early days of technology evolution, it was a common problem that hard-disk drives or storage drives failed frequently. Disk failures include the formation of bad sectors, unreachability of the disk, disk head crash or any other failure that destroys all or part of disk storage.

Database Errors
A database represents an essential corporate resource that should be properly secured using appropriate controls. We consider database security in relation to the following situations:
• Loss of availability.
• Loss of integrity.
• Loss of confidentiality (secrecy).
• Theft and fraud.
Any situation or event, whether intentional or accidental, can cause damage that adversely affects the database structure and, consequently, the organization.

• Availability loss − Availability loss refers to the non-availability of database objects to legitimate users.
• Integrity loss − Integrity loss occurs when unacceptable operations are performed upon the database either accidentally or maliciously. This may happen while creating, inserting, updating or deleting data. It results in corrupted data leading to incorrect decisions.
• Confidentiality loss − Confidentiality loss occurs due to unauthorized or unintentional disclosure of confidential information. It may result in illegal actions, security threats and loss of public confidence.
• Theft and fraud − A threat is a situation or event, involving a person, an action or a circumstance, that is likely to bring harm to an organization and its database. Some common types of threats include the following.
o People : An attempt is made, either intentionally or unintentionally, to cause harm or damage to the database environment fully or partly. For instance, applications, operating systems, the DBMS and networks can be damaged by employees, government authorities, consultants, visitors, hackers, terrorists, intruders etc.
o Natural disasters : Events such as earthquakes and floods can harm all components of the database environment, or a single component, i.e., partially.
o Malicious code : Software becomes malicious if it contains code which is included in order to harm or damage database management components. Malicious code can take the form of bugs, macro code, boot sector viruses, worms, Trojan horses, denial of service, spoofing and email spam.
o Technological disaster : This type of disaster occurs due to the malfunction of equipment and devices, which causes damage to operating systems, networks and data files. Examples of technological disasters are power failures, media failure, network failure and hardware failure.

Database Security
Database security is the protection of the Database against intentional or unintentional threats using
computer-based or non-computer-based controls.

Authentication
Authentication is about validating credentials such as a user name/user ID and password to verify a user’s identity. The system uses these credentials to determine whether you are who you say you are. It includes
• Username and Password
• Plastic ID Card
• Encryption
Data Encryption − Data encryption refers to coding data when sensitive data is to be communicated over public channels. Even if an unauthorized agent gains access to the data, they cannot understand it, since it is in an incomprehensible format.
• Cryptography is the science of encoding information before sending it via unreliable communication paths so that only an authorized receiver can decode and use it.
• The coded message is called cipher text and the original message is called plain text.
• Encryption − The process of converting plain text to cipher text by the sender is called encoding or encryption.
• Decryption − The process of converting cipher text to plain text by the receiver is called decoding or decryption.
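
The round trip from plain text to cipher text and back can be illustrated with a toy XOR cipher. This is a teaching sketch only and is NOT secure; a real system would use an established encryption algorithm from a vetted library. The function names and the sample key are assumptions.

```python
from itertools import cycle

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def encrypt(plain_text: str, key: bytes) -> bytes:
    """Sender side: convert plain text to cipher text (toy cipher, insecure)."""
    return xor_bytes(plain_text.encode("utf-8"), key)

def decrypt(cipher_text: bytes, key: bytes) -> str:
    """Receiver side: convert cipher text back to plain text with the same key."""
    return xor_bytes(cipher_text, key).decode("utf-8")

key = b"k3y"                                   # hypothetical shared secret
cipher_text = encrypt("balance=1000", key)
print(decrypt(cipher_text, key))               # balance=1000
```

Only a receiver who holds the same key can turn the cipher text back into the original message.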

Authorization
Authorization is the culmination of the administrative policies of the organization. As the name suggests, authorization is a set of rules that determine which user has what type of access to which portion of the database. The person who writes access rules is called an authorizer.
An authorizer may set several forms of authorization on parts of the database. Among them are the
following:
1. Read Authorization: allows reading, but not modification of data.

2. Insert Authorization: allows insertion of new data, but not the modification of existing data, e.g.
insertion of tuple in a relation.

3. Update authorization: allows modification of data, but not its deletion. But data items like primary-key
attributes may not be modified.

4. Delete authorization: allows deletion of data only.
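
The four forms of authorization can be modelled as a rule table that is consulted before each operation. The sketch below is illustrative: the users, the relation name `account` and the names `RULES` and `is_authorized` are assumptions, not part of any particular DBMS.

```python
# Hypothetical rule table written by the authorizer:
# (user, relation) -> set of authorizations granted on that relation.
RULES = {
    ("clerk", "account"): {"read", "insert"},
    ("manager", "account"): {"read", "insert", "update", "delete"},
}

def is_authorized(user, relation, action):
    """True if `user` holds `action` (read/insert/update/delete) on `relation`."""
    return action in RULES.get((user, relation), set())

print(is_authorized("clerk", "account", "read"))      # True
print(is_authorized("clerk", "account", "update"))    # False
print(is_authorized("visitor", "account", "read"))    # False: no rule, no access
```

A user with no matching rule gets no access at all, so every permission has to be granted explicitly.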
