CONCURRENCY CONTROL

Uploaded by Samuel Udoema

INTRODUCTION

In information technology and computer science, especially in the fields
of computer programming, operating systems, multiprocessors, and databases,
concurrency control ensures that correct results for concurrent operations are
generated, while getting those results as quickly as possible.
Computer systems, both software and hardware, consist of modules, or
components. Each component is designed to operate correctly, i.e., to obey or to
meet certain consistency rules. When components that operate concurrently
interact by messaging or by sharing accessed data (in memory or storage), a certain
component's consistency may be violated by another component. The general area
of concurrency control provides rules, methods, design methodologies,
and theories to maintain the consistency of components operating concurrently
while interacting, and thus the consistency and correctness of the whole system.
Introducing concurrency control into a system means applying operation
constraints, which typically results in some performance reduction. Operation
consistency and correctness should be achieved with the best possible
efficiency, without reducing performance below reasonable levels. Concurrency
control can require significant additional complexity and overhead in a concurrent
algorithm compared to the simpler sequential algorithm.
For example, a failure in concurrency control can result in data corruption
from torn read or write operations.

WHAT IS DATABASE CONCURRENCY


Database concurrency is the ability of a database to allow multiple users to
execute multiple transactions at the same time. This is one of the main properties
that separates a database from other forms of data storage, such as spreadsheets,
where other users can read the file but may not edit data concurrently.
DATABASE TRANSACTION AND THE ACID RULES
The concept of a database transaction (or atomic transaction) has evolved to
enable both well-understood database system behavior in a faulty environment,
where crashes can happen at any time, and recovery from a crash to a well-understood
database state. A database transaction is a unit of work, typically
encapsulating a number of operations over a database (e.g., reading a database
object, writing, acquiring a lock, etc.), an abstraction supported in databases and
other systems. Each transaction has well-defined boundaries in terms of which
program/code executions are included in that transaction (determined by the
transaction's programmer via special transaction commands). Every database
transaction obeys the following rules (by support in the database system; i.e., a
database system is designed to guarantee them for the transactions it runs):

 Atomicity - Either the effects of all or none of its operations remain ("all or
nothing" semantics) when a transaction is completed
(committed or aborted, respectively). In other words, to the outside world a
committed transaction appears (by its effects on the database) to be indivisible
(atomic), and an aborted transaction does not affect the database at all.
 Consistency - Every transaction must leave the database in a consistent
(correct) state, i.e., maintain the predetermined integrity rules of the database
(constraints upon and among the database's objects). A transaction must
transform a database from one consistent state to another consistent state
(however, it is the responsibility of the transaction's programmer to make sure
that the transaction itself is correct, i.e., performs correctly what it intends to
perform (from the application's point of view) while the predefined integrity
rules are enforced by the DBMS). Thus, since a database can normally be
changed only by transactions, all of the database's states are consistent.
 Isolation - Transactions cannot interfere with each other (as an end result of
their executions). Moreover, usually (depending on concurrency control
method) the effects of an incomplete transaction are not even visible to another
transaction. Providing isolation is the main goal of concurrency control.
 Durability - Effects of successful (committed) transactions must persist
through crashes (typically by recording the transaction's effects and its commit
event in a non-volatile memory).
The concept of the atomic transaction has been extended over the years to what
has become the business transaction, which actually implements types of workflow
and is not atomic. However, even such enhanced transactions typically utilize
atomic transactions as components.
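The "all or nothing" semantics of atomicity can be demonstrated with SQLite's built-in transaction support. The sketch below is illustrative: the `account` table, balances, and the simulated crash are all made up for the example.

```python
import sqlite3

# A minimal sketch of atomicity: a two-step transfer is aborted midway,
# and the rollback leaves the database exactly as it was before.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 100)")
conn.commit()

try:
    # Transfer 50 from account 1 to account 2 as one transaction.
    conn.execute("UPDATE account SET balance = balance - 50 WHERE id = 1")
    raise RuntimeError("simulated crash before the second update")
    conn.execute("UPDATE account SET balance = balance + 50 WHERE id = 2")
    conn.commit()
except RuntimeError:
    conn.rollback()  # abort: neither update persists

balances = dict(conn.execute("SELECT id, balance FROM account"))
print(balances)  # both balances unchanged: {1: 100, 2: 100}
```

Because the first UPDATE was rolled back along with the failed transfer, no partial effect is visible to other transactions.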

WHY IS CONCURRENCY CONTROL NEEDED


If transactions are executed serially, i.e., sequentially with no overlap in time, no
transaction concurrency exists. However, if concurrent transactions with
interleaving operations are allowed in an uncontrolled manner, some unexpected,
undesirable results may occur, such as:

1. The lost update problem: A second transaction writes a second value of a
data item (datum) on top of a first value written by a first concurrent
transaction, and the first value is lost to other concurrently running
transactions that need, by their precedence, to read the first value. The
transactions that have read the wrong value end with incorrect results.
2. The dirty read problem: Transactions read a value written by a transaction
that is later aborted. This value disappears from the database upon
abort, and should not have been read by any transaction ("dirty read"). The
reading transactions end with incorrect results.
3. The incorrect summary problem: While one transaction takes a summary
over the values of all the instances of a repeated data-item, a second
transaction updates some instances of that data-item. The resulting summary
does not reflect a correct result for any (usually needed for correctness)
precedence order between the two transactions (if one is executed before the
other), but rather some random result, depending on the timing of the
updates, and whether certain update results have been included in the
summary or not.
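The lost update problem can be reproduced deterministically by interleaving two read-modify-write sequences by hand; the data item name and values below are illustrative.

```python
# A deterministic sketch of the lost update problem: two transactions
# each read the same item before either writes it back, so the first
# update is silently overwritten.
db = {"x": 100}

t1_read = db["x"]          # T1 reads 100
t2_read = db["x"]          # T2 reads 100 (before T1 writes)
db["x"] = t1_read + 30     # T1 writes 130
db["x"] = t2_read + 50     # T2 writes 150, overwriting T1's update

print(db["x"])  # 150, not the expected 180: T1's update is lost
```

Under a correct concurrency control protocol, T2 would have been forced to read the value produced by T1 (or to restart), yielding 180.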
Most high-performance transactional systems need to run transactions concurrently
to meet their performance requirements. Thus, without concurrency control such
systems can neither provide correct results nor maintain their databases
consistently.

LOCKING BASED CONCURRENCY CONTROL PROTOCOLS


Locking-based concurrency control protocols use the concept of locking data
items. A lock is a variable associated with a data item that determines whether
read/write operations can be performed on that data item. Generally, a lock
compatibility matrix is used which states whether a data item can be locked by
two transactions at the same time.
Locking-based concurrency control systems can use either one-phase or two-
phase locking protocols.
One-phase Locking Protocol
In this method, each transaction locks an item before use and releases the lock
as soon as it has finished using it. This locking method provides for maximum
concurrency but does not always enforce serializability.
Two-phase Locking Protocol
In this method, all locking operations precede the first lock-release or unlock
operation. The transaction comprises two phases. In the first phase, a
transaction only acquires locks and does not release any lock.
This is called the expanding or growing phase. In the second phase, the
transaction releases locks and cannot request any new locks. This is called
the shrinking phase.
Every transaction that follows the two-phase locking protocol is guaranteed to be
serializable. However, this approach provides low parallelism between two
conflicting transactions.
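The growing/shrinking discipline can be sketched as a small class that rejects any lock request made after the first unlock. This is a minimal illustration of the rule itself, not a full lock manager (there is no compatibility matrix or blocking here).

```python
# A minimal sketch of two-phase locking: once a transaction releases
# any lock (entering the shrinking phase), further lock acquisitions
# are rejected.
class TwoPhaseLockingTxn:
    def __init__(self):
        self.held = set()
        self.shrinking = False  # False = still in the growing phase

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock after first unlock")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True   # the first unlock ends the growing phase
        self.held.discard(item)

txn = TwoPhaseLockingTxn()
txn.lock("A")
txn.lock("B")        # growing phase: acquire all needed locks first
txn.unlock("A")      # shrinking phase begins
try:
    txn.lock("C")    # not allowed under 2PL
    violated = False
except RuntimeError:
    violated = True
print(violated)  # True: the late lock request was rejected
```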

TIMESTAMP CONCURRENCY CONTROL ALGORITHMS


Timestamp-based concurrency control algorithms use a transaction’s timestamp to
coordinate concurrent access to a data item to ensure serializability. A timestamp
is a unique identifier given by DBMS to a transaction that represents the
transaction’s start time.
These algorithms ensure that transactions commit in the order dictated by their
timestamps. An older transaction should commit before a younger transaction,
since the older transaction enters the system before the younger one.
Timestamp-based concurrency control techniques generate serializable schedules
such that the equivalent serial schedule is arranged in order of the age of the
participating transactions.
Some timestamp-based concurrency control algorithms are −
 Basic timestamp ordering algorithm.
 Conservative timestamp ordering algorithm.
 Multiversion algorithm based upon timestamp ordering.
Timestamp-based ordering follows three rules to enforce serializability −
 Access Rule − When two transactions try to access the same data item
simultaneously, for conflicting operations, priority is given to the older
transaction. This causes the younger transaction to wait for the older
transaction to commit first.
 Late Transaction Rule − If a younger transaction has written a data item,
then an older transaction is not allowed to read or write that data item. This
rule prevents the older transaction from committing after the younger
transaction has already committed.
 Younger Transaction Rule − A younger transaction can read or write a
data item that has already been written by an older transaction.
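The three rules can be sketched as the classic basic timestamp ordering check: each data item tracks the largest timestamps that have read and written it, and an operation arriving "too late" is rejected (in a real DBMS the transaction would be aborted and restarted with a new timestamp). The item name and timestamp values are illustrative.

```python
# A sketch of basic timestamp ordering for a single data item "x".
read_ts = {"x": 0}   # largest timestamp that has read each item
write_ts = {"x": 0}  # largest timestamp that has written each item

def read(item, ts):
    if ts < write_ts[item]:          # a younger txn already wrote it
        return False                 # late transaction rule: reject
    read_ts[item] = max(read_ts[item], ts)
    return True

def write(item, ts):
    if ts < read_ts[item] or ts < write_ts[item]:
        return False                 # reject the late writer
    write_ts[item] = ts
    return True

print(write("x", 5))   # True: transaction with timestamp 5 writes x
print(read("x", 3))    # False: txn 3 is older than the writer (ts 5)
print(read("x", 7))    # True: a younger txn may read the older write
```

The second call illustrates the late transaction rule, and the third illustrates the younger transaction rule.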

OPTIMISTIC CONCURRENCY CONTROL ALGORITHM


In systems with low conflict rates, the task of validating every transaction for
serializability may lower performance. In these cases, the test for serializability is
postponed to just before commit. Since the conflict rate is low, the probability of
aborting transactions which are not serializable is also low. This approach is
called optimistic concurrency control technique.
In this approach, a transaction’s life cycle is divided into the following three
phases −
 Execution Phase − A transaction fetches data items to memory and
performs operations upon them.
 Validation Phase − A transaction performs checks to ensure that
committing its changes to the database passes the serializability test.
 Commit Phase − A transaction writes back the modified data items in
memory to the disk.
This algorithm uses three rules to enforce serializability in the validation phase −
Rule 1 − Given two transactions Ti and Tj, if Ti is reading the data item which
Tj is writing, then Ti’s execution phase cannot overlap with Tj’s commit phase.
Tj can commit only after Ti has finished execution.
Rule 2 − Given two transactions Ti and Tj, if Ti is writing the data item that Tj is
reading, then Ti’s commit phase cannot overlap with Tj’s execution phase. Tj can
start executing only after Ti has already committed.
Rule 3 − Given two transactions Ti and Tj, if Ti is writing the data item which Tj is
also writing, then Ti’s commit phase cannot overlap with Tj’s commit phase.
Tj can start to commit only after Ti has already committed.
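The validation phase can be sketched as a read-set/write-set intersection test: a committing transaction is valid only if no transaction that committed during its execution wrote an item it read. The sets below are illustrative, and this is only one simple form of validation (often called backward validation), not the full three-rule test.

```python
# A sketch of the validation phase in optimistic concurrency control.
def validate(txn_read_set, overlapping_committed_write_sets):
    for ws in overlapping_committed_write_sets:
        if txn_read_set & ws:        # read-write conflict detected
            return False             # abort and restart the transaction
    return True

# T reads {x, y}; while it ran, one transaction committed writes to
# {z} and another committed writes to {y}.
ok = validate({"x", "y"}, [{"z"}])
bad = validate({"x", "y"}, [{"z"}, {"y"}])
print(ok)   # True: no overlap, T may commit
print(bad)  # False: y was overwritten, T must abort
```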

CONCURRENCY CONTROL IN DISTRIBUTED SYSTEMS


In this section, we will see how the above techniques are implemented in a
distributed database system.

Distributed Two-phase Locking Algorithm


The basic principle of distributed two-phase locking is the same as the basic
two-phase locking protocol. However, in a distributed system there are sites
designated as lock managers. A lock manager controls lock acquisition requests
from transaction monitors. In order to enforce coordination between the lock
managers at various sites, at least one site is given the authority to see all
transactions and detect lock conflicts.
Depending upon the number of sites that can detect lock conflicts, distributed
two-phase locking approaches can be of three types −
 Centralized two-phase locking − In this approach, one site is designated as
the central lock manager. All the sites in the environment know the location
of the central lock manager and obtain locks from it during transactions.
 Primary copy two-phase locking − In this approach, a number of sites are
designated as lock control centers. Each of these sites has the responsibility
of managing a defined set of locks. All the sites know which lock control
center is responsible for managing the locks of which data table/fragment.
 Distributed two-phase locking − In this approach, there are a number of
lock managers, where each lock manager controls locks of data items stored
at its local site. The location of the lock manager is based upon data
distribution and replication.
Distributed Timestamp Concurrency Control
In a centralized system, the timestamp of any transaction is determined by the
physical clock reading. But in a distributed system, any site's local
physical/logical clock readings cannot be used as global timestamps, since they
are not globally unique. So, a timestamp is a combination of the site ID and
that site's clock reading.
For implementing timestamp ordering algorithms, each site has a scheduler that
maintains a separate queue for each transaction manager. During a transaction, a
transaction manager sends a lock request to the site's scheduler. The scheduler
puts the request into the corresponding queue in increasing timestamp order.
Requests are processed from the front of the queues in the order of their
timestamps, i.e., the oldest first.
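The combination of site ID and local clock reading can be sketched as a tuple that compares clock-first, so timestamps remain globally unique and totally ordered even when two sites read the same clock value. The clock values and site IDs are illustrative.

```python
# A sketch of globally unique distributed timestamps: each site combines
# its local clock reading with its site ID; tuple comparison orders by
# clock first, and the site ID breaks ties between sites.
def make_timestamp(local_clock, site_id):
    return (local_clock, site_id)

ts_a = make_timestamp(17, site_id=1)
ts_b = make_timestamp(17, site_id=2)  # same clock reading, other site
ts_c = make_timestamp(18, site_id=1)

print(ts_a < ts_b < ts_c)  # True: unique and totally ordered
```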

Conflict Graphs
Another method is to create conflict graphs. For this, transaction classes are
defined. A transaction class contains two sets of data items, called the read set
and the write set. A transaction belongs to a particular class if the transaction's
read set is a subset of the class' read set and the transaction's write set is a
subset of the class' write set. In the read phase, each transaction issues read
requests for the data items in its read set. In the write phase, each transaction
issues its write requests.
A conflict graph is created for the classes to which active transactions belong.
This contains a set of vertical, horizontal, and diagonal edges. A vertical edge
connects two nodes within a class and denotes conflicts within the class. A
horizontal edge connects two nodes across two classes and denotes a write-write
conflict among different classes. A diagonal edge connects two nodes across two
classes and denotes a write-read or a read-write conflict among two classes.
The conflict graphs are analyzed to ascertain whether two transactions within the
same class or across two different classes can be run in parallel.

Distributed Optimistic Concurrency Control Algorithm


The distributed optimistic concurrency control algorithm extends the basic
optimistic concurrency control algorithm. For this extension, two rules are applied −
Rule 1 − According to this rule, a transaction must be validated locally at all
sites where it executes. If a transaction is found to be invalid at any site, it is
aborted. Local validation guarantees that the transaction maintains serializability
at the sites where it has been executed. After a transaction passes the local
validation test, it is globally validated.
Rule 2 − According to this rule, after a transaction passes the local validation
test, it should be globally validated. Global validation ensures that if two
conflicting transactions run together at more than one site, they commit in the
same relative order at all the sites where they run together. This may require a
transaction to wait for the other conflicting transaction after validation, before
commit. This requirement makes the algorithm less optimistic, since a transaction
may not be able to commit as soon as it is validated at a site.
