
B.Tech. AIML/CSE LNU BHOPAL.

CS/AL-502: Database Management System

Unit-4
Transaction
A transaction can be defined as a group of tasks. A single task is the minimum processing
unit, which cannot be divided further.
o The transaction is a set of logically related operations. It contains a group of tasks.
o A transaction is an action, or series of actions, performed by a single user to access or
modify the contents of the database.

Example: Suppose a bank employee transfers Rs 800 from X's account to Y's account. This small
transaction contains several low-level tasks:
X's Account
1. Open_Account(X)
2. Old_Balance = X.balance
3. New_Balance = Old_Balance - 800
4. X.balance = New_Balance
5. Close_Account(X)
Y's Account
1. Open_Account(Y)
2. Old_Balance = Y.balance
3. New_Balance = Old_Balance + 800
4. Y.balance = New_Balance
5. Close_Account(Y)
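The low-level steps above can be sketched in Python. This is a minimal sketch; the `Account` class is a hypothetical stand-in for real account records, not a DBMS API:

```python
# Minimal sketch of the transfer; Account is a hypothetical stand-in
# for the bank's real account records.
class Account:
    def __init__(self, balance):
        self.balance = balance

def transfer(x: Account, y: Account, amount: int) -> None:
    # X's account: debit
    old_balance = x.balance           # Old_Balance = X.balance
    x.balance = old_balance - amount  # X.balance = New_Balance
    # Y's account: credit
    old_balance = y.balance           # Old_Balance = Y.balance
    y.balance = old_balance + amount  # Y.balance = New_Balance

x, y = Account(1000), Account(200)
transfer(x, y, 800)
print(x.balance, y.balance)  # 200 1000
```

Note that nothing here makes the two updates atomic yet; that is exactly the problem the transaction machinery described next is meant to solve.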

Operations of Transaction
Following are the main operations of a transaction:
Read(X): The read operation is used to read the value of X from the database and store it in a buffer in
main memory.
Write(X): The write operation is used to write the value back to the database from the buffer.

Let's take the example of a debit transaction on an account, which consists of the following operations:
1. R(X);
2. X = X - 500;
3. W(X);

Let's assume the value of X before the start of the transaction is 4000.


o The first operation reads X's value from the database and stores it in a buffer.
o The second operation decreases the value of X by 500, so the buffer will contain 3500.
o The third operation writes the buffer's value to the database, so X's final value will be
3500.
But a transaction may fail before finishing all the operations in the set because of a
hardware, software, or power failure.

For example: If the above debit transaction fails after executing operation 2,
then X's value will remain 4000 in the database even though the debit appeared to happen,
which is not acceptable to the bank.

To solve this problem, we have two important operations:


Commit: It is used to save the work done permanently.
Rollback: It is used to undo the work done.
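The buffer-and-database behaviour described above can be sketched as follows. This is an illustrative model only: a Python dict stands in for the database, and a local variable stands in for the main-memory buffer:

```python
# A dict stands in for the database; a local variable is the buffer.
database = {"X": 4000}

def debit(amount, fail=False):
    buffer = database["X"]   # R(X): read the value into a buffer
    buffer -= amount         # X = X - 500 (buffer now holds the new value)
    if fail:
        return               # failure before W(X): the database was never
                             # touched, so rollback is trivially satisfied
    database["X"] = buffer   # W(X) then commit: the change becomes permanent

debit(500, fail=True)
print(database["X"])  # 4000 - the failed transaction changed nothing
debit(500)
print(database["X"])  # 3500 - the committed transaction persists
```

Real systems cannot rely on failures conveniently happening before the write; that is why commit and rollback are explicit operations backed by logging, discussed later in this unit.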

ACID Properties
A transaction is a very small unit of a program, and it may contain several low-level tasks. A
transaction in a database system must maintain Atomicity, Consistency, Isolation,
and Durability − commonly known as the ACID properties − in order to ensure accuracy,
completeness, and data integrity.
 Atomicity − This property states that a transaction must be treated as an atomic unit,
that is, either all of its operations are executed or none. There must be no state in a
database where a transaction is left partially completed. States should be defined either
before the execution of the transaction or after the execution/abortion/failure of the
transaction.
 Consistency − The database must remain in a consistent state after any transaction. No
transaction should have any adverse effect on the data residing in the database. If the
database was in a consistent state before the execution of a transaction, it must remain
consistent after the execution of the transaction as well.
 Durability − The database should be durable enough to hold all its latest updates even if
the system fails or restarts. If a transaction updates a chunk of data in a database and
commits, then the database will hold the modified data. If a transaction commits but the
system fails before the data could be written on to the disk, then that data will be
updated once the system springs back into action.
 Isolation − In a database system where more than one transaction is being executed
simultaneously and in parallel, the property of isolation states that each transaction
will be carried out and executed as if it were the only transaction in the system. No
transaction will affect the execution of any other transaction.

States of Transactions
A transaction in a database can be in one of the following states −

 Active − In this state, the transaction is being executed. This is the initial state of every
transaction.
 Partially Committed − When a transaction executes its final operation, it is said to be in
a partially committed state.
 Failed − A transaction is said to be in a failed state if any of the checks made by the
database recovery system fails. A failed transaction can no longer proceed further.
 Aborted − If any of the checks fails and the transaction has reached a failed state, then
the recovery manager rolls back all its write operations on the database to bring the
database back to its original state where it was prior to the execution of the transaction.
Transactions in this state are called aborted. The database recovery module can select
one of the two operations after a transaction aborts −
o Re-start the transaction
o Kill the transaction
 Committed − If a transaction executes all its operations successfully, it is said to be
committed. All its effects are now permanently established on the database system.

Transaction Properties
A transaction has four properties. These are used to maintain consistency in the database,
before and after the transaction.
Property of Transaction
1. Atomicity
2. Consistency
3. Isolation
4. Durability

Atomicity
o It states that either all operations of the transaction take place, or none do; otherwise,
the transaction is aborted.
o There is no midway, i.e., the transaction cannot occur partially. Each transaction is
treated as one unit and either runs to completion or is not executed at all.
Atomicity involves the following two operations:
Abort: If a transaction aborts then all the changes made are not visible.
Commit: If a transaction commits then all the changes made are visible.

Consistency
o The integrity constraints are maintained so that the database is consistent before and
after the transaction.
o The execution of a transaction will leave a database in either its prior stable state or a
new stable state.
o The consistent property of database states that every transaction sees a consistent
database instance.
o The transaction is used to transform the database from one consistent state to another
consistent state.

Isolation
o It means that the data being used during the execution of one transaction cannot be
used by a second transaction until the first one is completed.
o In isolation, if transaction T1 is being executed and is using the data item X, then that
data item can't be accessed by any other transaction T2 until transaction T1 ends.
o The concurrency control subsystem of the DBMS enforces the isolation property.

Durability
o The durability property states that once a transaction completes, the changes it made
are permanent.
o These changes cannot be lost by the erroneous operation of a faulty transaction or by a
system failure. When a transaction is completed, the database reaches a state known as
the consistent state. That consistent state cannot be lost, even in the event of a
system failure.
o The recovery subsystem of the DBMS is responsible for the durability property.
What is the term serializability in DBMS?

A schedule is serializable if it is equivalent to a serial schedule. A concurrent
schedule must produce the same result as if the transactions were executed serially, i.e., one after
another. Serializability refers to the sequence in which actions such as read, write, abort, and
commit are performed.

Example
Let’s take two transactions T1 and T2,

If both transactions are performed without interfering with each other, it is called
a serial schedule. It can be represented as follows −

T1              T2
READ1(A)
WRITE1(A)
READ1(B)
C1
                READ2(B)
                WRITE2(B)
                READ2(B)
                C2
Non-serial schedule − When the operations of transactions T1 and T2 are interleaved
(overlapped), the schedule is non-serial.

Example
Consider the following example −

T1              T2
READ1(A)
WRITE1(A)
                READ2(B)
                WRITE2(B)
READ1(B)
WRITE1(B)
READ1(B)

Types of serializability
There are two types of serializability –

View serializability

A schedule is view-serializable if it is view-equivalent to a serial schedule.

Two schedules are view-equivalent if they follow these rules −

 If a transaction reads the initial value of a data item in one schedule, it must also
read the initial value of that item in the other schedule.
 If a transaction reads a value written by another transaction in one schedule, it must
read the value written by that same transaction in the other schedule.
 The transaction that performs the final write on a data item must be the same in
both schedules.

Conflict serializability
It orders any conflicting operations in the same way as some serial execution. A pair
of operations is said to conflict if they operate on the same data item and at least one of
them is a write operation.

That means

 Readi(x) Readj(x) − non-conflicting read-read operation
 Readi(x) Writej(x) − conflicting read-write operation
 Writei(x) Readj(x) − conflicting write-read operation
 Writei(x) Writej(x) − conflicting write-write operation

where i and j denote two different transactions Ti and Tj.
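The conflict rules above can be expressed as a small predicate. This is an illustrative sketch; the `(transaction, kind, item)` encoding of an operation is an assumption made for the example:

```python
def conflicts(op1, op2):
    """Two operations conflict if they belong to different transactions,
    touch the same data item, and at least one of them is a write.
    Each op is a (transaction, 'R' or 'W', item) tuple."""
    txn1, kind1, item1 = op1
    txn2, kind2, item2 = op2
    return (txn1 != txn2 and item1 == item2
            and "W" in (kind1, kind2))

print(conflicts((1, "R", "x"), (2, "R", "x")))  # False: read-read
print(conflicts((1, "R", "x"), (2, "W", "x")))  # True: read-write
print(conflicts((1, "W", "x"), (2, "R", "x")))  # True: write-read
print(conflicts((1, "W", "x"), (2, "W", "x")))  # True: write-write
```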

Precedence graph

It is used to check conflict serializability.

The steps to check conflict serializability are as follows −

 For each transaction T, put a node or vertex in the graph.
 For each conflicting pair in which an operation of Ti precedes a conflicting operation
of Tj, put an edge from Ti to Tj.
 If there is a cycle in the graph, then the schedule is not conflict serializable;
otherwise, the schedule is conflict serializable.
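The steps above can be sketched directly: build the edges from conflicting pairs, then test the graph for a cycle with a depth-first search. The schedule encoding is the same illustrative `(transaction, kind, item)` convention, an assumption of this sketch:

```python
from collections import defaultdict

def is_conflict_serializable(schedule):
    """schedule: list of (txn, 'R' or 'W', item) in execution order.
    Builds the precedence graph and tests it for a cycle via DFS."""
    edges = defaultdict(set)
    txns = {t for t, _, _ in schedule}
    for i, (t1, k1, x1) in enumerate(schedule):
        for t2, k2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (k1, k2):
                edges[t1].add(t2)   # Ti's op precedes Tj's conflicting op

    WHITE, GRAY, BLACK = 0, 1, 2    # unvisited / on DFS stack / done
    color = {t: WHITE for t in txns}

    def has_cycle(t):
        color[t] = GRAY
        for nxt in edges[t]:
            if color[nxt] == GRAY or (color[nxt] == WHITE and has_cycle(nxt)):
                return True         # back edge found: cycle
        color[t] = BLACK
        return False

    return not any(color[t] == WHITE and has_cycle(t) for t in txns)

# T1 and T2 write x in an interleaved order: edges T1->T2 and T2->T1, a cycle.
s = [(1, "R", "x"), (2, "W", "x"), (1, "W", "x"), (2, "R", "x")]
print(is_conflict_serializable(s))  # False
```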

Example 1
(Precedence graph figure omitted.) A cycle is present, so the schedule is not conflict serializable.
Example 2
(Precedence graph figure omitted.) No cycle is present, so the schedule is conflict serializable.


Transaction failure
A transaction failure occurs when a transaction fails to execute or reaches a point from
which it can't go any further. If a transaction or process is damaged partway through, this is
called a transaction failure.

Reasons for a transaction failure could be −

1. Logical errors: If a transaction cannot complete due to some code error or an internal
error condition, then a logical error occurs.
2. System errors: These occur when the DBMS itself terminates an active transaction
because the database system is not able to execute it. For example, the system
aborts an active transaction in case of deadlock or resource unavailability.

Recoverability in DBMS
The characteristics of non-serializable schedules are as follows −

 The transactions may or may not be consistent.
 The transactions may or may not be recoverable.

Recoverable and irrecoverable are classifications of non-serializable schedules.

Irrecoverable schedules
If a transaction performs a dirty read from an uncommitted transaction and commits before
the transaction from which it read the value, then such a schedule is called an irrecoverable
schedule.

Example

Let us consider a schedule of two transactions as shown below −

T1              T2
Read(A)
Write(A)
                Read(A)    // dirty read
                Write(A)
                Commit
Rollback

The above schedule is irrecoverable because of the reasons mentioned below −

 Transaction T2 performs a dirty read operation on A.
 Transaction T2 commits before the completion of transaction T1.
 Transaction T1 fails later and rolls back.
 Transaction T2 has therefore read an incorrect value.
 Finally, transaction T2 cannot recover because it has already committed.

Recoverable Schedules

If a transaction performs a dirty read from an uncommitted transaction, and its commit
operation is delayed until the uncommitted transaction either commits or rolls back, then
such a schedule is called a recoverable schedule.

Example

Let us consider a schedule of two transactions as given below −

T1              T2
Read(A)
Write(A)
                Read(A)    // dirty read
                Write(A)
Commit
                Commit     // delayed

The above schedule is a recoverable schedule because of the reasons mentioned below −

 Transaction T2 performs a dirty read operation on A.
 The commit operation of transaction T2 is delayed until transaction T1 commits or
rolls back.
 Transaction T1 commits first, and only then is transaction T2 allowed to commit.
 If transaction T1 fails instead, transaction T2 still has a chance to recover by
rolling back.
Log-Based Recovery
o The log is a sequence of records. The log of each transaction is maintained in
stable storage so that if any failure occurs, the transaction can be recovered from there.
o If any operation is performed on the database, then it will be recorded in the log.
o But the process of storing the logs should be done before the actual modification
is applied to the database.

Let's assume there is a transaction to modify the City of a student. The following logs are
written for this transaction.
o When the transaction is initiated, it writes a 'start' log.
<Tn, Start>
o When the transaction modifies the City from 'Noida' to 'Bangalore', another log
is written to the file.
<Tn, City, 'Noida', 'Bangalore'>
o When the transaction is finished, it writes another log to indicate the end of
the transaction.
<Tn, Commit>
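The write-ahead idea − the log record reaches stable storage before the database is touched − can be sketched as follows. All the names here (`log`, `database`, `update_city`) are illustrative, with a list standing in for stable storage:

```python
log = []                        # stands in for stable storage
database = {"City": "Noida"}

def update_city(txn, new_value):
    log.append(f"<{txn}, Start>")
    old = database["City"]
    log.append(f"<{txn}, City, '{old}', '{new_value}'>")  # logged first...
    database["City"] = new_value                          # ...then applied
    log.append(f"<{txn}, Commit>")

update_city("Tn", "Bangalore")
print(log)
# ["<Tn, Start>", "<Tn, City, 'Noida', 'Bangalore'>", "<Tn, Commit>"]
```

Because the old and new values are both in the log before the database changes, a recovery routine can either undo (restore 'Noida') or redo (reapply 'Bangalore') the update after a crash.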

There are two approaches to modify the database:


1. Deferred database modification:
o The deferred modification technique occurs if the transaction does not modify the
database until it has committed.
o In this method, all the logs are created and stored in the stable storage, and the
database is updated when a transaction commits.
2. Immediate database modification:
o The immediate modification technique applies when database modification occurs while the
transaction is still active.
o In this technique, the database is modified immediately after every operation; each write
performs an actual database modification.

Recovery using Log records

When the system crashes, it consults the log to find which transactions need to be
undone and which need to be redone.
1. If the log contains the records <Ti, Start> and <Ti, Commit> or <Ti, Abort>, then
transaction Ti needs to be redone.
2. If the log contains the record <Ti, Start> but contains neither <Ti, Commit> nor
<Ti, Abort>, then transaction Ti needs to be undone.
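The two rules can be expressed directly. This sketch assumes the log has already been parsed into simple `(action, transaction)` pairs, which is an encoding chosen for illustration:

```python
def classify(log):
    """Return (redo, undo) transaction sets from a list of
    ('Start' | 'Commit' | 'Abort', txn) log entries."""
    started, finished = set(), set()
    for action, txn in log:
        if action == "Start":
            started.add(txn)
        else:                       # Commit or Abort
            finished.add(txn)
    redo = started & finished       # Start plus Commit/Abort -> redo
    undo = started - finished       # Start but no Commit/Abort -> undo
    return redo, undo

log = [("Start", "T1"), ("Start", "T2"), ("Commit", "T1")]
print(classify(log))  # ({'T1'}, {'T2'})
```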
Checkpoint
o A checkpoint is a mechanism where all the previous logs are removed from the system and
permanently stored on the storage disk.
o The checkpoint is like a bookmark. During the execution of transactions, such checkpoints
are marked, and as each transaction executes, log files are created for its steps.
o When execution reaches the checkpoint, the transaction's updates are written to the database,
and up to that point, the entire log file is removed. The log file is then updated with the new
steps of the transaction until the next checkpoint, and so on.
o The checkpoint is used to declare a point before which the DBMS was in a consistent state and
all transactions were committed.
Recovery using Checkpoint
A recovery system recovers the database from a failure in the following manner:

The recovery system reads the log files from the end to the start, i.e., from T4 back to T1.
o The recovery system maintains two lists, a redo-list and an undo-list.
o A transaction is put into the redo-list if the recovery system sees a log with
<Tn, Start> and <Tn, Commit>, or just <Tn, Commit>. All the transactions in the redo-list
are redone, and their logs are saved.
o For example: In the log file, transactions T2 and T3 will have both <Tn, Start> and
<Tn, Commit>. Transaction T1 will have only <Tn, Commit> in the log file, because it
committed after the checkpoint was crossed. Hence, transactions T1, T2, and T3 are put
into the redo list.
o A transaction is put into the undo-list if the recovery system sees a log with <Tn, Start> but
no commit or abort log. All the transactions in the undo-list are undone, and their logs are
removed.
o For example: Transaction T4 will have only <Tn, Start>, so T4 will be put into the undo list,
since this transaction is not yet complete and failed midway.
Types of Checkpoints
There are basically two main types of Checkpoints:
1. Automatic Checkpoint
2. Manual Checkpoint
1. Automatic Checkpoint: These checkpoints occur at regular intervals, such as every hour or
every day. The intervals are set by the database administrator. They are generally used by
heavily updated databases, so that the data can be recovered easily in case of
failure.
2. Manual Checkpoint: These are checkpoints that are manually set by the database
administrator. Manual checkpoints are generally used for smaller databases and are
taken much less frequently, only when the database administrator sets them.

Advantages of Checkpoints
 Checkpoints help us in recovering the transaction of the database in case of a random
shutdown of the database.
 It enhances the consistency of the database when multiple transactions are
executing in the database simultaneously.
 It speeds up the data recovery process.
 Checkpoints work as a synchronization point between the database and the transaction log
file in the database.
 Checkpoint records in the log file are used to prevent unnecessary redo operations.
 Since dirty pages are flushed out continuously in the background, it has a very low overhead
and can be done frequently.
 Checkpoints provide the baseline information needed for the restoration of the lost state in
the event of a system failure.
 A database checkpoint keeps track of change information and enables incremental database
backup.
 A database storage checkpoint can be mounted, allowing regular file system operations to
be performed.
 Database checkpoints can be used for application solutions which include backup, recovery
or database modifications.

Disadvantages of Checkpoints
1. Database storage checkpoints can only be used to restore from logical errors (e.g., a
human error).
2. Because all the data blocks are on the same physical device, database storage checkpoints
cannot be used to restore files due to a media failure.

Real-Time Applications of Checkpoints


1. Backup and Recovery
2. Performance Optimization
3. Auditing

What are Deadlocks?


Deadlock is a state of a database system having two or more transactions, where each
transaction is waiting for a data item that is locked by another transaction. A
deadlock can be indicated by a cycle in the wait-for graph. This is a directed graph in which
the vertices denote transactions and the edges denote waits for data items.

For example, in the following wait-for graph, transaction T1 is waiting for data item X, which is
locked by T3. T3 is waiting for Y, which is locked by T2, and T2 is waiting for Z, which is locked
by T1. Hence, a waiting cycle is formed, and none of the transactions can proceed with execution.

Deadlock Handling
There are three classical approaches for deadlock handling, namely −
 Deadlock prevention.
 Deadlock avoidance.
 Deadlock detection and removal.
All of the three approaches can be incorporated in both a centralized and a distributed
database system.
Deadlock Prevention
The deadlock prevention approach does not allow any transaction to acquire locks that will
lead to deadlocks. The convention is that when more than one transaction requests a lock
on the same data item, only one of them is granted the lock.
One of the most popular deadlock prevention methods is pre-acquisition of all the locks. In
this method, a transaction acquires all the locks before starting to execute and retains the
locks for the entire duration of transaction. If another transaction needs any of the already
acquired locks, it has to wait until all the locks it needs are available. Using this approach, the
system is prevented from being deadlocked since none of the waiting transactions are holding
any lock.
Deadlock Avoidance
The deadlock avoidance approach handles deadlocks before they occur. It analyzes the
transactions and the locks to determine whether or not waiting leads to a deadlock.
The method can be briefly stated as follows. Transactions start executing and request data
items that they need to lock. The lock manager checks whether the lock is available. If it is
available, the lock manager allocates the data item and the transaction acquires the lock.
However, if the item is locked by some other transaction in incompatible mode, the lock
manager runs an algorithm to test whether keeping the transaction in waiting state will cause
a deadlock or not. Accordingly, the algorithm decides whether the transaction can wait or one
of the transactions should be aborted.
There are two algorithms for this purpose, namely wait-die and wound-wait. Let us assume
that there are two transactions, T1 and T2, where T1 tries to lock a data item which is already
locked by T2. The algorithms are as follows –

 Wait-Die − If T1 is older than T2, T1 is allowed to wait. Otherwise, if T1 is younger than T2,
T1 is aborted and later restarted.
 Wound-Wait − If T1 is older than T2, T2 is aborted and later restarted. Otherwise, if T1 is
younger than T2, T1 is allowed to wait.
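Both schemes compare transaction timestamps, where an older transaction has a smaller timestamp. A sketch of just the decision logic (function names are illustrative):

```python
def wait_die(ts_requester, ts_holder):
    """Wait-Die: an older requester waits; a younger requester dies
    (is aborted and later restarted with its original timestamp)."""
    return "wait" if ts_requester < ts_holder else "abort requester"

def wound_wait(ts_requester, ts_holder):
    """Wound-Wait: an older requester wounds (aborts) the lock holder;
    a younger requester waits."""
    return "abort holder" if ts_requester < ts_holder else "wait"

# T1 (timestamp 5, older) requests a lock held by T2 (timestamp 9, younger):
print(wait_die(5, 9))    # wait
print(wound_wait(5, 9))  # abort holder
# T1 (timestamp 9, younger) requests a lock held by T2 (timestamp 5, older):
print(wait_die(9, 5))    # abort requester
print(wound_wait(9, 5))  # wait
```

In both schemes only the younger transaction is ever restarted, so the oldest transaction always makes progress and no deadlock cycle can form.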

Deadlock Detection and Removal


The deadlock detection and removal approach runs a deadlock detection algorithm
periodically and removes deadlock in case there is one. It does not check for deadlock when a
transaction places a request for a lock. When a transaction requests a lock, the lock manager
checks whether it is available. If it is available, the transaction is allowed to lock the data item;
otherwise the transaction is allowed to wait.
Since there are no precautions while granting lock requests, some of the transactions may be
deadlocked. To detect deadlocks, the lock manager periodically checks if the wait-for graph
has cycles. If the system is deadlocked, the lock manager chooses a victim transaction from
each cycle. The victim is aborted and rolled back, and then restarted later. Some of the
methods used for victim selection are −
 Choose the youngest transaction.
 Choose the transaction with fewest data items.
 Choose the transaction that has performed least number of updates.
 Choose the transaction having least restart overhead.
 Choose the transaction which is common to two or more cycles.
DBMS Concurrency Control
Concurrency Control is the management procedure that is required for controlling
concurrent execution of the operations that take place on a database.

But before knowing about concurrency control, we should know about concurrent execution.

Concurrent Execution in DBMS


o In a multi-user system, multiple users can access and use the same database at one
time, which is known as concurrent execution of the database. It means that the
same database is used simultaneously on a multi-user system by different users.
o While working on database transactions, there may be a requirement for multiple users
to use the database to perform different operations, in which case concurrent
execution of the database is performed.
o The simultaneous execution should be done in an interleaved manner, and no operation
should affect the other executing operations, thus maintaining the consistency of the
database. However, with concurrent execution of transaction operations, several
challenging problems occur that need to be solved.

Problems with Concurrent Execution


In a database transaction, the two main operations are READ and WRITE. There is a need to
manage these two operations in the concurrent execution of transactions, because if these
operations are not managed properly in an interleaved execution, the data may become
inconsistent. The following problems occur with the concurrent execution of operations:

Lost Update Problems (W - W Conflict)


The problem occurs when two different database transactions perform the read/write
operations on the same database items in an interleaved manner (i.e., concurrent execution)
that makes the values of the items incorrect hence making the database inconsistent.

Dirty Read Problems (W-R Conflict)


The dirty read problem occurs when one transaction updates an item of the database, the
transaction then fails, and before the data gets rolled back, the updated database item is
accessed by another transaction. This creates a read-write conflict between the two
transactions.
Unrepeatable Read Problem (R-W Conflict)
Also known as the Inconsistent Retrievals Problem, it occurs when, within a transaction, two
different values are read for the same database item.
Thus, in order to maintain consistency in the database and avoid such problems that take place in
concurrent execution, management is needed, and that is where the concept of Concurrency Control
comes into role.

Concurrency Control
Concurrency Control is the working concept that is required for controlling and managing the concurrent
execution of database operations and thus avoiding the inconsistencies in the database. Thus, for
maintaining the concurrency of the database, we have the concurrency control protocols.

Lock management
Lock-Based Protocol
In this type of protocol, a transaction cannot read or write data until it acquires an appropriate
lock on it. There are two types of lock:
1. Shared lock:
o It is also known as a read-only lock. With a shared lock, the data item can only be read by
the transaction.
o It can be shared between transactions, because while a transaction holds a shared lock, it
can't update the data item.
2. Exclusive lock:
o With an exclusive lock, the data item can be both read and written by the transaction.
o This lock is exclusive: multiple transactions cannot modify the same data
simultaneously.

Specialized locking techniques


There are four types of lock protocols available:
1. Simplistic lock protocol
It is the simplest way of locking data during a transaction. Simplistic lock-based protocols
require all transactions to get a lock on the data before an insert, delete, or update on it. The
data item is unlocked after the transaction completes.
2. Pre-claiming Lock Protocol
o Pre-claiming lock protocols evaluate the transaction to list all the data items on which
it needs locks.
o Before initiating execution of the transaction, it requests the DBMS for locks on all
those data items.
o If all the locks are granted, then this protocol allows the transaction to begin. When the
transaction is completed, it releases all the locks.
o If all the locks are not granted, then this protocol allows the transaction to roll back and
wait until all the locks are granted.
3. Two-phase locking (2PL)
o The two-phase locking protocol divides the execution phase of the transaction into
three parts.
o In the first part, when the execution of the transaction starts, it seeks permission for the
lock it requires.
o In the second part, the transaction acquires all the locks. The third phase is started as
soon as the transaction releases its first lock.
o In the third phase, the transaction cannot demand any new locks. It only releases the
acquired locks.
o There are two phases of 2PL:
o Growing phase: In the growing phase, a new lock on the data item may be acquired by
the transaction, but none can be released.
o Shrinking phase: In the shrinking phase, existing locks held by the transaction may be
released, but no new locks can be acquired.
4. Strict Two-phase locking (Strict-2PL)
o The first phase of Strict-2PL is similar to 2PL. In the first phase, after acquiring all the
locks, the transaction continues to execute normally.
o The only difference between 2PL and Strict-2PL is that Strict-2PL does not release a lock
immediately after using it.
o Strict-2PL waits until the whole transaction commits, and then it releases all the locks
at once.
o The Strict-2PL protocol does not have a shrinking phase of lock release.
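A minimal shared/exclusive lock table illustrating the compatibility rules and the all-at-once release of Strict-2PL. This is a sketch only; a real lock manager would also queue waiting transactions rather than simply refuse:

```python
class LockTable:
    def __init__(self):
        self.locks = {}   # item -> (mode, set of holding transactions)

    def acquire(self, txn, item, mode):
        """mode is 'S' (shared) or 'X' (exclusive). Returns True if granted."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":  # shared locks are compatible
            holders.add(txn)
            return True
        return False                          # any exclusive involvement conflicts

    def release_all(self, txn):
        """Strict-2PL: all of a transaction's locks released together at commit."""
        for item in list(self.locks):
            mode, holders = self.locks[item]
            holders.discard(txn)
            if not holders:
                del self.locks[item]

lt = LockTable()
print(lt.acquire("T1", "A", "S"))  # True
print(lt.acquire("T2", "A", "S"))  # True: shared is compatible with shared
print(lt.acquire("T3", "A", "X"))  # False: exclusive conflicts with shared
lt.release_all("T1"); lt.release_all("T2")
print(lt.acquire("T3", "A", "X"))  # True: item is free now
```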
Database Security
Security of databases refers to the array of controls, tools, and procedures designed to ensure
and safeguard confidentiality, integrity, and availability. This section will concentrate on
confidentiality, because it is the component most at risk in data security breaches.

Security for databases must cover and safeguard the following aspects:
o The database containing data.
o The database management system (DBMS).
o Any applications that are associated with it.
o The physical or virtual database server, and the hardware that runs it.
o The infrastructure for computing or network that is used to connect to the database.

Security of databases is a complicated and challenging task that involves all aspects of security
practices and technologies. This is inherently at odds with the accessibility of databases: the
more usable and accessible the database is, the more susceptible it is to security threats; the
better it is protected against attacks and threats, the more difficult it is to access and
utilize.

Why Database Security is Important?


According to the definition, a data breach refers to a breach of data integrity in databases. The
amount of damage an incident like a data breach can cause our business is contingent on
various consequences or elements.
o Intellectual property that is compromised: Our intellectual property--trade secrets,
inventions, or proprietary methods -- could be vital for our ability to maintain an
advantage in our industry. If our intellectual property has been stolen or disclosed and
our competitive advantage is lost, it could be difficult to keep or recover.
o Damage to our brand's reputation: Customers or partners may not want to
purchase goods or services from us (or deal with our business) if they do not feel they
can trust our company to protect their data.
o The concept of business continuity (or lack of it): Some businesses cannot continue to
function until a breach has been resolved.
o Penalties or fines for non-compliance: The cost of not complying with international
regulations such as the Sarbanes-Oxley Act (SOX), industry-specific regulations on data
privacy such as the Payment Card Industry Data Security Standard (PCI DSS) or HIPAA,
or regional privacy laws such as the European Union's General Data Protection
Regulation (GDPR) can be severe, with fines in the worst cases exceeding several
million dollars per violation.
o Costs for repairing breaches and notifying consumers about them: Alongside notifying
customers of a breach, the company that has been breached is required to cover the
investigation and forensic services such as crisis management, triage repairs to the
affected systems, and much more.
Distributed databases
A distributed database is a collection of multiple interconnected databases, which are spread
physically across various locations that communicate via a computer network.
 Databases in the collection are logically interrelated with each other. Often they
represent a single logical database.
 Data is physically stored across multiple sites. Data in each site can be managed by a
DBMS independent of the other sites.
 The processors in the sites are connected via a network. They do not have any
multiprocessor configuration.
 A distributed database is not a loosely connected file system.
 A distributed database incorporates transaction processing, but it is not synonymous
with a transaction processing system.

Distributed Database Management System


A distributed database management system (DDBMS) is a centralized software system that
manages a distributed database in a manner as if it were all stored in a single location.
Features
 It is used to create, retrieve, update and delete distributed databases.
 It synchronizes the database periodically and provides access mechanisms by virtue of
which the distribution becomes transparent to the users.
 It ensures that the data modified at any site is universally updated.
 It is used in application areas where large volumes of data are processed and accessed
by numerous users simultaneously.
 It is designed for heterogeneous database platforms.
 It maintains confidentiality and data integrity of the databases.

A distributed database is basically a database that is not limited to one system; it is spread
over different sites, i.e., on multiple computers or over a network of computers. A distributed
database system is located on various sites that don't share physical components. This may
be required when a particular database needs to be accessed by various users globally. It
must be managed so that, to its users, it looks like one single database.

Types of Distributed Database


1. Homogeneous Database:
In a homogeneous distributed database, all sites store data identically. The operating
system, database management system, and the data structures used are all the same at all
sites. Hence, such systems are easy to manage.
2. Heterogeneous Database:
In a heterogeneous distributed database, different sites can use different schemas and
software, which can lead to problems in query processing and transactions. Also, a particular
site might be completely unaware of the other sites. Different computers may use different
operating systems and different database applications. They may even use different data models
for the database. Hence, translations are required for different sites to communicate.

Applications of Distributed Database:


 It is used in Corporate Management Information System.
 It is used in multimedia applications.
 Used in military control systems, hotel chains, etc.
 It is also used in manufacturing control system.
 Distributed database systems can be used in a variety of applications, including e-
commerce, financial services, and telecommunications.

Architecture of Distributed DBMS


There are several different architectures for distributed database systems, including:
Client-server architecture: In this architecture, clients connect to a central server, which
manages the distributed database system. The server is responsible for coordinating
transactions, managing data storage, and providing access control.
Peer-to-peer architecture: In this architecture, each site in the distributed database system is
connected to all other sites. Each site is responsible for managing its own data and
coordinating transactions with other sites.
Federated architecture: In this architecture, each site in the distributed database system
maintains its own independent database, but the databases are integrated through a
middleware layer that provides a common interface for accessing and querying the data.

Advantages of Distributed Database System :


1) Data processing is fast, as several sites participate in request processing.
2) Reliability and availability of the system are high.
3) It has reduced operating costs.
4) It is easier to expand the system by adding more sites.
5) It has improved sharing ability and local autonomy.

Disadvantages of Distributed Database System :


1) The system becomes complex to manage and control.
2) Security issues must be carefully managed.
3) The system requires deadlock handling during transaction processing; otherwise
the entire system may end up in an inconsistent state.
4) Some standardization is needed for the processing of distributed database systems.

Object-Oriented Database System


An ODBMS, which is an abbreviation for object-oriented database management system, uses
a data model in which data is stored in the form of objects, which are instances of classes.
These classes and objects together make up an object-oriented data model.

Components of Object-Oriented Data Model:


The OODBMS is based on three major components, namely: Object structure, Object classes,
and Object identity. These are explained below.

1. Object Structure:
The structure of an object refers to the properties that the object is made up of. These
properties of an object are referred to as attributes. Thus, an object is a real-world entity
with certain attributes that make up the object structure. Also, an object encapsulates
data and code into a single unit, which in turn provides data abstraction by hiding the
implementation details from the user.
The object structure is further composed of three types of components: Messages, Methods,
and Variables. These are explained below.

1. Messages –
A message provides an interface or acts as a communication medium between an object
and the outside world. A message can be of two types:
 Read-only message: If the invoked method does not change the value of a variable,
then the invoking message is said to be a read-only message.
 Update message: If the invoked method changes the value of a variable, then the
invoking message is said to be an update message.

2. Methods –
When a message is passed, the body of code that is executed is known as a method.
Whenever a method is executed, it returns a value as output. A method can be of two
types:
 Read-only method: When the value of a variable is not affected by a method, it is
known as a read-only method.
 Update method: When the value of a variable is changed by a method, it is known
as an update method.
3. Variables –
They store the data of an object. The data stored in the variables makes one object
distinguishable from another.

2. Object Classes:
An object, which is a real-world entity, is an instance of a class. Hence, we first need to define a
class, and then objects are made that differ in the values they store but share the same
class definition. The objects in turn respond to messages and hold values in their
variables.

Example –
class CLERK
{
    // Variables
    string name;
    string address;
    int id;
    int salary;

    // Messages
    string get_name();
    string get_address();
    int annual_salary();
};
In the above example, we can see that CLERK is a class that holds the object variables and
messages.

Features of ODBMS:
Object-oriented data model: ODBMS uses an object-oriented data model to store and
manage data. This allows developers to work with data in a more natural way, as objects are
similar to the objects in the programming language they are using.
Complex data types: ODBMS supports complex data types such as arrays, lists, sets, and
graphs, allowing developers to store and manage complex data structures in the database.
Automatic schema management: ODBMS automatically manages the schema of the
database, as the schema is defined by the classes and objects in the application code. This
eliminates the need for a separate schema definition language and simplifies the
development process.
High performance: ODBMS can provide high performance, especially for applications that
require complex data access patterns, as objects can be retrieved with a single query.
Data integrity: ODBMS provides strong data integrity, as the relationships between objects
are maintained by the database. This ensures that data remains consistent and correct, even
in complex applications.
Concurrency control: ODBMS provides concurrency control mechanisms that ensure that
multiple users can access and modify the same data without conflicts.
Scalability: ODBMS can scale horizontally by adding more servers to the database cluster,
allowing it to handle large volumes of data.
Support for transactions: ODBMS supports transactions, which ensure that multiple
operations on the database are atomic and consistent.

Advantages:
Supports Complex Data Structures: ODBMS is designed to handle complex data structures,
such as inheritance, polymorphism, and encapsulation. This makes it easier to work with
complex data models in an object-oriented programming environment.
Improved Performance: ODBMS provides improved performance compared to traditional
relational databases for complex data models. ODBMS can reduce the amount of mapping
and translation required between the programming language and the database, which can
improve performance.
Reduced Development Time: ODBMS can reduce development time since it eliminates the
need to map objects to tables and allows developers to work directly with objects in the
database.
Supports Rich Data Types: ODBMS supports rich data types, such as audio, video, images, and
spatial data, which can be challenging to store and retrieve in traditional relational databases.
Scalability: ODBMS can scale horizontally and vertically, which means it can handle larger
volumes of data and can support more users.

Disadvantages:
Limited Adoption: ODBMS is not as widely adopted as traditional relational databases, which
means it may be more challenging to find developers with experience working with ODBMS.
Lack of Standardization: ODBMS lacks standardization, which means that different vendors
may implement different features and functionality.
Cost: ODBMS can be more expensive than traditional relational databases since it requires
specialized software and hardware.
Integration with Other Systems: ODBMS can be challenging to integrate with other systems,
such as business intelligence tools and reporting software.
Scalability Challenges: ODBMS may face scalability challenges due to the complexity of the
data models it supports, which can make it challenging to partition data across multiple
nodes.

Concurrency Control Based on Timestamp Ordering

A key idea in database management systems, concurrency control guarantees
transaction isolation and consistency. Timestamp ordering is a concurrency control
mechanism that gives each transaction a distinct timestamp and orders the
transactions according to those timestamps.

Objectives of Timestamp Ordering

The main goal of timestamp ordering is to guarantee serializability, which means
that executing the transactions concurrently must produce the same outcome as
if they were executed serially. The main goals of timestamp ordering are –

 Transaction Ordering − Transactions must be carried out in an order consistent
with their timestamps, so that the transaction outcomes match the timestamp order.

 Conflict Resolution − If two transactions are in conflict, the timestamp ordering
mechanism must choose between terminating one of the transactions or
postponing it until the other transaction is finished.

 Deadlock Prevention − The timestamp ordering mechanism must avoid deadlocks,
which occur when several transactions wait for one another's completion.

How Timestamp Ordering Works?

The timestamp ordering algorithm works by assigning a unique timestamp to each
transaction when it arrives in the system. The timestamp reflects the transaction's
start time, and it is used to order the transactions for execution. The algorithm
consists of two phases: the validation phase and the execution phase.
 Validation Phase − During the validation phase, the timestamp ordering method
verifies each transaction's timestamp to make sure the transactions are performed
in the proper sequence. When one transaction's timestamp is lower than another's,
the earlier transaction must be carried out first.

 Execution Phase − In the execution phase, the timestamp ordering algorithm
executes the transactions in the order determined by the validation phase. If there
is a conflict between transactions, the algorithm uses a conflict resolution strategy:
one option is to abort the transaction with the lower timestamp, while another is to
delay the transaction with the lower timestamp until the other transaction completes.

Benefits of Timestamp Ordering

The benefits of timestamp ordering are as follows –

 Transaction Consistency − The timestamp ordering method ensures transaction
consistency: regardless of how the transactions are interleaved, the outcome is the
same as if they were carried out serially.

 High Concurrency − The timestamp ordering mechanism permits several
transactions to run concurrently, enabling high concurrency.

 Deadlock Prevention − Because conflicting transactions are aborted or restarted
rather than made to wait for one another, the timestamp ordering method prevents
deadlocks.
