
UNIT-V

CONCURRENCY CONTROL TECHNIQUES

5.1 Concurrency Control

Concurrency control in a database management system is the procedure of managing simultaneous operations so that they do not conflict with each other. It ensures that database transactions are performed concurrently and accurately to produce correct results without violating the data integrity of the database.

Problems with Concurrency control

 Lost Updates occur when multiple transactions select the same row and update it based on the value selected.
 Uncommitted dependency issues occur when a second transaction selects a row that has been updated by another, not yet committed transaction (dirty read).
 Non-Repeatable Read occurs when a transaction accesses the same row several times and reads different data each time.
 The Incorrect Summary issue occurs when one transaction computes a summary over the values of all the instances of a repeated data item while a second transaction updates a few instances of that data item. In that situation, the resulting summary does not reflect a correct result.

Need for concurrency control in a database system

 To apply isolation through mutual exclusion between conflicting transactions
 To resolve read-write and write-write conflict issues
 To preserve database consistency despite the interleaved execution of transactions
 The system needs to control the interaction among the concurrent transactions. This control is achieved using concurrency-control schemes.
 Concurrency control helps to ensure serializability
5.2 Locking Techniques for Concurrency Control

Different concurrency control protocols offer different trade-offs between the amount of concurrency they allow and the amount of overhead that they impose. The following are the concurrency control techniques in DBMS:

 Lock-Based Protocols
 Two Phase Locking Protocol
 Timestamp-Based Protocols
 Validation-Based Protocols

Lock-based Protocols

A lock-based protocol in DBMS is a mechanism in which a transaction cannot read or write a data item until it acquires an appropriate lock on it. Lock-based protocols help to eliminate concurrency problems among simultaneous transactions by restricting access to a locked data item to one transaction at a time.

A lock is a data variable associated with a data item. The lock signifies which operations can be performed on the data item. Locks in DBMS help synchronize access to the database items by
concurrent transactions.

All lock requests are made to the concurrency-control manager. Transactions proceed only once the
lock request is granted.

Binary Locks: A binary lock on a data item can be in one of two states: locked or unlocked.

Shared/exclusive: This type of locking mechanism separates the locks in DBMS based on their uses.
If a lock is acquired on a data item to perform a write operation, it is called an exclusive lock.

1. Shared Lock (S):

A shared lock is also called a read-only lock. With a shared lock, the data item can be shared between transactions, because a transaction holding only a shared lock never has permission to update the data item. For example, consider a case where two transactions are reading the account balance of a person. The database lets them both read by placing a shared lock. However, if another transaction wants to update that account's balance, the shared lock prevents it until the reading process is over.

2. Exclusive Lock (X):

With the Exclusive Lock, a data item can be read as well as written. This is exclusive and can't be held
concurrently on the same data item. X-lock is requested using lock-x instruction. Transactions may
unlock the data item after finishing the 'write' operation.

For example, when a transaction needs to update the account balance of a person, the system allows it by placing an X lock on that item. Therefore, when a second transaction wants to read or write the same item, the exclusive lock prevents that operation.
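The behaviour of shared and exclusive locks can be captured in a few lines of code. Below is a minimal, hypothetical Python sketch (the LockManager class and its method names are illustrative, not taken from any real DBMS) of a lock manager that grants a request only when it is compatible with every lock already held on the item.

```python
# S is compatible only with S; X conflicts with everything.
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

class LockManager:
    def __init__(self):
        self.locks = {}          # data item -> list of (transaction_id, mode)

    def request(self, txn, item, mode):
        """Grant the lock only if it is compatible with every lock already held."""
        held = self.locks.setdefault(item, [])
        if all(COMPATIBLE[(h_mode, mode)] for _, h_mode in held):
            held.append((txn, mode))
            return True           # lock granted
        return False              # caller must wait (or be rolled back)

    def release(self, txn, item):
        self.locks[item] = [(t, m) for t, m in self.locks.get(item, [])
                            if t != txn]

# Usage: two readers share item A, but a writer must wait.
lm = LockManager()
assert lm.request("T1", "A", "S")      # granted
assert lm.request("T2", "A", "S")      # granted, S is compatible with S
assert not lm.request("T3", "A", "X")  # blocked until the shared locks are released
```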

3. Simplistic Lock Protocol

This type of lock-based protocol allows a transaction to obtain a lock on every object before beginning an
operation. Transactions may unlock the data item after finishing the 'write' operation.

4. Pre-claiming Locking

The pre-claiming lock protocol evaluates the transaction's operations and creates a list of the data items required to initiate the execution process. If all the locks are granted, the transaction executes. After that, all locks are released when all of its operations are over.

Starvation

Starvation is the situation when a transaction needs to wait for an indefinite period to acquire a lock.

Following are the reasons for Starvation:

 When the waiting scheme for locked items is not properly managed
 In the case of a resource leak
 When the same transaction is repeatedly selected as a victim
Deadlock

Deadlock refers to a situation in which two or more processes are waiting for each other to release a resource, or more than two processes are waiting for resources in a circular chain.

Two Phase Locking Protocol

The Two Phase Locking Protocol, also known as the 2PL protocol, is a method of concurrency control in DBMS that ensures serializability by applying locks to the transaction's data, which blocks other transactions from accessing the same data simultaneously. The two-phase locking protocol helps to eliminate concurrency problems in DBMS.

This locking protocol divides the execution phase of a transaction into three different parts.

 In the first phase, when the transaction begins to execute, it requires permission for the locks it
needs.
 The second part is where the transaction obtains all the locks. When a transaction releases its
first lock, the third phase starts.
 In this third phase, the transaction cannot demand any new locks. Instead, it only releases the
acquired locks.

The Two-Phase Locking protocol allows each transaction to make a lock or unlock request in two
steps:

 Growing Phase: In this phase transaction may obtain locks but may not release any locks.
 Shrinking Phase: In this phase, a transaction may release locks but may not obtain any new locks.

It is true that the 2PL protocol offers serializability. However, it does not ensure that deadlocks do not
happen.
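As a small illustration of the growing and shrinking phases, the hedged Python sketch below (the class name is hypothetical) enforces the 2PL rule mechanically: once a transaction has released any lock, it may not acquire another. It is not a scheduler and does not detect deadlocks.

```python
class TwoPhaseTransaction:
    """Illustrative only: enforces the 2PL rule that no lock may be
    acquired once any lock has been released."""
    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False      # becomes True after the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot lock {item} in shrinking phase")
        self.held.add(item)         # growing phase: locks may be acquired

    def unlock(self, item):
        self.held.discard(item)
        self.shrinking = True       # shrinking phase: only releases are allowed

t = TwoPhaseTransaction("T1")
t.lock("A"); t.lock("B")            # growing phase
t.unlock("A")                       # first release starts the shrinking phase
# t.lock("C") would now raise RuntimeError, because it violates 2PL
```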
Strict Two-Phase Locking Method

Strict two-phase locking is almost similar to 2PL. The only difference is that Strict-2PL does not release a lock immediately after using it. It holds all the locks until the commit point and releases them all in one go when the process is over.

Centralized 2PL

In centralized 2PL, a single site is responsible for the lock management process. There is only one lock manager for the entire DBMS.

Primary copy 2PL

In the primary copy 2PL mechanism, lock managers are distributed to different sites, and each lock manager is responsible for managing the locks for a set of data items. When the primary copy has been updated, the change is propagated to the slave copies.

Distributed 2PL

In this kind of two-phase locking mechanism, Lock managers are distributed to all sites. They are
responsible for managing locks for data at that site. If no data is replicated, it is equivalent to primary
copy 2PL. The communication costs of distributed 2PL are considerably higher than those of primary copy 2PL.

5.3 Timestamp-based Protocols for Concurrency Control

Timestamp based Protocol in DBMS is an algorithm which uses the System Time or Logical Counter
as a timestamp to serialize the execution of concurrent transactions. The Timestamp-based protocol
ensures that conflicting read and write operations are executed in timestamp order.

The older transaction is always given priority in this method. It uses system time to determine the time
stamp of the transaction. This is the most commonly used concurrency protocol.

Lock-based protocols manage the order between conflicting transactions at the time they execute, whereas timestamp-based protocols manage conflicts as soon as an operation is created.
The timestamps of the transactions determine the serializability order. Thus, if TS(Ti) < TS(Tj ), then
the system must ensure that the produced schedule is equivalent to a serial schedule in which
transaction Ti appears before transaction Tj . To implement this scheme, we associate with each data
item Q two timestamp values:

• W-timestamp(Q) denotes the largest timestamp of any transaction that executed write(Q)
successfully.

• R-timestamp(Q) denotes the largest timestamp of any transaction that executed read(Q) successfully.

This protocol operates as follows:

1. Suppose that transaction Ti issues read(Q).

a. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that was already overwritten.
Hence, the read operation is rejected, and Ti is rolled back.

b. If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed, and R-timestamp(Q) is set to the maximum of R-timestamp(Q) and TS(Ti).

2. Suppose that transaction Ti issues write(Q).

a. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was needed previously, and the
system assumed that that value would never be produced. Hence, the system rejects the write operation
and rolls Ti back.

b. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q. Hence, the system rejects this write operation and rolls Ti back.

c. Otherwise, the system executes the write operation and sets W-timestamp(Q) to TS(Ti).

If a transaction Ti is rolled back by the concurrency-control scheme as a result of the issuance of either a read or a write operation, the system assigns it a new timestamp and restarts it.
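Rules 1 and 2 above translate almost directly into code. The following Python sketch (illustrative names; timestamps are plain integers, and a rejected operation is signalled with an exception standing in for rollback and restart) is one possible rendering of the basic timestamp-ordering checks.

```python
class Rollback(Exception):
    """Raised when a timestamp test fails and Ti must be rolled back and restarted."""

class DataItem:
    def __init__(self, value=None):
        self.value = value
        self.w_ts = 0   # largest timestamp of a transaction that wrote Q successfully
        self.r_ts = 0   # largest timestamp of a transaction that read Q successfully

def read(q, ts_ti):
    if ts_ti < q.w_ts:                         # rule 1a: Q was already overwritten
        raise Rollback("read(Q) rejected")
    q.r_ts = max(q.r_ts, ts_ti)                # rule 1b
    return q.value

def write(q, ts_ti, value):
    if ts_ti < q.r_ts:                         # rule 2a: the value was needed earlier
        raise Rollback("write(Q) rejected")
    if ts_ti < q.w_ts:                         # rule 2b: obsolete write
        raise Rollback("write(Q) rejected")
    q.value, q.w_ts = value, ts_ti             # rule 2c
```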

Example:

Suppose there are three transactions T1, T2, and T3.


T1 has entered the system at time 0010
T2 has entered the system at 0020
T3 has entered the system at 0030
Priority will be given to transaction T1, then transaction T2 and lastly Transaction T3.

Advantages:

 Schedules are serializable just like 2PL protocols


 No waiting for the transaction, which eliminates the possibility of deadlocks!

Disadvantages:

Starvation is possible if the same transaction is restarted and continually aborted.

Thomas’ Write Rule

The modification to the timestamp-ordering protocol, called Thomas’ write rule, is this: Suppose that
transaction Ti issues write(Q).

1. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was previously needed, and it
had been assumed that the value would never be produced. Hence, the system rejects the write
operation and rolls Ti back.

2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q. Hence, this write
operation can be ignored.

3. Otherwise, the system executes the write operation and sets W-timestamp(Q) to TS(Ti).
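Under Thomas' write rule, only the treatment of case 2 changes: the obsolete write is silently ignored instead of forcing a rollback. A short sketch of the modified write routine, reusing the DataItem fields and the Rollback exception from the previous sketch:

```python
def write_thomas(q, ts_ti, value):
    if ts_ti < q.r_ts:               # rule 1: the value was already needed; roll back
        raise Rollback("write(Q) rejected")
    if ts_ti < q.w_ts:               # rule 2: obsolete write, ignore it (no rollback)
        return
    q.value, q.w_ts = value, ts_ti   # rule 3
```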

5.4 Validation Based Protocol

The validation-based protocol in DBMS, also known as the optimistic concurrency control technique, is a method of concurrency control that avoids locking. In this protocol, the local copies of the transaction data are updated rather than the data itself, which results in less interference during the execution of the transaction.

The Validation based Protocol is performed in the following three phases:

1. Read Phase
2. Validation Phase
3. Write Phase

Read Phase

In the Read Phase, the data values from the database can be read by a transaction but the write
operation or updates are only applied to the local data copies, not the actual database.

Validation Phase

In Validation Phase, the data is checked to ensure that there is no violation of serializability while
applying the transaction updates to the database.

Write Phase

In the Write Phase, the updates are applied to the database if the validation is successful; otherwise, the updates are not applied, and the transaction is rolled back.

To perform the validation test, we need to know when the various phases of transactions Ti took place.
We shall, therefore, associate three different timestamps with transaction Ti:

1. Start(Ti), the time when Ti started its execution.

2. Validation(Ti), the time when Ti finished its read phase and started its validation phase.

3. Finish(Ti), the time when Ti finished its write phase.

We determine the serializability order by the timestamp-ordering technique, using the value of the
timestamp Validation(Ti).
The validation test for transaction Tj requires that, for all transactions Ti with TS(Ti) < TS(Tj), one of the following two conditions must hold:

1. Finish(Ti) < Start(Tj). Since Ti completes its execution before Tj starts, the serializability order is indeed maintained.

2. The set of data items written by Ti does not intersect with the set of data items read by Tj, and Ti completes its write phase before Tj starts its validation phase (Start(Tj) < Finish(Ti) < Validation(Tj)).
This condition ensures that the writes of Ti and Tj do not overlap. Since the writes of Ti do not affect
the read of Tj , and since Tj cannot affect the read of Ti, the serializability order is indeed maintained.

This validation scheme is called the optimistic concurrency control scheme since transactions execute
optimistically, assuming they will be able to finish execution and validate at the end.
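A hedged Python sketch of the validation test follows. It assumes each transaction object records its Start, Validation and Finish timestamps together with its read and write sets; all field and function names here are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Txn:
    start_ts: int
    validation_ts: int
    finish_ts: int = 0
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)

def validate(tj, committed):
    """True if Tj passes the validation test against every Ti with
    Validation(Ti) < Validation(Tj)."""
    for ti in committed:
        if ti.validation_ts >= tj.validation_ts:
            continue                      # only older transactions are checked
        if ti.finish_ts < tj.start_ts:
            continue                      # condition 1: Ti finished before Tj started
        if not (ti.write_set & tj.read_set) and ti.finish_ts < tj.validation_ts:
            continue                      # condition 2: write/read sets disjoint and
                                          # Ti finished writing before Tj validates
        return False                      # neither condition holds -> abort Tj
    return True

# Example: T1 finished before T2 started, so T2 validates successfully.
t1 = Txn(start_ts=1, validation_ts=2, finish_ts=3, write_set={"A"})
t2 = Txn(start_ts=4, validation_ts=5, read_set={"A"})
print(validate(t2, [t1]))   # True (condition 1 holds)
```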

5.5 Multiple Granularity

In the concurrency-control schemes described thus far, we have used each individual data item as the
unit on which synchronization is performed.

There are circumstances, however, where it would be advantageous to group several data items, and to
treat them as one individual synchronization unit. For example, if a transaction Ti needs to access the
entire database, and a locking protocol is used, then Ti must lock each item in the database. Clearly,
executing these locks is time consuming. It would be better if Ti could issue a single lock request to lock the entire database. On the other hand, if transaction Tj needs to access only a few data items, it should not be required to lock the entire database, since otherwise concurrency is lost.

Figure 5.1: Granularity Hierarchy

What is needed is a mechanism to allow the system to define multiple levels of granularity. We can
make one by allowing data items to be of various sizes and defining a hierarchy of data granularities,
where the small granularities are nested within larger ones. Such a hierarchy can be represented
graphically as a tree. A nonleaf node of the multiple-granularity tree represents the data associated with
its descendants. In the tree protocol, each node is an independent data item.

As an illustration, consider the tree of above Figure, which consists of four levels of nodes. The highest
level represents the entire database. Below it are nodes of type area; the database consists of exactly
these areas. Each area in turn has nodes of type file as its children. Each area contains exactly those
files that are its child nodes. No file is in more than one area. Finally, each file has nodes of
type record. As before, the file consists of exactly those records that are its child nodes, and no record
can be present in more than one file.

Each node in the tree can be locked individually. As we did in the two-phase locking protocol, we shall
use shared and exclusive lock modes. When a transaction locks a node, in either shared or exclusive
mode, the transaction also has implicitly locked all the descendants of that node in the same lock mode.
For example, if transaction Ti gets an explicit lock on file Fb of Figure 5.1, in exclusive mode, then it has an implicit lock in exclusive mode on all the records belonging to that file. It does not need to lock the individual records of Fb explicitly.

Suppose that transaction Tj wishes to lock record rb6 of file Fb. Since Ti has locked Fb explicitly, it
follows that rb6 is also locked (implicitly). But, when Tj issues a lock request for rb6 , rb6 is not
explicitly locked! How does the system determine whether Tj can lock rb6 ? Tj must traverse the tree
from the root to record rb6 . If any node in that path is locked in an incompatible mode, then Tj must
be delayed.
Figure 5.2: Compatibility Matrix

Suppose now that transaction Tk wishes to lock the entire database. To do so, it simply must lock the
root of the hierarchy. Note, however, that Tk should not succeed in locking the root node, since Ti is
currently holding a lock on part of the tree (specifically, on file Fb). But how does the system
determine if the root node can be locked? One possibility is for it to search the entire tree. This
solution, however, defeats the whole purpose of the multiple-granularity locking scheme. A more efficient way to gain this knowledge is to introduce a new class of lock modes, called intention lock modes. If a node is locked in an intention mode, explicit locking is being done at a lower level of the
tree (that is, at a finer granularity). Intention locks are put on all the ancestors of a node before that
node is locked explicitly. Thus, a transaction does not need to search the entire tree to determine
whether it can lock a node successfully. A transaction wishing to lock a node—say, Q—must traverse a
path in the tree from the root to Q. While traversing the tree, the transaction locks the various nodes in
an intention mode.

There is an intention mode associated with shared mode, and there is one with exclusive mode. If a
node is locked in intention-shared (IS) mode, explicit locking is being done at a lower level of the
tree, but with only shared-mode locks. Similarly, if a node is locked in intention-exclusive (IX) mode,
then explicit locking is being done at a lower level, with exclusive-mode or shared-mode locks.
Finally, if a node is locked in shared and intention-exclusive (SIX) mode, the subtree rooted by that
node is locked explicitly in shared mode, and that explicit locking is being done at a lower level with
exclusive-mode locks. The compatibility function for these lock modes is shown in Figure 5.2 above.
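Figure 5.2 itself is not reproduced here, but the compatibility function for the IS, IX, S, SIX and X modes is the standard one found in textbook treatments of multiple granularity locking. A small Python encoding of it (illustrative names) is sketched below so that the locking rules that follow can be checked programmatically.

```python
# Standard multiple-granularity compatibility matrix (True = compatible).
COMPAT = {
    "IS":  {"IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": False},
    "IX":  {"IS": True,  "IX": True,  "S": False, "SIX": False, "X": False},
    "S":   {"IS": True,  "IX": False, "S": True,  "SIX": False, "X": False},
    "SIX": {"IS": True,  "IX": False, "S": False, "SIX": False, "X": False},
    "X":   {"IS": False, "IX": False, "S": False, "SIX": False, "X": False},
}

def compatible(requested, held_modes):
    """A requested mode can be granted only if it is compatible with
    every mode currently held on the node by other transactions."""
    return all(COMPAT[requested][h] for h in held_modes)

print(compatible("IX", ["IS", "IX"]))   # True
print(compatible("X", ["IS"]))          # False
```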

The multiple-granularity locking protocol, which ensures serializability, is this:

Each transaction Ti can lock a node Q by following these rules:

1. It must observe the lock-compatibility function of Figure 5.2.


2. It must lock the root of the tree first, and can lock it in any mode.

3. It can lock a node Q in S or IS mode only if it currently has the parent of Q locked in either IX or IS
mode.

4. It can lock a node Q in X, SIX, or IX mode only if it currently has the parent of Q locked in either
IX or SIX mode.

5. It can lock a node only if it has not previously unlocked any node (that is, Ti is two phase).

6. It can unlock a node Q only if it currently has none of the children of Q locked.

Observe that the multiple-granularity protocol requires that locks be acquired in top-down (root-to-leaf) order, whereas locks must be released in bottom-up (leaf-to-root) order.
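A hedged sketch of rules 2 to 5 follows, assuming the transaction tracks the mode it already holds on the parent node and whether it has unlocked anything; rule 1 (lock compatibility with other transactions) can be checked with the compatible() helper from the previous sketch. All names are illustrative.

```python
def can_lock(node, mode, parent_mode, has_unlocked):
    """Illustrative check of rules 2-5 for a single lock request by Ti.
    `parent_mode` is the mode Ti already holds on the parent of `node`
    (None if `node` is the root); `has_unlocked` says whether Ti has
    already released any lock (rule 5, the two-phase condition)."""
    if has_unlocked:
        return False                              # rule 5: Ti must be two phase
    if parent_mode is None:
        return True                               # rule 2: the root may be locked in any mode
    if mode in ("S", "IS"):
        return parent_mode in ("IX", "IS")        # rule 3
    if mode in ("X", "SIX", "IX"):
        return parent_mode in ("IX", "SIX")       # rule 4
    return False

# For example, locking a file in IX mode while holding IX on its parent area:
print(can_lock("Fa", "IX", parent_mode="IX", has_unlocked=False))   # True
```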

As an illustration of the protocol, consider the tree of Figure 5.1 and these transactions:

• Suppose that transaction T18 reads record ra2 in file Fa. Then, T18 needs to lock the database,
area A1, and Fa in IS mode (and in that order), and finally to lock ra2 in S mode.

• Suppose that transaction T19 modifies record ra9 in file Fa. Then, T19 needs to lock the database,
area A1, and file Fa in IX mode, and finally to lock ra9 in X mode.

• Suppose that transaction T20 reads all the records in file Fa. Then, T20 needs to lock the database and
area A1 (in that order) in IS mode, and finally to lock Fa in S mode.

• Suppose that transaction T21 reads the entire database. It can do so after locking the database in S
mode.

We note that transactions T18, T20, and T21 can access the database concurrently. Transaction T19
can execute concurrently with T18, but not with either T20 or T21.

This protocol enhances concurrency and reduces lock overhead. It is particularly useful in applications
that include a mix of

• Short transactions that access only a few data items


• Long transactions that produce reports from an entire file or set of files

There is a similar locking protocol that is applicable to database systems in which data granularities are organized in the form of a directed acyclic graph. See the bibliographical notes for additional references. Deadlock is possible in the protocol that we have presented, as it is in the two-phase locking protocol. There are techniques to reduce deadlock frequency in the multiple-granularity protocol, and also to eliminate deadlock entirely. These techniques are referenced in the bibliographical notes.

5.6 Multiversion Schemes

The concurrency-control schemes discussed thus far ensure serializability by either delaying an
operation or aborting the transaction that issued the operation. For example, a read operation may be
delayed because the appropriate value has not been written yet; or it may be rejected (that is, the
issuing transaction must be aborted) because the value that it was supposed to read has already been
overwritten. These difficulties could be avoided if old copies of each data item were kept in a system.

In multiversion concurrency control schemes, each write(Q) operation creates a new version of Q. When a transaction issues a read(Q) operation, the concurrency-control manager selects one of the versions of Q to be read. The concurrency-control scheme must ensure that the version to be read is selected in a manner that ensures serializability. It is also crucial, for performance reasons, that a transaction be able to determine easily and quickly which version of the data item should be read.

Multiversion Timestamp Ordering

The most common transaction ordering technique used by multiversion schemes is timestamping. With
each transaction Ti in the system, we associate a unique static timestamp, denoted by TS(Ti). The
database system assigns this timestamp before the transaction starts execution, as described in Section 5.3.

With each data item Q, a sequence of versions <Q1, Q2,.. ., Qm> is associated. Each
version Qk contains three data fields:

• Content is the value of version Qk .


• W-timestamp(Qk ) is the timestamp of the transaction that created version Qk .

• R-timestamp(Qk ) is the largest timestamp of any transaction that successfully read version Qk .

A transaction—say, Ti—creates a new version Qk of data item Q by issuing a write(Q) operation. The
content field of the version holds the value written by Ti. The system initializes the W-timestamp and
R-timestamp to TS(Ti). It updates the R-timestamp value of Qk whenever a transaction Tj reads the
content of Qk , and R-timestamp(Qk ) < TS(Tj ).

The multiversion timestamp-ordering scheme presented next ensures serializability. The scheme operates as follows. Suppose that transaction Ti issues a read(Q) or write(Q) operation. Let Qk denote
the version of Q whose write timestamp is the largest write timestamp less than or equal to TS(Ti).

1. If transaction Ti issues a read(Q), then the value returned is the content of version Qk .

2. If transaction Ti issues write(Q), and if TS(Ti) < R-timestamp(Qk), then the system rolls back transaction Ti. On the other hand, if TS(Ti) = W-timestamp(Qk), the system overwrites the contents of Qk; otherwise it creates a new version of Q.

The justification for rule 1 is clear. A transaction reads the most recent version that comes before it in
time. The second rule forces a transaction to abort if it is “too late” in doing a write. More precisely,
if Ti attempts to write a version that some other transaction would have read, then we cannot allow that
write to succeed.
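A compact Python sketch of the two rules (illustrative names; each data item is assumed to start with an initial version whose timestamp is 0, and a rejected write is signalled with an exception standing in for rollback):

```python
class Rollback(Exception):     # redefined here so the sketch stands alone
    pass

class Version:
    def __init__(self, content, w_ts):
        self.content = content
        self.w_ts = w_ts       # timestamp of the transaction that created this version
        self.r_ts = w_ts       # largest timestamp that has read this version

def select_version(versions, ts_ti):
    """Qk: the version whose write timestamp is the largest one <= TS(Ti)."""
    return max((v for v in versions if v.w_ts <= ts_ti), key=lambda v: v.w_ts)

def mv_read(versions, ts_ti):
    qk = select_version(versions, ts_ti)
    qk.r_ts = max(qk.r_ts, ts_ti)
    return qk.content                            # rule 1: a read never fails or waits

def mv_write(versions, ts_ti, value):
    qk = select_version(versions, ts_ti)
    if ts_ti < qk.r_ts:                          # rule 2: a younger transaction read Qk
        raise Rollback("write(Q) rejected")
    if ts_ti == qk.w_ts:
        qk.content = value                       # Ti overwrites its own version
    else:
        versions.append(Version(value, ts_ti))   # otherwise create a new version
```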

Versions that are no longer needed are removed according to the following rule.

Suppose that there are two versions, Qk and Qj , of a data item, and that both versions have a W-
timestamp less than the timestamp of the oldest transaction in the system.

Then, the older of the two versions Qk and Qj will not be used again, and can be deleted.

The multiversion timestamp-ordering scheme has the desirable property that a read request never fails
and is never made to wait. In typical database systems, where reading is a more frequent operation than
is writing, this advantage may be of major practical significance.
The scheme, however, suffers from two undesirable properties. First, the reading of a data item also
requires the updating of the R-timestamp field, resulting in two potential disk accesses, rather than one.
Second, the conflicts between transactions are resolved through rollbacks, rather than through waits.
This alternative may be expensive. The multiversion two-phase locking scheme described next alleviates this problem.

This multiversion timestamp-ordering scheme does not ensure recoverability and cascadelessness. It can be extended in the same manner as the basic timestamp-ordering scheme, to make it recoverable and cascadeless.

Multiversion Two-Phase Locking

The multiversion two-phase locking protocol attempts to combine the advantages of multiversion
concurrency control with the advantages of two-phase locking. This protocol differentiates
between read-only transactions and update transactions.

Update transactions perform rigorous two-phase locking; that is, they hold all locks up to the end of the
transaction. Thus, they can be serialized according to their commit order. Each version of a data item
has a single timestamp. The timestamp in this case is not a real clock-based timestamp, but rather is a
counter, which we will call the ts-counter, that is incremented during commit processing.

Read-only transactions are assigned a timestamp by reading the current value of ts-counter before they start execution; they follow the multiversion timestamp-ordering protocol for performing reads. Thus, when a read-only transaction Ti issues a read(Q), the value returned is the contents of the version whose timestamp is the largest timestamp less than or equal to TS(Ti).

When an update transaction reads an item, it gets a shared lock on the item, and reads the latest version
of that item. When an update transaction wants to write an item, it first gets an exclusive lock on the
item, and then creates a new version of the data item. The write is performed on the new version, and
the timestamp of the new version is initially set to a value ∞, a value greater than that of any possible
timestamp.

When the update transaction Ti completes its actions, it carries out commit processing: First, Ti sets the
timestamp on every version it has created to 1 more than the value of ts-counter; then, Ti increments ts-
counter by 1. Only one update transaction is allowed to perform commit processing at a time.
As a result, read-only transactions that start after Ti increments ts-counter will see the values updated
by Ti, whereas those that start before Ti increments ts-counter will see the value before the updates
by Ti. In either case, read-only transactions never need to wait for locks. Multiversion two-phase
locking also ensures that schedules are recoverable and cascadeless.
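A minimal sketch of the commit processing described above (hypothetical names and data layout). Update transactions stamp every version they created with ts_counter + 1 and then increment the counter; read-only transactions simply snapshot the counter when they start.

```python
import threading

ts_counter = 0
commit_lock = threading.Lock()       # only one update transaction commits at a time
INFINITY = float("inf")

def begin_readonly():
    """Read-only transactions snapshot the counter; they then read, for each item,
    the version with the largest timestamp <= this snapshot."""
    return ts_counter

def create_version(versions, value):
    v = {"value": value, "ts": INFINITY}   # greater than any timestamp until commit
    versions.append(v)
    return v

def commit_update(created_versions):
    """Commit processing: stamp the new versions with ts_counter + 1, then bump it."""
    global ts_counter
    with commit_lock:
        for v in created_versions:
            v["ts"] = ts_counter + 1
        ts_counter += 1
```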

Versions are deleted in a manner like that of multiversion timestamp ordering.

Suppose there are two versions, Qk and Qj , of a data item, and that both versions have a timestamp
less than the timestamp of the oldest read-only transaction in the system. Then, the older of the two
versions Qk and Qj will not be used again and can be deleted.

Multiversion two-phase locking or variations of it are used in some commercial database systems.

5.7 Recovery with Concurrent Transaction

Recovery with concurrent transactions involves the following four aspects.
1. Interaction with concurrency control
2. Transaction rollback
3. Checkpoints
4. Restart recovery

Interaction with concurrency control:


In this scheme, recovery depends greatly on the concurrency-control scheme that is used. To roll back a failed transaction, we must undo the updates performed by that transaction.

Transaction rollback:
 In this scheme, we rollback a failed transaction by using the log.
 The system scans the log backward for the failed transaction; for every log record found, the system restores the data item to its old value.
Checkpoints:
 Checkpointing is the process of saving a snapshot of the application's state so that it can restart from that point in case of failure.
 A checkpoint is a point of time at which a record is written onto the database from the buffers.
 A checkpoint shortens the recovery process.
 When a checkpoint is reached, the transactions up to that point are written to the database, and the log records up to that point can be removed from the log file. The log file is then updated with new transactions until the next checkpoint, and so on.
 The checkpoint is used to declare the point before which the DBMS was in a consistent state and all the transactions were committed.

Restart recovery:

 When the system recovers from a crash, it constructs two lists.


 The undo-list consists of transactions to be undone, and the redo-list consists of transactions to be redone.
 The system constructs the two lists as follows: Initially, they are both empty. The system scans
the log backward, examining each record, until it finds the first <checkpoint> record.
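A hedged Python sketch of how the undo and redo lists might be built while scanning the log backward. The log record format used here, tuples such as ("start", "T1") and ("checkpoint", [active transactions]), is purely illustrative.

```python
def build_lists(log):
    """Scan the log backward until the last <checkpoint L> record and classify
    transactions into redo (committed) and undo (incomplete)."""
    redo, undo = set(), set()
    for record in reversed(log):
        kind = record[0]
        if kind == "commit":
            redo.add(record[1])
        elif kind == "start":
            if record[1] not in redo:
                undo.add(record[1])
        elif kind == "checkpoint":
            for t in record[1]:                   # transactions active at the checkpoint
                (redo if t in redo else undo).add(t)
            break
    return undo, redo

# Example log, oldest record first:
log = [("start", "T1"), ("checkpoint", ["T1"]), ("start", "T2"),
       ("commit", "T1"), ("start", "T3")]
print(build_lists(log))   # T2 and T3 must be undone, T1 redone
```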

5.8 Case Study of Oracle

Operational Data Analytics

Insurance firms succeed through their ability to identify and quantify risks
facing their clients. They are under constant and increasing pressure to rapidly consider every available
quantifiable factor to develop profiles of insurance risk. To this end, insurers collect a vast amount of
operational data about policy holders and insured objects. While extremely valuable, this operational
data must often wait to be coaxed into a traditional data warehouse format, for even later assessment by
an analyst. This case study discusses how Mobiliar restructured their multi-database data warehouse
environment to streamline risk analysis using operational data.
The ultimate goal for organizations, like Mobiliar, that want to streamline their analytics is being able
to analyze their data in real time—without having a negative impact on OLTP [online transaction
processing] performance and without having to wait for the classic ETL [extract, transform, and load]
process to load transaction data into the data warehouse.
Astonishing Proof of Concept (PoC) Results

Oracle Database In-Memory has a unique dual format (rows and columns) that maintains the
transactional data in both row and columnar format in memory, enabling real-time analytics to be
performed immediately across all transactions, thereby eliminating delays and reliance on transforming
transactions into a data mart, data warehouse or other analytic store for examination. To prove that
Oracle Database In-Memory could truly allow Mobiliar to use their operational data for real-time
analytics, the database team at Mobiliar set up a proof of concept to test different analytical scenarios.
Key to the proof of concept for Mobiliar was choosing scenarios that represented typical business cases
for the insurance company.

About Oracle Database In-Memory

Oracle Database In-Memory transparently accelerates analytic
queries by orders of magnitude, enabling real-time business decisions. It dramatically accelerates data
warehouses and mixed workload OLTP environments. The unique "dual-format" approach
automatically maintains data in both the existing Oracle row format for OLTP operations, and in a new
purely in-memory column format optimized for analytical processing. Both formats are simultaneously
active and transactionally consistent. Embedding the column store into Oracle Database ensures it is
fully compatible with ALL existing features, and requires absolutely no changes in the application
layer.
Important Questions

Q.1: a) Describe major problems associated with concurrent processing with examples.
b) What is the role of locks in avoiding these problems?
c) What is the phantom phenomenon?
Q.2: What do you mean by multiple granularity? Explain in detail.
Q.3: Define deadlock. Explain deadlock recovery and prevention techniques.
Q.4: Explain multiversion concurrency control in detail.
Q.5: Explain the working of various timestamping protocols for concurrency control.
Q.6: Explain the difference between two phase commit protocol and three phase commit protocol.
Q.7: What is meant by the concurrent execution of database transactions in a multiuser system?
Discuss why concurrency control is needed, and give informal examples.
Q.8: What are the problems encountered in distributed DBMS while considering concurrency control
and recovery?

Q 9: Distinguish between data replication and data fragmentation.

Q 10: Explain in detail Validation Based Protocol.


MULTIPLE CHOICE QUESTIONS

1.If a transaction has obtained a __________ lock, it can read but cannot write on the item
a) Shared mode
b) Exclusive mode
c) Read only mode
d) Write only mode
2.If a transaction has obtained a ________ lock, it can both read and write on the item
a) Shared mode
b) Exclusive mode
c) Read only mode
d) Write only mode
3.A transaction can proceed only after the concurrency control manager ________ the lock to the
transaction
a) Grants
b) Requests
c) Allocates
d) None of the mentioned
4.If a transaction can be granted a lock on an item immediately in spite of the presence of another
mode, then the two modes are said to be ________
a) Concurrent
b) Equivalent
c) Compatible
d) Executable
5.A transaction is made to wait until all ________ locks held on the item are released
a) Compatible
b) Incompatible
c) Concurrent
d) Equivalent
6.State true or false: It is not necessarily desirable for a transaction to unlock a data item immediately
after its final access
a) True
b) False
7.The situation where no transaction can proceed with normal execution is known as ________
a) Road block
b) Deadlock
c) Execution halt
d) Abortion
8.The protocol that indicates when a transaction may lock and unlock each of the data items is called as
__________
a) Locking protocol
b) Unlocking protocol
c) Granting protocol
d) Conflict protocol
9.If a transaction Ti may never make progress, then the transaction is said to be ____________
a) Deadlocked
b) Starved
c) Committed
d) Rolled back
10.The two phase locking protocol consists which of the following phases?
a) Growing phase
b) Shrinking phase
c) More than one of the mentioned
d) None of the mentioned
11.If a transaction may obtain locks but may not release any locks then it is in _______ phase
a) Growing phase
b) Shrinking phase
c) Deadlock phase
d) Starved phase
12.If a transaction may release locks but may not obtain any locks, it is said to be in ______ phase
a) Growing phase
b) Shrinking phase
c) Deadlock phase
d) Starved phase
13.Which of the following cannot be used to implement a timestamp
a) System clock
b) Logical counter
c) External time counter
d) None of the mentioned
14.A logical counter is _________ after a new timestamp has been assigned
a) Incremented
b) Decremented
c) Doubled
d) Remains the same
15.W-timestamp(Q) denotes?
a) The largest timestamp of any transaction that can execute write(Q) successfully
b) The largest timestamp of any transaction that can execute read(Q) successfully
c) The smallest timestamp of any transaction that can execute write(Q) successfully
d) The smallest timestamp of any transaction that can execute read(Q) successfully
16.R-timestamp(Q) denotes?
a) The largest timestamp of any transaction that can execute write(Q) successfully
b) The largest timestamp of any transaction that can execute read(Q) successfully
c) The smallest timestamp of any transaction that can execute write(Q) successfully
d) The smallest timestamp of any transaction that can execute read(Q) successfully
17.A ________ ensures that any conflicting read and write operations are executed in timestamp order
a) Organizational protocol
b) Timestamp ordering protocol
c) Timestamp execution protocol
d) 802-11 protocol
18.The default timestamp ordering protocol generates schedules that are
a) Recoverable
b) Non-recoverable
c) Starving
d) None of the mentioned
19.Which of the following timestamp based protocols generates serializable schedules?
a) Thomas write rule
b) Timestamp ordering protocol
c) Validation protocol
d) None of the mentioned
20.State true or false: The Thomas write rule has a greater potential concurrency than the timestamp
ordering protocol
a) True
b) False
21.In timestamp ordering protocol, suppose that the transaction Ti issues read(Q) and TS(Ti)<W-
timestamp(Q), then
a) Read operation is executed
b) Read operation is rejected
c) Write operation is executed
d) Write operation is rejected
22.In timestamp ordering protocol, suppose that the transaction Ti issues write(Q) and TS(Ti)<W-
timestamp(Q), then
a) Read operation is executed
b) Read operation is rejected
c) Write operation is executed
d) Write operation is rejected
23.The _________ requires each transaction executes in two or three different phases in its lifetime
a) Validation protocol
b) Timestamp protocol
c) Deadlock protocol
d) View protocol
24.During __________ phase, the system reads data and stores them in variables local to the
transaction.
a) Read phase
b) Validation phase
c) Write phase
d) None of the mentioned
25.During the _________ phase the validation test is applied to the transaction
a) Read phase
b) Validation phase
c) Write phase
d) None of the mentioned
26.During the _______ phase, the local variables that hold the write operations are copied to the
database
a) Read phase
b) Validation phase
c) Write phase
d) None of the mentioned
27.Read only operations omit the _______ phase
a) Read phase
b) Validation phase
c) Write phase
d) None of the mentioned
28.Which of the following timestamp is used to record the time at which the transaction started
execution?
a) Start(i)
b) Validation(i)
c) Finish(i)
d) Write(i)
29.Which of the following timestamps is used to record the time when a transaction has finished its
read phase?
a) Start(i)
b) Validation(i)
c) Finish(i)
d) Write(i)
30.Which of the following timestamps is used to record the time when a transaction has finished its write phase?
a) Start(i)
b) Validation(i)
c) Finish(i)
d) Write(i)
31.State true or false: Locking and timestamp ordering force a wait or rollback whenever a conflict is
detected.
a) True
b) False
32.State true or false: We determine the serializability order of validation protocol by the validation
ordering technique
a) True
b) False
33.In a granularity hierarchy the highest level represents the
a) Entire database
b) Area
c) File
d) Record
34.If a node is locked in an intention mode, explicit locking is done at a lower level of the tree. This is
called
a) Intention lock modes
b) Explicit lock
c) Implicit lock
d) Exclusive lock
35.If a node is locked in ____________ then explicit locking is being done at a lower level, with
exclusive-mode or shared-mode locks.
a) Intention lock modes
b) Intention-shared-exclusive mode
c) Intention-exclusive (IX) mode
d) Intention-shared (IS) mode
36.This validation scheme is called the _________ scheme since transactions execute optimistically,
assuming they will be able to finish execution and validate at the end.
a) Validation protocol
b) Validation-based protocol
c) Timestamp protocol
d) Optimistic concurrency-control
37.The file organization which allows us to read records that would satisfy the join condition by using
one block read is
a) Heap file organization
b) Sequential file organization
c) Clustering file organization
d) Hash files organization
38.DBMS periodically suspends all processing and synchronizes its files and journals through the use
of
a) Checkpoint facility
b) Backup facility
c) Recovery manager
d) Database change log
39.The extent of the database resource that is included with each lock is called the level of
a) Impact
b) Granularity
c) Management
d) DBMS control
40.A condition that occurs when two transactions wait for each other to unlock data is known as a(n)
a) Shared lock
b) Exclusive lock
c) Binary lock
d) Deadlock
Answer Key

1 A 11 A 21 B 31 A
2 B 12 B 22 D 32 B
3 A 13 C 23 A 33 A
4 C 14 A 24 A 34 A
5 A 15 A 25 B 35 C
6 A 16 B 26 C 36 D
7 B 17 B 27 C 37 C
8 A 18 B 28 A 38 A
9 B 19 A 29 B 39 B
10 C 20 A 30 C 40 D

LAST YEAR AKTU QUESTION PAPER SOLUTION


Section-A
1. Attempt all questions in brief

a) What is Relational Algebra?

Relational algebra is a procedural query language, which takes instances of relations as input and
yields instances of relations as output. It uses operators to perform queries. An operator can be
either unary or binary. They accept relations as their input and yield relations as their output.
Relational algebra is performed recursively on a relation and intermediate results are also considered
relations.

The fundamental operations of relational algebra are as follows −

 Select

 Project

 Union

 Set difference

 Cartesian product

 Rename

b) Explain normalisation. What is normal form?

Normalization

o Normalization is the process of organizing the data in the database.

o Normalization is used to minimize the redundancy from a relation or set of relations. It is also
used to eliminate the undesirable characteristics like Insertion, Update and Deletion Anomalies.
o Normalization divides the larger table into the smaller table and links them using relationship.

o The normal form is used to reduce redundancy from the database table.

Types of Normal Forms


The following are the main types of normal forms:

1NF: A relation is in 1NF if it contains only atomic values.
2NF: A relation is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the primary key.
3NF: A relation is in 3NF if it is in 2NF and no transitive dependency exists.
4NF: A relation is in 4NF if it is in Boyce-Codd normal form and has no multi-valued dependency.
5NF: A relation is in 5NF if it is in 4NF, it does not contain any join dependency, and joining should be lossless.

c) What do you mean by aggregation?

In aggregation, the relation between two entities is treated as a single entity. In aggregation,
relationship with its corresponding entities is aggregated into a higher level entity.

For example: the Center entity and the Course entity it offers act together as a single entity in a relationship with another entity, Visitor. In the real world, if a visitor visits a coaching center, he will never enquire about the course only or just about the center; instead, he will enquire about both.
d)Define super key, candidate key, primary key and foreign key.

Keys
o Keys play an important role in the relational database.

o It is used to uniquely identify any record or row of data from the table. It is also used to
establish and identify relationships between tables.

For example: In Student table, ID is used as a key because it is unique for each student. In PERSON
table, passport_number, license_number, SSN are keys since they are unique for each person.

Types of key:
1. Primary key
o It is the first key which is used to identify one and only one instance of an entity uniquely. An
entity can contain multiple keys as we saw in PERSON table. The key which is most suitable
from those lists become a primary key.
o In the EMPLOYEE table, ID can be primary key since it is unique for each employee. In the
EMPLOYEE table, we can even select License_Number and Passport_Number as primary key
since they are also unique.
o For each entity, selection of the primary key is based on requirement and developers.

2. Candidate key
o A candidate key is an attribute or a set of attributes which can uniquely identify a tuple.
o The remaining attributes except for primary key are considered as a candidate key. The
candidate keys are as strong as the primary key.

For example: In the EMPLOYEE table, id is best suited for the primary key. Rest of the attributes like
SSN, Passport_Number, and License_Number, etc. are considered as a candidate key.

3. Super Key

A super key is a set of attributes which can uniquely identify a tuple. A super key is a superset of a candidate key.

For example: In the above EMPLOYEE table, for (EMPLOYEE_ID, EMPLOYEE_NAME), the names of two employees can be the same, but their EMPLOYEE_ID can't be the same. Hence, this combination can also be a key.

The super keys would be EMPLOYEE_ID, (EMPLOYEE_ID, EMPLOYEE_NAME), etc.

4. Foreign key
o Foreign keys are columns of a table that are used to point to the primary key of another table.
o In a company, every employee works in a specific department, and employee and department
are two different entities. So we can't store the information of the department in the employee
table. That's why we link these two tables through the primary key of one table.
o We add the primary key of the DEPARTMENT table, Department_Id as a new attribute in the
EMPLOYEE table.
o Now in the EMPLOYEE table, Department_Id is the foreign key, and both the tables are
related.

a) What is strong and weak entity set?

Strong Entity
A strong entity is not dependent on any other entity in the schema. A strong entity will always
have a primary key. Strong entities are represented by a single rectangle. The relationship of two
strong entities is represented by a single diamond.
Various strong entities, when combined together, create a strong entity set.

Weak Entity
A weak entity is dependent on a strong entity to ensure its existence. Unlike a strong entity, a
weak entity does not have any primary key. It instead has a partial discriminator key. A weak
entity is represented by a double rectangle.
The relation between one strong and one weak entity is represented by a double diamond.
Difference between Strong and Weak Entity:

1. A strong entity always has a primary key, while a weak entity has only a partial (discriminator) key.
2. A strong entity is not dependent on any other entity, while a weak entity depends on a strong entity.
3. A strong entity is represented by a single rectangle, while a weak entity is represented by a double rectangle.
4. The relationship between two strong entities is represented by a single diamond, while the relation between one strong and one weak entity is represented by a double diamond.
5. A strong entity may have either total or partial participation, while a weak entity always has total participation.
b) What do you mean by conflict serializable schedule?

Conflict Serializable Schedule


o A schedule is called conflict serializable if, after swapping of non-conflicting operations, it can be transformed into a serial schedule.
o The schedule will be a conflict serializable if it is conflict equivalent to a serial schedule.

Conflicting Operations

Two operations are said to be conflicting if all of the following conditions are satisfied:


1. Both belong to separate transactions.
2. They operate on the same data item.
3. They contain at least one write operation.

Example:

Swapping two operations is possible only if the resulting schedules S1 and S2 are logically equal. If S1 = S2, the swapped operations are non-conflicting; if S1 ≠ S2, the operations are conflicting.

Conflict Equivalent
In the conflict equivalent, one can be transformed to another by swapping non-conflicting operations.
In the given example, S2 is conflict equivalent to S1 (S1 can be converted to S2 by swapping non-
conflicting operations).

Two schedules are said to be conflict equivalent if and only if:

1. They contain the same set of transactions.

2. Each pair of conflicting operations is ordered in the same way in both schedules.

Example:

Schedule S2 is a serial schedule because, in this, all operations of T1 are performed before starting any
operation of T2. Schedule S1 can be transformed into a serial schedule by swapping non-conflicting operations
of S1.

After swapping of non-conflict operations, the schedule S1 becomes:

T1                T2
Read(A)
Write(A)
Read(B)
Write(B)
                  Read(A)
                  Write(A)
                  Read(B)
                  Write(B)

Hence, schedule S1 is conflict serializable.
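The swapping argument above can also be checked mechanically by building a precedence graph and testing it for cycles. The following Python sketch (the schedule format, a list of (transaction, operation, item) tuples, is illustrative) confirms that the schedule shown above is conflict serializable.

```python
def conflict_serializable(schedule):
    """schedule: list of (txn, op, item) with op in {'R', 'W'}.
    Add an edge Ti -> Tj for every pair of conflicting operations where
    Ti's operation comes first, then check the graph for a cycle."""
    edges = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            if ti != tj and x == y and "W" in (op_i, op_j):
                edges.add((ti, tj))

    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)

    def has_cycle(node, visiting, done):
        visiting.add(node)
        for nxt in graph.get(node, ()):
            if nxt in visiting or (nxt not in done and has_cycle(nxt, visiting, done)):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return not any(has_cycle(n, set(), set()) for n in graph)

# The swapped schedule S1 above: all of T1 before T2, hence no cycle.
s1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T1", "R", "B"), ("T1", "W", "B"),
      ("T2", "R", "A"), ("T2", "W", "A"), ("T2", "R", "B"), ("T2", "W", "B")]
print(conflict_serializable(s1))   # True
```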

c) Define concurrency control.

Concurrency Control
o In concurrency control, multiple transactions can be executed simultaneously.

o Uncontrolled simultaneous execution may affect the transaction results, so it is highly important to maintain the order of execution of those transactions.

Problems of concurrency control

Several problems can occur when concurrent transactions are executed in an uncontrolled manner.
Following are the three problems in concurrency control.

1. Lost updates
2. Dirty read
3. Unrepeatable read

1. Lost update problem


o When two transactions that access the same database items contain their operations in a way
that makes the value of some database item incorrect, then the lost update problem occurs.
o If two transactions T1 and T2 read a record and then update it, then the effect of updating of the
first record will be overwritten by the second update.

Example:
Here,

o At time t2, transaction-X reads A's value.

o At time t3, Transaction-Y reads A's value.

o At time t4, Transactions-X writes A's value on the basis of the value seen at time t2.

o At time t5, Transactions-Y writes A's value on the basis of the value seen at time t3.

o So at time t5, the update made by Transaction-X is lost because Transaction-Y overwrites it without looking at its current value.
o Such type of problem is known as Lost Update Problem as update made by one transaction is
lost here.

2. Dirty Read
o The dirty read occurs in the case when one transaction updates an item of the database, and then
the transaction fails for some reason. The updated database item is accessed by another
transaction before it is changed back to the original value.
o A transaction T1 updates a record which is read by T2. If T1 aborts then T2 now has values
which have never formed part of the stable database.

Example:
o At time t2, transaction-Y writes A's value.

o At time t3, Transaction-X reads A's value.

o At time t4, Transactions-Y rollbacks. So, it changes A's value back to that of prior to t1.

o So, Transaction-X now contains a value which has never become part of the stable database.

o Such type of problem is known as Dirty Read Problem, as one transaction reads a dirty value
which has not been committed.

3. Inconsistent Retrievals Problem


o Inconsistent Retrievals Problem is also known as unrepeatable read. When a transaction
calculates some summary function over a set of data while the other transactions are updating
the data, then the Inconsistent Retrievals Problem occurs.
o A transaction T1 reads a record and then does some other processing during which the
transaction T2 updates the record. Now when the transaction T1 reads the record, then the new
value will be inconsistent with the previous value.

Example:

Suppose two transactions operate on three accounts.


o Transaction-X is doing the sum of all balance while transaction-Y is transferring an amount 50
from Account-1 to Account-3.
o Here, transaction-X produces the result of 550 which is incorrect. If we write this produced
result in the database, the database will become an inconsistent state because the actual sum is
600.
o Here, transaction-X has seen an inconsistent state of the database.

Concurrency Control Protocol

Concurrency control protocols ensure atomicity, isolation, and serializability of concurrent


transactions. The concurrency control protocol can be divided into three categories:

1. Lock based protocol


2. Time-stamp protocol
3. Validation based protocol
Section-B

2. Attempt any three of the following:

a) Explain data independence and its types.

If a database system is not multi-layered, then it becomes difficult to make any changes in the
database system. Database systems are designed in multi-layers as we learnt earlier.

Data Independence

A database system normally contains a lot of data in addition to users’ data. For example, it
stores data about data, known as metadata, to locate and retrieve data easily. It is rather
difficult to modify or update a set of metadata once it is stored in the database. But as a DBMS
expands, it needs to change over time to satisfy the requirements of the users. If the entire data
is dependent, it would become a tedious and highly complex job.

Metadata itself follows a layered architecture, so that when we change data at one layer, it does not
affect the data at another level. This data is independent but mapped to each other.

Logical Data Independence


Logical data is data about the database, that is, it stores information about how data is managed
inside. For example, a table (relation) stored in the database and all its constraints, applied on
that relation. Logical data independence is a kind of mechanism, which liberalizes itself from
actual data stored on the disk. If we do some changes on table format, it should not change the
data residing on the disk.

Physical Data Independence

All the schemas are logical, and the actual data is stored in bit format on the disk. Physical data
independence is the power to change the physical data without impacting the schema or logical
data. For example, in case we want to change or upgrade the storage system itself − suppose
we want to replace hard-disks with SSD − it should not have any impact on the logical data or
schemas.

b) Describe mapping constraint with its types.

Mapping Constraints
o A mapping constraint is a data constraint that expresses the number of entities to which another
entity can be related via a relationship set.
o It is most useful in describing the relationship sets that involve more than two entity sets.

o For binary relationship set R on an entity set A and B, there are four possible mapping
cardinalities. These are as follows:
1. One to one (1:1)
2. One to many (1:M)
3. Many to one (M:1)
4. Many to many (M:M)

One-to-one

In one-to-one mapping, an entity in E1 is associated with at most one entity in E2, and an entity in E2
is associated with at most one entity in E1.
One-to-many

In one-to-many mapping, an entity in E1 is associated with any number of entities in E2, and an entity
in E2 is associated with at most one entity in E1.

Many-to-one

In many-to-one mapping, an entity in E1 is associated with at most one entity in E2, and an entity in E2
is associated with any number of entities in E1.

Many-to-many

In many-to-many mapping, an entity in E1 is associated with any number of entities in E2, and an
entity in E2 is associated with any number of entities in E1.
c) Define keys. Explain various types of keys

Keys
o Keys play an important role in the relational database.

o It is used to uniquely identify any record or row of data from the table. It is also used to
establish and identify relationships between tables.

For example: In Student table, ID is used as a key because it is unique for each student. In PERSON
table, passport_number, license_number, SSN are keys since they are unique for each person.

Types of key:
1. Primary key
o It is the first key which is used to identify one and only one instance of an entity uniquely. An
entity can contain multiple keys as we saw in PERSON table. The key which is most suitable
from those lists become a primary key.
o In the EMPLOYEE table, ID can be primary key since it is unique for each employee. In the
EMPLOYEE table, we can even select License_Number and Passport_Number as primary key
since they are also unique.
o For each entity, selection of the primary key is based on requirement and developers.

2. Candidate key
o A candidate key is an attribute or a set of attributes which can uniquely identify a tuple.

o The remaining attributes except for primary key are considered as a candidate key. The
candidate keys are as strong as the primary key.

For example: In the EMPLOYEE table, id is best suited for the primary key. Rest of the attributes like
SSN, Passport_Number, and License_Number, etc. are considered as a candidate key.
3. Super Key

A super key is a set of attributes which can uniquely identify a tuple. A super key is a superset of a
candidate key.

For example: In the above EMPLOYEE table, for (EMPLOYEE_ID, EMPLOYEE_NAME) the names of
two employees can be the same, but their EMPLOYEE_ID can't be the same. Hence, this combination
can also be a key.

The super keys would be EMPLOYEE_ID, (EMPLOYEE_ID, EMPLOYEE_NAME), etc.

4. Foreign key
o Foreign keys are the column of the table which is used to point to the primary key of another
table.
o In a company, every employee works in a specific department, and employee and department
are two different entities. So we can't store the information of the department in the employee
table. That's why we link these two tables through the primary key of one table.
o We add the primary key of the DEPARTMENT table, Department_Id as a new attribute in the
EMPLOYEE table.
o Now in the EMPLOYEE table, Department_Id is the foreign key, and both the tables are
related.
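A minimal sketch tying the key types together, assuming hypothetical column definitions for the EMPLOYEE and DEPARTMENT tables described above:

CREATE TABLE DEPARTMENT (Department_Id NUMBER PRIMARY KEY, Dept_Name VARCHAR2(30));

CREATE TABLE EMPLOYEE (
    ID NUMBER PRIMARY KEY,                                          -- primary key
    Passport_Number VARCHAR2(20) UNIQUE,                            -- candidate key
    License_Number VARCHAR2(20) UNIQUE,                             -- candidate key
    Name VARCHAR2(30),
    Department_Id NUMBER REFERENCES DEPARTMENT(Department_Id)       -- foreign key
);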
d) Explain the phantom phenomena. Discuss the timestamp protocol that avoids the
phantom phenomena.

The so-called phantom problem occurs within a transaction when the same query produces different
sets of rows at different times. For example, if a SELECT is executed twice, but returns a row the
second time that was not returned the first time, the row is a “phantom” row.
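A hedged sketch of the phenomenon, assuming a hypothetical EMPLOYEE(Name, Salary) table and two concurrent transactions:

-- T1: first read
SELECT COUNT(*) FROM EMPLOYEE WHERE Salary > 5000;   -- suppose this returns 3

-- T2: a concurrent transaction inserts a row satisfying T1's predicate and commits
INSERT INTO EMPLOYEE (Name, Salary) VALUES ('Kriti', 7500);
COMMIT;

-- T1: repeats the same query inside the same transaction
SELECT COUNT(*) FROM EMPLOYEE WHERE Salary > 5000;   -- now returns 4; the new row is a phantom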

Timestamp Ordering Protocol


o The Timestamp Ordering Protocol is used to order the transactions based on their Timestamps.
The order of transaction is nothing but the ascending order of the transaction creation.
o The older transaction has higher priority, which is why it executes first. To determine the
timestamp of the transaction, this protocol uses system time or a logical counter.
o The lock-based protocol is used to manage the order between conflicting pairs among
transactions at the execution time. But Timestamp based protocols start working as soon as a
transaction is created.
o Let's assume there are two transactions T1 and T2. Suppose transaction T1 entered the
system at time 007 and transaction T2 entered the system at time 009. T1 has the higher
priority, so it executes first, as it entered the system first.
o The timestamp ordering protocol also maintains the timestamp of last 'read' and 'write'
operation on a data.

Basic Timestamp ordering protocol works as follows:


1. Check the following conditions whenever a transaction Ti issues a Read(X) operation:

o If W_TS(X) > TS(Ti), then the operation is rejected and Ti is rolled back.

o If W_TS(X) <= TS(Ti), then the operation is executed and R_TS(X) is set to the maximum of R_TS(X) and TS(Ti).

2. Check the following conditions whenever a transaction Ti issues a Write(X) operation:

o If TS(Ti) < R_TS(X), then the operation is rejected and Ti is rolled back.

o If TS(Ti) < W_TS(X), then the operation is rejected and Ti is rolled back; otherwise the
operation is executed and W_TS(X) is set to TS(Ti).

Where,

TS(Ti) denotes the timestamp of the transaction Ti.

R_TS(X) denotes the Read time-stamp of data-item X.

W_TS(X) denotes the Write time-stamp of data-item X.
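For example, suppose TS(T1) = 10 and TS(T2) = 20, and data item X initially has R_TS(X) = W_TS(X) = 0. If T2 executes Read(X), the read is allowed (W_TS(X) = 0 <= 20) and R_TS(X) becomes 20. If T1 then issues Write(X), we have TS(T1) = 10 < R_TS(X) = 20, so the write is rejected and T1 is rolled back, to be restarted later with a new timestamp.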

Advantages and Disadvantages of TO protocol:

o TO protocol ensures conflict serializability, since every edge in the precedence graph points from an older transaction to a younger one, so the graph can never contain a cycle.

o TO protocol ensures freedom from deadlock, which means no transaction ever waits.

o But the schedule may not be recoverable and may not even be cascade- free.
e) What are distributed database? List advantage and disadvantage of data
replication and data fragmentation.

A distributed database is a collection of multiple interconnected databases, which are spread


physically across various locations that communicate via a computer network.

Features

 Databases in the collection are logically interrelated with each other. Often they represent a
single logical database.
 Data is physically stored across multiple sites. Data in each site can be managed by a DBMS
independent of the other sites.
 The processors in the sites are connected via a network. They do not have any multiprocessor
configuration.
 A distributed database is not a loosely connected file system.
 A distributed database incorporates transaction processing, but it is not synonymous with a
transaction processing system.

Data Replication

Data replication is the process of storing separate copies of the database at two or more sites. It is a
popular fault tolerance technique of distributed databases.

Advantages of Data Replication

 Reliability − In case of failure of any site, the database system continues to work since a copy
is available at another site(s).
 Reduction in Network Load − Since local copies of data are available, query processing can
be done with reduced network usage, particularly during prime hours. Data updating can be
done at non-prime hours.
 Quicker Response − Availability of local copies of data ensures quick query processing and
consequently quick response time.
 Simpler Transactions − Transactions require fewer joins of tables located at different
sites and minimal coordination across the network. Thus, they become simpler in nature.
Disadvantages of Data Replication

 Increased Storage Requirements − Maintaining multiple copies of data is associated with


increased storage costs. The storage space required is in multiples of the storage required for a
centralized system.
 Increased Cost and Complexity of Data Updating − Each time a data item is updated, the
update needs to be reflected in all the copies of the data at the different sites. This requires
complex synchronization techniques and protocols.
 Undesirable Application – Database coupling − If complex update mechanisms are not used,
removing data inconsistency requires complex co-ordination at application level. This results
in undesirable application – database coupling.

Some commonly used replication techniques are −

 Snapshot replication

 Near-real-time replication

 Pull replication

Fragmentation

Fragmentation is the task of dividing a table into a set of smaller tables. The subsets of the table are
called fragments. Fragmentation can be of three types: horizontal, vertical, and hybrid (combination
of horizontal and vertical). Horizontal fragmentation can further be classified into two techniques:
primary horizontal fragmentation and derived horizontal fragmentation.

Fragmentation should be done in a way that the original table can be reconstructed from the
fragments whenever required. This requirement is called “reconstructiveness.”

Advantages of Fragmentation

 Since data is stored close to the site of usage, efficiency of the database system is increased.
 Local query optimization techniques are sufficient for most queries since data is locally
available.
 Since irrelevant data is not available at the sites, security and privacy of the database system
can be maintained.

Disadvantages of Fragmentation

 When data from different fragments are required, the access speeds may be very low.
 In case of recursive fragmentations, the job of reconstruction will need expensive techniques.
 Lack of back-up copies of data in different sites may render the database ineffective in case of
failure of a site.

Vertical Fragmentation

In vertical fragmentation, the fields or columns of a table are grouped into fragments. In order to
maintain reconstructiveness, each fragment should contain the primary key field(s) of the table.
Vertical fragmentation can be used to enforce privacy of data.

For example, let us consider that a University database keeps records of all registered students in a
Student table having the following schema.

STUDENT

Regd_No Name Course Address Semester Fees Marks

Now, the fees details are maintained in the accounts section. In this case, the designer will fragment
the database as follows −

CREATE TABLE STD_FEES AS


SELECT Regd_No, Fees
FROM STUDENT;
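Assuming the remaining columns are kept in a second fragment (the STD_INFO name below is hypothetical), the original STUDENT table can be reconstructed by joining the fragments on the primary key:

CREATE TABLE STD_INFO AS
SELECT Regd_No, Name, Course, Address, Semester, Marks
FROM STUDENT;

-- Reconstruction: join the vertical fragments on the primary key
SELECT I.Regd_No, I.Name, I.Course, I.Address, I.Semester, F.Fees, I.Marks
FROM STD_INFO I, STD_FEES F
WHERE I.Regd_No = F.Regd_No;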

Horizontal Fragmentation

Horizontal fragmentation groups the tuples of a table in accordance with the values of one or more fields.
Horizontal fragmentation should also conform to the rule of reconstructiveness. Each horizontal
fragment must have all columns of the original base table.
For example, in the student schema, if the details of all students of the Computer Science course need to
be maintained at the School of Computer Science, then the designer will horizontally fragment the
database as follows −

CREATE TABLE COMP_STD AS
SELECT * FROM STUDENT
WHERE COURSE = 'Computer Science';
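Assuming the remaining students are kept in a second horizontal fragment (OTHER_STD below is a hypothetical name), the original table is reconstructed as the union of the fragments:

CREATE TABLE OTHER_STD AS
SELECT * FROM STUDENT
WHERE COURSE <> 'Computer Science';

-- Reconstruction: the union of the horizontal fragments gives back STUDENT
SELECT * FROM COMP_STD
UNION
SELECT * FROM OTHER_STD;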

Hybrid Fragmentation

In hybrid fragmentation, a combination of horizontal and vertical fragmentation techniques are used.
This is the most flexible fragmentation technique since it generates fragments with minimal
extraneous information. However, reconstruction of the original table is often an expensive task.

Hybrid fragmentation can be done in two alternative ways −

 At first, generate a set of horizontal fragments; then generate vertical fragments from one or
more of the horizontal fragments.
 At first, generate a set of vertical fragments; then generate horizontal fragments from one or
more of the vertical fragments.

SECTION-C

3. Attempt any one part of the following:


a) Define Join. Explain different types of join.
A SQL Join statement is used to combine data or rows from two or more tables based on a common
field between them. Different types of Joins are:

 INNER JOIN
 LEFT JOIN
 RIGHT JOIN
 FULL JOIN
Consider the two tables below:

Student

StudentCourse

The simplest Join is INNER JOIN.


1. INNER JOIN: The INNER JOIN keyword selects all rows from both the tables as long as the
condition satisfies. This keyword will create the result-set by combining all rows from both the
tables where the condition satisfies i.e value of the common field will be same.
Syntax:
SELECT table1.column1,table1.column2,table2.column1,....

FROM table1

INNER JOIN table2

ON table1.matching_column = table2.matching_column;

table1: First table.


table2: Second table
matching_column: Column common to both the tables.

Note: We can also write JOIN instead of INNER JOIN. JOIN is the same as INNER JOIN.

Example Queries(INNER JOIN)

 This query will show the names and age of students enrolled in different courses.

SELECT StudentCourse.COURSE_ID, Student.NAME, Student.AGE FROM Student

INNER JOIN StudentCourse

ON Student.ROLL_NO = StudentCourse.ROLL_NO;
Output:

2. LEFT JOIN: This join returns all the rows of the table on the left side of the join and matching
rows for the table on the right side of join. The rows for which there is no matching row on right
side, the result-set will contain null. LEFT JOIN is also known as LEFT OUTER JOIN.

Syntax:
SELECT table1.column1,table1.column2,table2.column1,....

FROM table1

LEFT JOIN table2

ON table1.matching_column = table2.matching_column;

table1: First table.

table2: Second table

matching_column: Column common to both the tables.

Note: We can also use LEFT OUTER JOIN instead of LEFT JOIN; both are the same.
Example Queries(LEFT JOIN):
SELECT Student.NAME,StudentCourse.COURSE_ID

FROM Student

LEFT JOIN StudentCourse

ON StudentCourse.ROLL_NO = Student.ROLL_NO;

Output:

3. RIGHT JOIN: RIGHT JOIN is similar to LEFT JOIN. This join returns all the rows of the table
on the right side of the join and matching rows for the table on the left side of join. The rows for
which there is no matching row on left side, the result-set will contain null. RIGHT JOIN is also
known as RIGHT OUTER JOIN.

Syntax:
SELECT table1.column1,table1.column2,table2.column1,....

FROM table1

RIGHT JOIN table2

ON table1.matching_column = table2.matching_column;

table1: First table.

table2: Second table

matching_column: Column common to both the tables.


Note: We can also use RIGHT OUTER JOIN instead of RIGHT JOIN; both are the same.

Example Queries(RIGHT JOIN):


SELECT Student.NAME,StudentCourse.COURSE_ID

FROM Student

RIGHT JOIN StudentCourse

ON StudentCourse.ROLL_NO = Student.ROLL_NO;

Output:

4. FULL JOIN: FULL JOIN creates the result-set by combining result of both LEFT JOIN and
RIGHT JOIN. The result-set will contain all the rows from both the tables. The rows for which
there is no matching, the result-set will contain NULL values.

Syntax:
SELECT table1.column1,table1.column2,table2.column1,....
FROM table1

FULL JOIN table2

ON table1.matching_column = table2.matching_column;

table1: First table.

table2: Second table

matching_column: Column common to both the tables.

Example Queries(FULL JOIN):

SELECT Student.NAME,StudentCourse.COURSE_ID

FROM Student

FULL JOIN StudentCourse

ON StudentCourse.ROLL_NO = Student.ROLL_NO;

Output:
b) Discuss the following terms (i) DDL command (ii) DML Command

SQL Commands
o SQL commands are instructions. It is used to communicate with the database. It is also used to
perform specific tasks, functions, and queries of data.
o SQL can perform various tasks like create a table, add data to tables, drop the table, modify the
table, set permission for users.

Data Definition Language (DDL)


o DDL changes the structure of the table like creating a table, deleting a table, altering a table,
etc.
o All the command of DDL are auto-committed that means it permanently save all the changes in
the database.

Here are some commands that come under DDL:

o CREATE

o ALTER

o DROP

o TRUNCATE

a. CREATE It is used to create a new table in the database.

Syntax:

CREATE TABLE TABLE_NAME (COLUMN_NAME DATATYPES[,....]);

Example:

CREATE TABLE EMPLOYEE(Name VARCHAR2(20), Email VARCHAR2(100), DOB DATE);

b. DROP: It is used to delete both the structure and record stored in the table.
Syntax

DROP TABLE table_name;

Example

DROP TABLE EMPLOYEE;

c. ALTER: It is used to alter the structure of the database. This change could be either to modify the
characteristics of an existing attribute or probably to add a new attribute.

Syntax:

To add a new column in the table

ALTER TABLE table_name ADD column_name COLUMN-definition;

To modify existing column in the table:

ALTER TABLE table_name MODIFY(COLUMN DEFINITION....);

EXAMPLE

ALTER TABLE STU_DETAILS ADD(ADDRESS VARCHAR2(20));


ALTER TABLE STU_DETAILS MODIFY (NAME VARCHAR2(20));

d. TRUNCATE: It is used to delete all the rows from the table and free the space containing the table.

Syntax:

TRUNCATE TABLE table_name;

Example:

TRUNCATE TABLE EMPLOYEE;


2. Data Manipulation Language
o DML commands are used to modify the database. It is responsible for all form of changes in the
database.
o The command of DML is not auto-committed that means it can't permanently save all the
changes in the database. They can be rollback.

Here are some commands that come under DML:

o INSERT

o UPDATE

o DELETE

a. INSERT: The INSERT statement is a SQL query. It is used to insert data into the row of a table.

Syntax:

INSERT INTO TABLE_NAME


(col1, col2, col3,.... col N)
VALUES (value1, value2, value3, .... valueN);

Or

INSERT INTO TABLE_NAME


VALUES (value1, value2, value3, .... valueN);

For example:

INSERT INTO javatpoint (Author, Subject) VALUES ('Sonoo', 'DBMS');

b. UPDATE: This command is used to update or modify the value of a column in the table.

Syntax:
UPDATE table_name SET [column_name1 = value1, ... column_nameN = valueN] [WHERE CONDITION];

For example:

UPDATE students
SET User_Name = 'Sonoo'
WHERE Student_Id = '3'

c. DELETE: It is used to remove one or more row from a table.

Syntax:

DELETE FROM table_name [WHERE condition];

For example:

DELETE FROM javatpoint


WHERE Author = 'Sonoo';
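Because DML statements are not auto-committed, an uncommitted change can be undone. A minimal sketch using the javatpoint table above:

DELETE FROM javatpoint WHERE Author = 'Sonoo';
ROLLBACK;   -- the deleted rows are restored because the change was never committed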

4. Attempt any one part of the following:

a) What is tuple relational calculas and domain relational calculas?

Relational Calculus
o Relational calculus is a non-procedural query language. In a non-procedural query language,
the user is not concerned with the details of how to obtain the end results.
o The relational calculus tells what to do but never explains how to do.

Types of Relational calculus:


1. Tuple Relational Calculus (TRC)
o The tuple relational calculus is specified to select the tuples in a relation. In TRC, filtering
variable uses the tuples of a relation.
o The result of the relation can have one or more tuples.

Notation:

{T | P (T)} or {T | Condition (T)}

Where

T is the resulting tuples

P(T) is the condition used to fetch T.

For example:

{ T.name | Author(T) AND T.article = 'database' }

OUTPUT: This query selects the tuples from the AUTHOR relation. It returns a tuple with 'name'
from Author who has written an article on 'database'.

TRC (tuple relation calculus) can be quantified. In TRC, we can use Existential (∃) and Universal
Quantifiers (∀).
For example:

{ R| ∃T ∈ Authors(T.article='database' AND R.name=T.name)}

Output: This query will yield the same result as the previous one.

2. Domain Relational Calculus (DRC)


o The second form of relation is known as Domain relational calculus. In domain relational
calculus, filtering variable uses the domain of attributes.
o Domain relational calculus uses the same operators as tuple calculus. It uses logical connectives
∧ (and), ∨ (or) and ┓ (not).

o It uses Existential (∃) and Universal Quantifiers (∀) to bind the variable.

Notation:

{ a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)}

Where

a1, a2 are attributes


P stands for formula built by inner attributes

For example:

{< article, page, subject > | < article, page, subject > ∈ javatpoint ∧ subject = 'database'}

Output: This query will yield the article, page, and subject from the relational javatpoint, where the
subject is a database.

b) Describe the following terms: (i) Multivalued dependency (ii) Trigger

Multivalued Dependency
o Multivalued dependency occurs when two attributes in a table are independent of each other
but, both depend on a third attribute.
o A multivalued dependency consists of at least two attributes that are dependent on a third
attribute that's why it always requires at least three attributes.

Example: Suppose there is a bike manufacturer company which produces two colors(white and black)
of each model every year.

BIKE_MODEL MANUF_YEAR COLOR


M2001 2008 White
M2001 2008 Black
M3001 2013 White
M3001 2013 Black
M4006 2017 White
M4006 2017 Black

Here columns COLOR and MANUF_YEAR are dependent on BIKE_MODEL and independent of
each other.

In this case, these two columns can be called as multivalued dependent on BIKE_MODEL. The
representation of these dependencies is shown below:

1. BIKE_MODEL → → MANUF_YEAR
2. BIKE_MODEL → → COLOR

This can be read as "BIKE_MODEL multidetermined MANUF_YEAR" and "BIKE_MODEL


multidetermined COLOR".
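As a hedged sketch (the table names and column types below are assumptions for illustration), the multivalued dependencies can be removed by decomposing the table into two tables, one per independent attribute; joining them on BIKE_MODEL reproduces the original table:

CREATE TABLE BIKE_YEAR  (BIKE_MODEL VARCHAR2(10), MANUF_YEAR NUMBER);
CREATE TABLE BIKE_COLOR (BIKE_MODEL VARCHAR2(10), COLOR VARCHAR2(10));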

Trigger

Triggers are stored programs, which are automatically executed or fired when some events occur.
Triggers are, in fact, written to be executed in response to any of the following events −

 A database manipulation (DML) statement (DELETE, INSERT, or UPDATE)


 A database definition (DDL) statement (CREATE, ALTER, or DROP).
 A database operation (SERVERERROR, LOGON, LOGOFF, STARTUP, or SHUTDOWN).
Triggers can be defined on the table, view, schema, or database with which the event is associated.

Benefits of Triggers

Triggers can be written for the following purposes −

 Generating some derived column values automatically

 Enforcing referential integrity

 Event logging and storing information on table access

 Auditing

 Synchronous replication of tables

 Imposing security authorizations

 Preventing invalid transactions

Creating Triggers

The syntax for creating a trigger is −

CREATE [OR REPLACE ] TRIGGER trigger_name


{BEFORE | AFTER | INSTEAD OF }
{INSERT [OR] | UPDATE [OR] | DELETE}
[OF col_name]
ON table_name
[REFERENCING OLD AS o NEW AS n]
[FOR EACH ROW]
WHEN (condition)
DECLARE
Declaration-statements
BEGIN
Executable-statements
EXCEPTION
Exception-handling-statements
END;
Where,

 CREATE [OR REPLACE] TRIGGER trigger_name − Creates or replaces an existing trigger


with the trigger_name.
 {BEFORE | AFTER | INSTEAD OF} − This specifies when the trigger will be executed. The
INSTEAD OF clause is used for creating trigger on a view.
 {INSERT [OR] | UPDATE [OR] | DELETE} − This specifies the DML operation.
 [OF col_name] − This specifies the column name that will be updated.
 [ON table_name] − This specifies the name of the table associated with the trigger.
 [REFERENCING OLD AS o NEW AS n] − This allows you to refer new and old values for
various DML statements, such as INSERT, UPDATE, and DELETE.
 [FOR EACH ROW] − This specifies a row-level trigger, i.e., the trigger will be executed for
each row being affected. Otherwise the trigger will execute just once when the SQL statement
is executed, which is called a table level trigger.
 WHEN (condition) − This provides a condition for rows for which the trigger would fire. This
clause is valid only for row-level triggers.

Example

To start with, we will be using the CUSTOMERS table we had created and used in the previous
chapters −

Select * from customers;


+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
+----+----------+-----+-----------+----------+
The following program creates a row-level trigger for the customers table that would fire for INSERT
or UPDATE or DELETE operations performed on the CUSTOMERS table. This trigger will display
the salary difference between the old values and new values −

CREATE OR REPLACE TRIGGER display_salary_changes


BEFORE DELETE OR INSERT OR UPDATE ON customers
FOR EACH ROW
WHEN (NEW.ID > 0)
DECLARE
sal_diff number;
BEGIN
sal_diff := :NEW.salary - :OLD.salary;
dbms_output.put_line('Old salary: ' || :OLD.salary);
dbms_output.put_line('New salary: ' || :NEW.salary);
dbms_output.put_line('Salary difference: ' || sal_diff);
END;
/

When the above code is executed at the SQL prompt, it produces the following result −

Trigger created.

The following points need to be considered here −

 OLD and NEW references are not available for table-level triggers, rather you can use them for
record-level triggers.
 If you want to query the table in the same trigger, then you should use the AFTER keyword,
because triggers can query the table or change it again only after the initial changes are applied
and the table is back in a consistent state.
 The above trigger has been written in such a way that it will fire before any DELETE or
INSERT or UPDATE operation on the table, but you can write your trigger on a single or
multiple operations, for example BEFORE DELETE, which will fire whenever a record will
be deleted using the DELETE operation on the table.

Triggering a Trigger
Let us perform some DML operations on the CUSTOMERS table. Here is one INSERT statement,
which will create a new record in the table −

INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)


VALUES (7, 'Kriti', 22, 'HP', 7500.00 );

When a record is created in the CUSTOMERS table, the trigger created above,
display_salary_changes, will be fired and it will display the following result −

Old salary:
New salary: 7500
Salary difference:

Because this is a new record, old salary is not available and the above result comes as null. Let us now
perform one more DML operation on the CUSTOMERS table. The UPDATE statement will update
an existing record in the table −

UPDATE customers
SET salary = salary + 500
WHERE id = 2;

When a record is updated in the CUSTOMERS table, the trigger created above,
display_salary_changes, will be fired and it will display the following result −

Old salary: 1500


New salary: 2000
Salary difference: 500

Ques5(a): What do you mean by ACID properties of a transaction? Explain in details.

Ans 5(a): Any transaction must maintain the ACID properties, viz. Atomicity, Consistency, Isolation,
and Durability.

 Atomicity − This property states that a transaction is an atomic unit of processing, that is,
either it is performed in its entirety or not performed at all. No partial update should exist.
 Consistency − A transaction should take the database from one consistent state to another
consistent state. It should not adversely affect any data item in the database.

 Isolation − A transaction should be executed as if it is the only one in the system. There should
not be any interference from the other concurrent transactions that are simultaneously running.

 Durability − If a committed transaction brings about a change, that change should be durable
in the database and not lost in case of any failure.

Example: Let Ti be a transaction that transfers $50 from account A to account B. This transaction can
be defined as
Ti: read(A);
A := A − 50;
write(A);
read(B);
B := B + 50;
write(B).
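As a hedged sketch in SQL (the ACCOUNT(Acct_No, Balance) table here is hypothetical, not part of the question), the same transfer could be written as:

-- start of transaction Ti
UPDATE ACCOUNT SET Balance = Balance - 50 WHERE Acct_No = 'A';
UPDATE ACCOUNT SET Balance = Balance + 50 WHERE Acct_No = 'B';
COMMIT;   -- atomicity and durability: either both updates persist, or neither does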
Let us now consider each of the ACID requirements. (For ease of presentation, we consider them in an
order different from the order A-C-I-D).

 Consistency: The consistency requirement here is that the sum of A and B be unchanged by the
execution of the transaction. Without the consistency requirement, money could be created or
destroyed by the transaction! It can be verified easily that, if the database is consistent before an
execution of the transaction, the database remains consistent after the execution of the transaction.
Ensuring consistency for an individual transaction is the responsibility of the application
programmer who codes the transaction. This task may be facilitated by automatic testing of integrity
constraints.
 Atomicity: Suppose that, just before the execution of transaction Ti the values of accounts A and B
are $1000 and $2000, respectively. Now suppose that, during the execution of transaction Ti, a
failure occurs that prevents Ti from completing its execution successfully. Examples of such
failures include power failures, hardware failures, and software errors. Further, suppose that the
failure happened after the write(A) operation but before the write(B) operation. In this case, the
values of accounts A and B reflected in the database are $950 and $2000. The system destroyed $50
as a result of this failure. In particular, we note that the sum A + B is no longer preserved. Thus,
because of the failure, the state of the system no longer reflects a real state of the world that the
database is supposed to capture. We term such a state an inconsistent state. We must ensure that such
inconsistencies are not visible in a database system. Note, however, that the system must at some
point be in an inconsistent state. Even if transaction Ti is executed to completion, there exists a
point at which the value of account A is $950 and the value of account B is $2000, which is clearly
an inconsistent state. This state, however, is eventually replaced by the consistent state where the
value of account A is $950, and the value of account B is $2050. Thus, if the transaction never
started or was guaranteed to complete, such an inconsistent state would not be visible except during
the execution of the transaction. That is the reason for the atomicity requirement: If the atomicity
property is present, all actions of the transaction are reflected in the database, or none are. The basic
idea behind ensuring atomicity is this: The database system keeps track (on disk) of the old values
of any data on which a transaction performs a write, and, if the transaction does not complete its
execution, the database system restores the old values to make it appear as though the transaction
never executed. Ensuring atomicity is the responsibility of the database system itself; specifically, it
is handled by a component called the transaction-management component.
 Durability: Once the execution of the transaction completes successfully, and the user who initiated
the transaction has been notified that the transfer of funds has taken place, it must be the case that
no system failure will result in a loss of data corresponding to this transfer of funds. The durability
property guarantees that, once a transaction completes successfully, all the updates that it carried out
on the database persist, even if there is a system failure after the transaction completes execution.
 Isolation: Even if the consistency and atomicity properties are ensured for each transaction, if
several transactions are executed concurrently, their operations may interleave in some undesirable
way, resulting in an inconsistent state.

For example, as we saw earlier, the database is temporarily inconsistent while the transaction to
transfer funds from A to B is executing, with the deducted total written to A and the increased total
yet to be written to B. If a second concurrently running transaction reads A and B at this intermediate
point and computes A+B, it will observe an inconsistent value. Furthermore, if this second
transaction then performs updates on A and B based on the inconsistent values that it read, the
database may be left in an inconsistent state even after both transactions have completed.

Ques:5(b): Discuss about Deadlock Prevention Scheme.


Ans:5(b): Deadlock Prevention

To prevent any deadlock situation in the system, the DBMS aggressively inspects all the operations,
where transactions are about to execute. The DBMS inspects the operations and analyzes if they can
create a deadlock situation. If it finds that a deadlock situation might occur, then that transaction is
never allowed to be executed.

There are deadlock prevention schemes that use timestamp ordering mechanism of transactions in
order to predetermine a deadlock situation.

Wait-Die Scheme

In this scheme, if a transaction requests to lock a resource (data item), which is already held with a
conflicting lock by another transaction, then one of the two possibilities may occur −

 If TS(Ti) < TS(Tj) − that is, Ti, which is requesting a conflicting lock, is older than Tj − then
Ti is allowed to wait until the data-item is available.

 If TS(Ti) > TS(Tj) − that is, Ti is younger than Tj − then Ti dies. Ti is restarted later with a
random delay but with the same timestamp.

This scheme allows the older transaction to wait but kills the younger one.

Wound-Wait Scheme

In this scheme, if a transaction requests to lock a resource (data item), which is already held with
conflicting lock by some another transaction, one of the two possibilities may occur −

 If TS(Ti) < TS(Tj), then Ti forces Tj to be rolled back − that is Ti wounds Tj. Tj is restarted later
with a random delay but with the same timestamp.

 If TS(Ti) > TS(Tj), then Ti is forced to wait until the resource is available.

This scheme allows the younger transaction to wait; but when an older transaction requests an item
held by a younger one, the older transaction forces the younger one to abort and release the item.

In both the cases, the transaction that enters the system at a later stage is aborted.
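For example, suppose TS(T1) = 5 (older) and TS(T2) = 10 (younger), and T2 currently holds a conflicting lock on data item Q. Under wait-die, if T1 requests Q it simply waits; but if T2 were to request an item held by T1, T2 would die and be restarted later with its original timestamp. Under wound-wait, when T1 requests Q it wounds T2 (T2 is rolled back and releases Q); when T2 requests an item held by T1, T2 waits.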

Ques:6Attempt any one part of the following.


(a) What is Concurrency Control? Why it is needed in database system?

Concurrency Control in Database Management System is a procedure of managing simultaneous


operations without conflicting with each other. It ensures that Database transactions are performed
concurrently and accurately to produce correct results without violating data integrity of the respective
Database.

Need of Concurrency control in database system

 To apply Isolation through mutual exclusion between conflicting transactions


 To resolve read-write and write-write conflict issues
 To preserve database consistency through constantly preserving execution obstructions
 The system needs to control the interaction among the concurrent transactions. This control is
achieved using concurrent-control schemes.
 Concurrency control helps to ensure serializability

(b). Consider the following relational database. Give an expression in SQL for each of the
following queries. Underlined attributes are the primary keys.

employee (person name, street, city)

works (person name, company-name, salary)

company (company name, city)

manages (person name, manager-name)

Give an expression in SQL for each of the following queries:

(b).

i) Find the names of all employees who works for the ABC Bank

select employee.person name from employee, works

where employee.person name = works.person name and company name = 'ABC Bank';
ii) Find the names of all employees who live in the same city and on the same street as do their
managers.

select p.person name from employee p, employee r, manages m

where p.person name = m.person name and m.manager name = r.person name

and p.street = r.street and p.city = r.city;

iii) Find the names, street address, and cities of residence for all employees who work for ‘ABC
Bank ' and earn more than 7,000 per annum.

select employee.person name, employee.street, employee.city from employee, works

where employee.person name = works.person name and company name = 'ABC Bank' and salary >
7000;

iv) Find the names of all employees in the database who earn more than every employee of
XYZ.

select person name from works

where salary > all (select salary from works where company name = 'XYZ');

v) Give all employees of ABC Bank a 7 percent salary raise.

update works set salary = salary * 1.07

where company name = 'ABC Bank';

vi) Delete all tuples in the works relation for employees of ABC Bank.

delete from works

where company name = 'ABC Bank';

vii) Find the names of all employees in the database who live in the same cities as the companies
for which they work.

select e.person name from employee e, works w, company c

where e.person name = w.person name and e.city = c.city and w.company name = c.company name;

Ques:7:

(a)Explain Directory System.

Directory Systems

In the pre computerization days, organizations would create physical directories of employees and
distribute them across the organization. In general, a directory is a listing of information about some
class of objects such as persons. Directories can be used to find information about a specific object, or
in the reverse direction to find objects that meet a certain requirement. In the world of physical
telephone directories, directories that satisfy lookups in the forward direction are called white pages,
while directories that satisfy lookups in the reverse direction are called yellow pages

Directory Access Protocols

Several directory access protocols have been developed to provide a standardized way of accessing
data in a directory. The most widely used among them today is the Lightweight Directory Access
Protocol (LDAP).
The reasons for using a specialized protocol for accessing directory information:
• First, directory access protocols are simplified protocols that cater to a limited type of access to data.
They evolved in parallel with the database access protocols.
• Second, and more important, directory systems provide a simple mechanism to name objects in a
hierarchical fashion, similar to file system directory names, which can be used in a distributed
directory system to specify what information is stored in each of the directory servers.
LDAP: Lightweight Directory Access Protocol

A directory system is implemented as one or more servers, which service multiple clients. Clients use
the application programmer interface defined by directory system to communicate with the directory
servers. Directory access protocols also define a data model and access control.

The X.500 directory access protocol, defined by the International Organization for Standardization
(ISO), is a standard for accessing directory information. However, the protocol is rather complex, and
is not widely used. The Lightweight Directory Access Protocol (LDAP) provides many of the X.500
features, but with less complexity, and is widely used.
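For illustration, an LDAP entry is named by a hierarchical distinguished name (DN); the entry below is a hypothetical example, where cn is the common name, ou an organizational unit, and dc the domain components:

cn=Ramesh, ou=Employees, dc=example, dc=com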

(b) Give the following queries in the relational algebra using the relational schema

Student (id, name)

enrolledIn (id, code)

subject (code, lecturer)

i). What are the names of students enrolled in cs3020?

π name (σ code = 'cs3020' (Student ⋈ enrolledIn))

ii). Which subjects is Hector taking?

π code (σ name = 'Hector' (Student ⋈ enrolledIn))

iii). Who teaches cs1500?

π lecturer (σ code = 'cs1500' (Subject))

iv). Who teaches cs1500 or cs3020?

π lecturer (σ code = 'cs1500' ∨ code = 'cs3020' (Subject))

v). Who teaches at least two different subjects?

π R.lecturer (σ R.lecturer = S.lecturer ∧ R.code ≠ S.code (ρ R(Subject) × ρ S(Subject)))

vi). What are the names of students in cs1500 or cs3010?

π name (σ code = 'cs1500' (Student ⋈ enrolledIn)) ∪ π name (σ code = 'cs3010' (Student ⋈ enrolledIn))
vii). What are the names of students in both cs1500 and cs1200?

π name (σ code = 'cs1500' (Student ⋈ enrolledIn)) ∩ π name (σ code = 'cs1200' (Student ⋈ enrolledIn))
