Q # 1: What are the components of distributed database system? Explain with the help of a diagram.

Answer: The different components of DDBMS are as follows:


• Computer workstations or remote devices (sites or nodes) that form the network system. The distributed database system
must be independent of the computer system hardware.
• Network hardware and software components that reside in each computer or device. The network components allow all
sites to interact and exchange data. Because the components—computers, operating systems, network hardware, and so on—
are likely to be supplied by different vendors, it is best to ensure that distributed database functions can be run on more than
one platform.
• Communications media that carry the data from one node to another. The DDBMS must be communications media-
independent; that is, it must be able to support several types of communications media.
• The transaction processor (TP), which is the software component found in each computer or device that requests data. The
transaction processor receives and processes the application’s data requests (remote and local). The TP is also known as the
application processor (AP) or the transaction manager (TM).
• The data processor (DP), which is the software component residing on each computer or device that stores and retrieves
data located at that site. The DP is also known as the data manager (DM). A data processor may even be a centralized
DBMS.
The following Figure illustrates the placement of the components and the interaction among them. The communication
among TPs and DPs shown in the figure is made possible through a specific set of rules, or protocols, used by the DDBMS.

The protocols determine how the distributed database system will:


• Interface with the network to transport data and commands between data processors (DPs) and transaction processors (TPs).
• Synchronize all data received from DPs (TP side) and route retrieved data to the appropriate TPs (DP side).
• Ensure common database functions in a distributed system. Such functions include security, concurrency control, backup, and recovery. 
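As a rough Python sketch of this interaction (an illustration of the idea only; the class and attribute names below are invented, not part of any DDBMS), a transaction processor forwards a request to every data processor and merges the replies:

class DataProcessor:
    """Stores and retrieves the data fragment kept at one site."""
    def __init__(self, site, rows):
        self.site = site
        self.rows = rows
    def read(self, predicate):
        return [row for row in self.rows if predicate(row)]

class TransactionProcessor:
    """Receives an application's data request and routes it to the data processors."""
    def __init__(self, data_processors):
        self.data_processors = data_processors
    def query(self, predicate):
        results = []
        for dp in self.data_processors:          # send the request to every site
            results.extend(dp.read(predicate))   # merge (synchronize) the answers
        return results

# Two sites, each holding part of the EMPLOYEE data (values invented for the example)
dp1 = DataProcessor("site_karachi", [{"EMP_ID": 1, "EMP_LOCATION": "PAKISTAN"}])
dp2 = DataProcessor("site_london", [{"EMP_ID": 2, "EMP_LOCATION": "UK"}])
tp = TransactionProcessor([dp1, dp2])
print(tp.query(lambda row: row["EMP_ID"] == 2))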

Q # 2: What is the difference between homogeneous and heterogeneous distributed database system?
Answer: A DDBMS may be classified as homogeneous or heterogeneous. In a homogeneous system, all sites use the same DBMS
product. In a heterogeneous system, sites may run different DBMS products, which need not be based on the same underlying data
model, and so the system may be composed of relational, network, hierarchical and object-oriented DBMSs.
Homogeneous systems are much easier to design and manage. This approach provides incremental growth, making the addition of a new
site to the DDBMS easy, and allows increased performance by exploiting the parallel processing capability of multiple sites.
Heterogeneous systems usually result when individual sites have implemented their own databases and integration is considered at a later
stage. In a heterogeneous system, translations are required to allow communication between different DBMSs. To provide DBMS
transparency, users must be able to make requests in the language of the DBMS at their local site. The system then has the task of
locating the data and performing any necessary translation. 
Q # 3: What is location transparency?
Answer: Location Transparency:
Location transparency ensures that the user can query any table(s) or fragment(s) of a table as if they were stored locally at the user's
site. The end user remains completely unaware that the table or its fragments are actually stored at remote sites in the distributed
database system; the addresses of the remote site(s) and the access mechanisms are completely hidden.
In order to provide location transparency, the DDBMS must have access to an up-to-date and accurate data dictionary and DDBMS
directory that contain the details of where the data is located.
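As a rough illustration only (the catalog structure and names below are invented for the example, not taken from any specific DDBMS), a location-transparent query layer can be pictured as a lookup in a data dictionary that maps each table or fragment to the site holding it:

# Hypothetical catalog mapping fragments to sites; a real DDBMS directory is far richer.
CATALOG = {
    "EMPLOYEE_PK": "site_karachi",   # horizontal fragment for Pakistan
    "EMPLOYEE_UK": "site_london",
}

def run_query(fragment_name, sql):
    """The user supplies only the table/fragment name; the site is resolved internally."""
    site = CATALOG[fragment_name]          # location resolved from the data dictionary
    return f"sending '{sql}' to {site}"    # stand-in for the real network call

# The user never mentions a site name:
print(run_query("EMPLOYEE_UK", "SELECT * FROM EMPLOYEE"))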

Q # 4: What is Fragmentation? Discuss various types of Fragmentations with the help of suitable examples.
Answer:
Fragmentation:
Fragmentation is the task of dividing a table into a set of smaller tables. The subsets of the table are called fragments. Fragmentation
can be of three types: horizontal, vertical, and hybrid (combination of horizontal and vertical). Horizontal fragmentation can further be
classified into two techniques: primary horizontal fragmentation and derived horizontal fragmentation.
Fragmentation should be done in a way that allows the original table to be reconstructed from the fragments whenever required. This
requirement is called "reconstructiveness."
Fragmenting a relation in the database allows:
 Easy usage of data: The most frequently accessed subsets of data are kept near the users who need them, so the data can be
accessed easily as and when required.
 Efficiency: Query efficiency increases because the table is reduced to smaller subsets that are available with less network
access time.
 Security: Only the records that are valid and useful to a site's actual users are stored there; the local database does not hold
any unwanted data and contains only the information necessary for its users.
 Parallelism: Fragmentation allows users at different locations to access the same logical table at the same time, each seeing
the data meant for them. Without fragmentation, all users would access the table at a single location and would have to wait
for locks before performing their transactions.
 Reliability: Fetching data becomes more reliable. If users at many different locations all access a single database, the network
load is huge and there is no guarantee that the correct records are fetched and returned to the user in time. Accessing the
fragment of data in the nearest database reduces the risk of data loss and helps ensure that correct data is returned.
 Balanced Storage: Data is distributed evenly among the databases in the DDB.

Information about how the data is fragmented is stored in the distributed data catalog (DDC). When a user sends a query, the DDC
determines which fragment needs to be accessed and points the query to that fragment.
Data can be fragmented according to the databases and the users' requirements, but while fragmenting the data, the points below should
be kept in mind (a small sketch illustrating these conditions follows the list):
 Completeness: A fragment must not be built from only part of the table's records. Fragmentation should be
performed on the whole table's data to get correct results. For example, if we are creating fragments of the EMPLOYEE
table, we need to consider the whole EMPLOYEE table when constructing the fragments; they should not be created
from a subset of the EMPLOYEE records.
 Reconstruction: When all the fragments are combined, they should give the whole table's data; that is, it must be
possible to reconstruct the whole table from its fragments. For example, all fragments of the EMPLOYEE table, when
combined, should give the complete set of EMPLOYEE records.
 Disjointness: There should not be any overlapping data in the fragments, because overlap makes it difficult to maintain
consistency: the same change would have to be replicated to every copy of the data. For example, if the EMPLOYEE
table is fragmented by location, no two fragments should hold the details of the same employee.
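A minimal Python sketch of the reconstruction and disjointness checks for a horizontal fragmentation (completeness is implied when the fragments are defined over the whole table); fragments are represented here as lists of row dictionaries keyed by EMP_ID, a representation chosen purely for illustration:

def check_fragmentation(original_rows, fragments, key="EMP_ID"):
    """Verify reconstruction and disjointness for a horizontal fragmentation."""
    original_keys = {row[key] for row in original_rows}
    fragment_keys = [{row[key] for row in frag} for frag in fragments]

    union = set().union(*fragment_keys)
    reconstruction = (union == original_keys)   # the fragments rebuild the whole table

    total = sum(len(keys) for keys in fragment_keys)
    disjoint = (total == len(union))            # no employee appears in two fragments

    return reconstruction, disjoint

employees = [{"EMP_ID": 1, "LOC": "PAKISTAN"}, {"EMP_ID": 2, "LOC": "UK"}]
frag_pk = [employees[0]]
frag_uk = [employees[1]]
print(check_fragmentation(employees, [frag_pk, frag_uk]))   # (True, True)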

There are 3 types of data fragmentations in DDBMS.


 Horizontal Data Fragmentation:
As the name suggests, here the data / records are fragmented horizontally; that is, a horizontal subset of the table's rows is created and
stored in a different database of the DDB.
For example, consider the employees working at the organization's different locations, such as Pakistan, the USA and the UK. The
number of employees at these locations is not small; it is huge. When the details of any one employee are required, the whole table
would otherwise have to be accessed, and that table might reside at any location in the world. The idea of a DDB, however, is to place
data in the nearest database so that it can be accessed quickly. Hence we divide the entire EMPLOYEE table's data horizontally based
on location:

SELECT * FROM EMPLOYEE WHERE EMP_LOCATION = 'PAKISTAN';

SELECT * FROM EMPLOYEE WHERE EMP_LOCATION = 'USA';
SELECT * FROM EMPLOYEE WHERE EMP_LOCATION = 'UK';

These queries give subsets of the EMPLOYEE records depending on the location of the employees, and each subset is stored in the
database at the corresponding location. Any insert, update or delete on the employee records is done in the database at that location and
synchronized with the main table at regular intervals.
The above is a simple example of horizontal fragmentation. The fragmentation condition can also combine more than one predicate
with AND or OR, depending on the requirements and the purpose of the DDB.
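As a rough sketch (our own illustration, not DDBMS-specific code), routing a new record to the right horizontal fragment and rebuilding the full table amount to filtering on the fragmentation predicate and taking the union of the fragments:

# Each fragment holds the rows for one location; the structures are invented for the example.
fragments = {"PAKISTAN": [], "USA": [], "UK": []}

def insert_employee(row):
    """Route the insert to the fragment selected by the fragmentation predicate."""
    fragments[row["EMP_LOCATION"]].append(row)

def reconstruct_employee():
    """Horizontal reconstruction is simply the union of all fragments."""
    return [row for frag in fragments.values() for row in frag]

insert_employee({"EMP_ID": 10, "EMP_LOCATION": "UK"})
insert_employee({"EMP_ID": 11, "EMP_LOCATION": "USA"})
print(reconstruct_employee())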
 Vertical Data Fragmentation:
This is a vertical subset of a relation; that is, a relation / table is fragmented by considering its columns.

For example, consider the EMPLOYEE table with ID, Name, Address, Age, Location, DeptID and ProjID. A vertical fragmentation of
this table divides it into different tables, each holding one or more columns of EMPLOYEE:

SELECT EMP_ID, EMP_FIRST_NAME, EMP_LAST_NAME, AGE FROM EMPLOYEE;


SELECT EMP_ID, STREETNUM, TOWN, STATE, COUNTRY, PIN FROM EMPLOYEE;
SELECT EMP_ID, DEPTID FROM EMPLOYEE;
SELECT EMP_ID, PROJID FROM EMPLOYEE;

Each of these fragments holds part of the details of every employee. This is useful when the user needs to query only a few attributes
of an employee. For example, a query to find the department of an employee can be answered from the third fragment alone, and a
query to find the name and age of an employee whose ID is given can be answered from the first fragment. This avoids a 'SELECT *'
operation, which would need a lot of memory to traverse the whole table and hold all of its columns.
In these fragments an overlapping column can be seen, but that column is the primary key, which hardly ever changes throughout the
life cycle of a record, so the cost of maintaining the overlap is very low. In addition, this column is required if we need to reconstruct
the table or pull data from two fragments, so the fragmentation still meets the required conditions.
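A minimal illustration (our own, not taken from any DBMS) of why the shared EMP_ID column matters: reconstructing a row from two vertical fragments is a join on that key.

# Two vertical fragments of EMPLOYEE, both carrying the primary key EMP_ID.
name_frag = {1: {"EMP_FIRST_NAME": "Ali", "AGE": 30}}
dept_frag = {1: {"DEPTID": "D7"}}

def reconstruct(emp_id):
    """Vertical reconstruction: merge the fragments' columns on the shared key."""
    row = {"EMP_ID": emp_id}
    row.update(name_frag[emp_id])
    row.update(dept_frag[emp_id])
    return row

print(reconstruct(1))   # {'EMP_ID': 1, 'EMP_FIRST_NAME': 'Ali', 'AGE': 30, 'DEPTID': 'D7'}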
 Hybrid Data Fragmentation:
This is the combination of horizontal and vertical fragmentation: horizontal fragmentation distributes subsets of the rows over the
databases of the DDB, and vertical fragmentation keeps only a subset of the table's columns in each fragment.
As the diagram above shows, this type of fragmentation can be applied in any order; there is no fixed sequence. It is based solely on the
user's requirements, but it must still satisfy the fragmentation conditions.
Consider the EMPLOYEE table with the fragmentations below.

SELECT EMP_ID, EMP_FIRST_NAME, EMP_LAST_NAME, AGE
FROM EMPLOYEE WHERE EMP_LOCATION = 'PAKISTAN';

SELECT EMP_ID, DEPTID FROM EMPLOYEE WHERE EMP_LOCATION = 'INDIA';

SELECT EMP_ID, EMP_FIRST_NAME, EMP_LAST_NAME, AGE
FROM EMPLOYEE WHERE EMP_LOCATION = 'US';

SELECT EMP_ID, PROJID FROM EMPLOYEE WHERE EMP_LOCATION = 'US';
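A compact sketch (purely illustrative, with invented structures) of how a hybrid-fragmented EMPLOYEE table is reconstructed: rejoin the vertical pieces within each location on EMP_ID, then take the union over the locations.

# Hybrid fragments: per location, one fragment of personal columns and one of assignment columns.
hybrid = {
    "PAKISTAN": ({1: {"EMP_FIRST_NAME": "Ali"}}, {1: {"PROJID": "P1"}}),
    "US":       ({2: {"EMP_FIRST_NAME": "Bob"}}, {2: {"PROJID": "P2"}}),
}

def reconstruct_all():
    rows = []
    for location, (personal, assignment) in hybrid.items():
        for emp_id, cols in personal.items():      # join the vertical pieces on EMP_ID
            rows.append({"EMP_ID": emp_id, "EMP_LOCATION": location,
                         **cols, **assignment[emp_id]})
    return rows                                     # union over the locations

print(reconstruct_all())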

Q # 5: Write short note on any three of the following.


a. Remote and distributed transactions.
b. Replication and consideration while designing the distributed data base system
c. Network Transparency.
d. Underlying design principles in CORBA architecture.
e. Deadlock prevention and deadlock detection algorithms.

Answer:

a) Network Transparency:
Network transparency is one of the properties of a distributed database: a distributed database must be network
transparent, meaning that the user is unaware of the operational details of the network.
In a distributed database, when a user wants to access data that does not exist on the user's own computer, it is the
responsibility of the DBMS to provide the data from whichever other computer holds it. The user does not know
where the data is coming from.

b) Deadlock prevention and deadlock detection algorithms.


Deadlock Prevention:
The deadlock prevention approach does not allow any transaction to acquire locks that will lead to deadlocks.
The convention is that when more than one transaction requests a lock on the same data item, only one of
them is granted the lock.
One of the most popular deadlock prevention methods is pre-acquisition of all the locks. In this method, a
transaction acquires all the locks before starting to execute and retains the locks for the entire duration of
transaction. If another transaction needs any of the already acquired locks, it has to wait until all the locks it
needs are available. Using this approach, the system is prevented from being deadlocked since none of the
waiting transactions are holding any lock.
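A minimal Python sketch of the pre-acquisition idea (an illustration of the general technique, not any particular DDBMS's implementation): a transaction either obtains every lock it needs atomically before it starts, or obtains none and retries later.

import threading

locks = {"A": threading.Lock(), "B": threading.Lock()}
registry_guard = threading.Lock()   # serializes the "all or nothing" acquisition step

def acquire_all(items):
    """Pre-acquire every lock the transaction needs, or none of them."""
    with registry_guard:
        acquired = []
        for item in sorted(items):
            if locks[item].acquire(blocking=False):
                acquired.append(item)
            else:
                for got in acquired:        # could not get them all: release and give up
                    locks[got].release()
                return False
    return True

def release_all(items):
    for item in items:
        locks[item].release()

if acquire_all({"A", "B"}):          # the transaction runs only with all locks in hand
    try:
        pass                         # ... read/write A and B ...
    finally:
        release_all({"A", "B"})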
Deadlock Detection:
The deadlock detection and removal approach runs a deadlock detection algorithm periodically and removes
deadlock in case there is one. It does not check for deadlock when a transaction places a request for a lock.
When a transaction requests a lock, the lock manager checks whether it is available. If it is available, the
transaction is allowed to lock the data item; otherwise the transaction is allowed to wait.
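Deadlock detection is usually described in terms of a wait-for graph: a deadlock exists exactly when the graph contains a cycle. The sketch below is our own illustration, finding a cycle with a simple depth-first search.

def has_deadlock(wait_for):
    """wait_for maps each transaction to the transactions it is waiting on."""
    visiting, done = set(), set()

    def dfs(txn):
        if txn in visiting:        # back to a transaction on the current path: cycle found
            return True
        if txn in done:
            return False
        visiting.add(txn)
        for other in wait_for.get(txn, []):
            if dfs(other):
                return True
        visiting.remove(txn)
        done.add(txn)
        return False

    return any(dfs(t) for t in wait_for)

# T1 waits for T2 and T2 waits for T1 -> deadlock
print(has_deadlock({"T1": ["T2"], "T2": ["T1"]}))   # True
print(has_deadlock({"T1": ["T2"], "T2": []}))       # False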
c) Remote and distributed transactions:
Distributed transaction: A distributed transaction is a database transaction in which two or more network hosts are
involved. Usually, hosts provide transactional resources, while the transaction manager is responsible for creating
and managing a global transaction that encompasses all operations against such resources.
Remote transaction: A remote transaction is a set of database requests that access data at only one remote database
processing site.
Q # 1: What are the functions of distributed database management system? What are the
advantages and disadvantages of distributed database system?
Answer:
Functions of Distributed database system:
1. Keeping track of data –
The basic function of DDBMS is to keep track of the data distribution, fragmentation and replication by expanding
the DDBMS catalog.
2. Distributed Query Processing –
The ability of the DDBMS to access remote sites and to transmit queries and data among the various sites via a
communication network.
3. Replicated Data Management –
The ability to decide which copy of a replicated data item to access and to maintain the consistency of copies of
replicated data items.
4. Distributed Database Recovery –
The ability to recover from the individual site crashes and from new types of failures such as failure of
communication links.
5. Security –
The ability to execute distributed transactions with proper management of the security of the data and of the
authorization/access privileges of users.
6. Distributed Directory Management –
A directory basically contains information about data in the database. The directory may be global for the entire
DDB, or local for each site. The placement and distribution of the directory may have design and policy issues.
7. Distributed Transaction Management –
The ability to devise execution strategies for queries and transactions that access data from more than one site, to
synchronize the access to distributed data, and to maintain the integrity of the complete database.
Advantages of DDBMS
 The database is easier to expand as it is already spread across multiple systems and it is not too complicated to add a
system.
 The distributed database can have the data arranged according to different levels of transparency, i.e., data with
different transparency levels can be stored at different locations.
 The database can be stored according to departmental information in an organisation, which makes hierarchical
organizational access easier.
 If there were a natural catastrophe such as a fire or an earthquake, not all of the data would be destroyed, because it is
stored at different locations.
 It is cheaper to create a network of systems, each containing a part of the database, and this database can also be easily
expanded or reduced.
 Even if some of the data nodes go offline, the rest of the database can continue its normal functions.
Disadvantages of DDBMS
 The distributed database is quite complex and it is difficult to make sure that a user gets a uniform view of the
database because it is spread across multiple locations.
 This database is more expensive as it is complex and hence, difficult to maintain.
 It is difficult to provide security in a distributed database, as the database needs to be secured at every location where it
is stored. Moreover, the infrastructure connecting all the nodes in a distributed database also needs to be secured.
 It is difficult to maintain data integrity in the distributed database because of its nature. There can also be data
redundancy in the database as it is stored at multiple locations.
 The distributed database is complicated and it is difficult to find people with the necessary experience who can
manage and maintain it.
Q # 2: Explain concurrency control mechanism in DDBMS?
Answer:

What is Concurrency Control?

Concurrency control is the procedure in a DBMS for managing simultaneous operations without them conflicting with one another.
Concurrent access is quite easy if all users are just reading data, since there is no way they can interfere with one another. Any practical
database, however, has a mix of READ and WRITE operations, and hence concurrency is a challenge.

Concurrency control is used to address such conflicts which mostly occur with a multi-user system. It helps you to make sure that
database transactions are performed concurrently without violating the data integrity of respective databases.

Therefore, concurrency control is an essential element for the proper functioning of a system in which two or more database
transactions that require access to the same data are executed simultaneously.

Potential problems of Concurrency

Here are some issues that concurrency control must address (a small sketch of the lost update problem follows this list):

 Lost Updates occur when multiple transactions select the same row and update the row based on the value selected

 Uncommitted dependency issues occur when the second transaction selects a row which is updated by another transaction (dirty
read)

 Non-Repeatable Read occurs when a second transaction is trying to access the same row several times and reads different data
each time.

 Incorrect Summary issues occur when one transaction computes a summary over the values of all the instances of a repeated
data item while a second transaction updates a few instances of that data item. In that situation, the resulting summary does
not reflect a correct result.
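A tiny Python sketch of the lost update problem (an interleaving we constructed for illustration, not code from any DBMS): two transactions read the same value of a data item X, and the later write overwrites the earlier one because no locking is used.

# Simulated interleaving of two transactions T1 and T2 on the same data item X.
x = 100
t1_read = x          # T1 reads X = 100
t2_read = x          # T2 reads X = 100 before T1 has written back
x = t1_read + 50     # T1 writes X = 150
x = t2_read + 25     # T2 writes X = 125, overwriting T1's update
print(x)             # 125: T1's deposit of 50 is lost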

Why use the Concurrency Control method?

Reasons for using the concurrency control method in a DBMS:

 To apply Isolation through mutual exclusion between conflicting transactions

 To resolve read-write and write-write conflict issues

 To preserve database consistency by restricting how conflicting operations may interleave

 The system needs to control the interaction among the concurrent transactions. This control is achieved using concurrent-control
schemes.

 Concurrency control helps to ensure serializability

Example

Assume that two people go to electronic kiosks at the same time to buy a movie ticket for the same movie and the same show time.

However, there is only one seat left for that show in that particular theatre. Without concurrency control, it is possible that both
moviegoers end up purchasing a ticket; the concurrency control method does not allow this to happen. Both moviegoers can still read
the information in the movie seating database, but concurrency control provides the ticket only to the buyer who completes the
transaction process first.

Concurrency Control Protocols

Different concurrency control protocols offer different trade-offs between the amount of concurrency they allow and the amount of
overhead that they impose.

 Lock-Based Protocols

 Two-Phase Locking Protocols

 Timestamp-Based Protocols

 Validation-Based Protocols
Lock-based Protocols
A lock is a data variable associated with a data item. The lock signifies which operations can be performed on the
data item. Locks help synchronize access to the database items by concurrent transactions.
All lock requests are made to the concurrency-control manager. Transactions proceed only once the lock request is granted.
Binary Locks: A binary lock on a data item can be in either a locked or an unlocked state.
Shared/Exclusive: This type of locking mechanism separates locks based on their use. A lock acquired on a data
item to perform a write operation is an exclusive lock, while a lock acquired only for reading is a shared lock.
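As a rough Python sketch of the shared/exclusive idea (a simplified illustration, not a production lock manager; all names are invented): many readers may hold a shared lock at once, but a writer needs an exclusive lock and is refused while any other lock is held.

class LockManager:
    """Tracks shared (read) and exclusive (write) locks on data items."""
    def __init__(self):
        self.sharers = {}      # item -> set of transactions holding a shared lock
        self.writer = {}       # item -> transaction holding the exclusive lock

    def lock_shared(self, txn, item):
        if self.writer.get(item):                 # a writer blocks all readers
            return False
        self.sharers.setdefault(item, set()).add(txn)
        return True

    def lock_exclusive(self, txn, item):
        if self.writer.get(item) or self.sharers.get(item):
            return False                          # any existing lock blocks a writer
        self.writer[item] = txn
        return True

lm = LockManager()
print(lm.lock_shared("T1", "X"))     # True: readers are compatible with each other
print(lm.lock_shared("T2", "X"))     # True
print(lm.lock_exclusive("T3", "X"))  # False: the writer must wait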
Two Phase Locking (2PL) Protocol
The two-phase locking protocol is also known as the 2PL protocol. In this type of locking protocol, a transaction must not acquire
any new lock after it has released one of its locks.
This locking protocol divides the execution of a transaction into three different parts.
 In the first phase, when the transaction begins to execute, it seeks permission for the locks it needs.
 The second part is where the transaction obtains all the locks. When a transaction releases its first lock, the third
phase starts.
 In this third phase, the transaction cannot demand any new locks. Instead, it only releases the acquired locks.

The Two-Phase Locking protocol allows each transaction to make a lock or unlock request in two steps:
 Growing Phase: In this phase transaction may obtain locks but may not release any locks.
 Shrinking Phase: In this phase, a transaction may release locks but may not obtain any new locks.
It is true that the 2PL protocol offers serializability. However, it does not ensure that deadlocks do not happen.
In the diagram above, you can see that local and global deadlock detectors search for deadlocks and resolve them by
restoring the affected transactions to their initial states.
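A minimal sketch of a transaction obeying the two-phase rule (our own illustration; the class and method names are invented): all lock requests happen in the growing phase, and once the first lock is released the transaction may only keep releasing.

class TwoPhaseTransaction:
    """Enforces: no new lock may be requested after the first release."""
    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: cannot acquire after releasing")
        self.held.add(item)          # growing phase

    def unlock(self, item):
        self.shrinking = True        # the first release starts the shrinking phase
        self.held.discard(item)

t = TwoPhaseTransaction("T1")
t.lock("X"); t.lock("Y")     # growing phase
t.unlock("X")                # shrinking phase begins
# t.lock("Z")                # would raise: forbidden under 2PL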

Timestamp-based Protocols
The timestamp-based algorithm uses a timestamp to serialize the execution of concurrent transactions. This protocol ensures
that every pair of conflicting read and write operations is executed in timestamp order. The protocol uses the system time or a
logical counter as the timestamp.
The older transaction is always given priority in this method, and the system time is used to determine a transaction's
timestamp. This is one of the most commonly used concurrency protocols.
Lock-based protocols manage the order between conflicting transactions when they execute, whereas timestamp-based
protocols resolve each conflict as soon as the operation is issued.
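A minimal sketch of basic timestamp ordering (a simplified illustration of the technique; real systems track more state): each data item remembers the largest read and write timestamps seen so far, and an operation that arrives "too late" is rejected, which in practice means the requesting transaction is rolled back and restarted with a new timestamp.

read_ts = {}    # item -> largest timestamp of a transaction that read it
write_ts = {}   # item -> largest timestamp of a transaction that wrote it

def read(txn_ts, item):
    if txn_ts < write_ts.get(item, 0):
        return "reject"                       # a younger transaction already wrote the item
    read_ts[item] = max(read_ts.get(item, 0), txn_ts)
    return "ok"

def write(txn_ts, item):
    if txn_ts < read_ts.get(item, 0) or txn_ts < write_ts.get(item, 0):
        return "reject"                       # a younger transaction already read or wrote it
    write_ts[item] = txn_ts
    return "ok"

print(write(5, "X"))   # ok
print(read(3, "X"))    # reject: transaction 3 is older than the last writer (5)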

Q # 3: Why is a distributed database system said to be scalable?

Case Study

By taking any real time situation, explain how a participating node performs its recovery when it fails during the processing of a
transaction?

Answer:
Scalability:
A scalable system is any system that is flexible with respect to its number of components.
For an efficiently designed distributed system, adding and removing nodes should be an easy task, and the system architecture
must be capable of accommodating such changes. A system is called scalable when adding or removing components does not
make the user sense a difference; the entire system still feels like one coherent, logical system.
A distributed system (a system comprising many servers and often many networks) is called scalable when it keeps giving
correct responses to requests regardless of the incoming traffic, that is, as the computation grows (perhaps 1 user now, 1M
users in the next hour, 1K users in the hour after that, and so on) it does not fail.
Scalability can be achieved in several ways, for example horizontally or vertically.
