
CS/B.TECH (CSE)/SEM-7/CS-704 A/06

ENGINEERING & MANAGEMENT EXAMINATIONS, DECEMBER 2006

DISTRIBUTED DATABASE
(SEMESTER - 7)

Time: 3 Hours                                Full Marks: 70


GROUP-A
(Multiple Choice Questions)

Choose the correct alternatives for the following: 10 x 1 = 10

i) Which of the following is a component of a distributed database system?
a) Server                 b) Client
c) Network                d) All of these.        Ans. (d)

ii) Which of the following computing models is used by a distributed database system?
a) Mainframe computing model
b) Disconnected, personal computing model
c) Client/Server computing model
d) None of these.        Ans. (c)

iii) Which of the following refers to the operation of copying and maintaining database objects in multiple databases belonging to a distributed system?
a) Backup                 b) Recovery
c) Replication            d) None of these.        Ans. (c)

iv) Which of the following refers to the fact that a command used to perform a task is independent of the location of the data and of the system where the command is issued?
a) Naming transparency
b) Location transparency
c) Fragmentation transparency
d) All of these.        Ans. (b)

v) Which of the following is the probability that the system is up and running at a certain point in time?
a) Reliability            b) Availability
c) Maintainability        d) None of these.        Ans. (b)

vi) Which of the following is not a benefit of site autonomy?
a) A global catalog is not necessary to access global data
b)
c) Administrators can recover from isolated system failures independently
d) There is no need for backup and recovery.        Ans. (a)

vii) Which of the following is the recovery management technique used across the entire distributed system?
a) Deferred update        b) Immediate update
c) Two-phase commit       d) None of these.        Ans. (c)

viii) Which of the following is a function of a DDBMS?
a) Distributed query processing    b) Replicated data management
c) Distributed data recovery       d) All of these.        Ans. (d)

ix) What is data about data called?
a) Metadata               b) Data catalog
c) Information            d) None of these.        Ans. (a)

x) Which of the following strategies is designed to ensure that either all the databases are updated or none of them are?
a) Two-phase commit       b) Two-phase locking
c) Two-phase update       d) None of these.        Ans. (a)
GROUP-B
(Short Answer Questions)

2. Compare and contrast Replication and Fragmentation.

Ans. Both replication and fragmentation are used in a distributed database to increase parallelism.

Replication means that several copies of a relation are stored consistently at different sites. This increases availability, and the same query can be sent to different sites.

In fragmentation, a table is partitioned into separate parts that are stored at different sites, so each site holds only a subset of the relation. Fragmentation can be applied at different levels (horizontal, vertical or hybrid), and it can improve response time because queries operate on smaller fragments, often in parallel.

Since replication keeps several copies at several sites it takes more storage space, whereas fragmentation takes less storage space.
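As a hedged illustration (not part of the original answer), the Python sketch below models a small EMP relation and shows the storage difference between full replication and horizontal fragmentation across three hypothetical sites; the relation, site names and fragmentation predicate are invented:

```python
# Sketch: replication vs. horizontal fragmentation of a relation.
# The relation, site names and the fragmentation predicate are
# hypothetical examples, not taken from the question paper.

EMP = [
    {"eno": 1, "ename": "A. Roy", "branch": "Kolkata"},
    {"eno": 2, "ename": "J. Das", "branch": "Delhi"},
    {"eno": 3, "ename": "S. Sen", "branch": "Mumbai"},
]
SITES = ["Kolkata", "Delhi", "Mumbai"]

# Replication: every site stores the full relation (3 x 3 = 9 tuples).
replicated = {site: list(EMP) for site in SITES}

# Horizontal fragmentation: each site stores only its own tuples (3 tuples in total).
fragmented = {site: [t for t in EMP if t["branch"] == site] for site in SITES}

def total(alloc):
    return sum(len(rows) for rows in alloc.values())

print("tuples stored, replication:  ", total(replicated))   # 9
print("tuples stored, fragmentation:", total(fragmented))   # 3
```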

3. Why is global query optimization difficult in a distributed DBMS?

Ans. Query processing and optimization are much more difficult in a distributed environment than in a centralized environment because:
• A large number of parameters affect the performance of distributed queries.
• Relations involved in a distributed query may be fragmented and/or replicated.
• With many sites to access, query response time may become very high.
It is quite evident that the performance of a distributed DBMS depends critically on the ability of the query optimization algorithm to derive efficient query processing strategies. DDBS query optimization algorithms attempt to reduce the quantity of data transferred. Minimizing the quantity of data transferred is a desirable optimization criterion, since moving more data across telecommunication networks requires more time and resources. Distributed query optimization also faces several problems relating to the cost model, the large set of candidate queries, the trade-off between optimization cost and execution cost, and the optimization/reoptimization interval.
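A minimal sketch of the idea that the optimizer picks the strategy that ships the least data over the network; the relation sizes and the per-byte transfer cost below are invented for illustration:

```python
# Sketch: choosing between two execution strategies for a distributed join
# by the amount of data each one ships over the network. All sizes and
# costs are hypothetical illustration values.

EMP_BYTES = 400_000    # size of EMP, stored at site 1
ASG_BYTES = 3_500_000  # size of ASG, stored at site 2
COST_PER_BYTE = 1e-6   # assumed network transfer cost

strategies = {
    "ship EMP to site 2 and join there": EMP_BYTES * COST_PER_BYTE,
    "ship ASG to site 1 and join there": ASG_BYTES * COST_PER_BYTE,
}

best = min(strategies, key=strategies.get)
print(f"cheapest strategy: {best} (cost {strategies[best]:.2f})")
```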

6. What are the benefits of a Client-Server model?

Ans. The benefits of this model are the following:
• It is possible to repair, replace, upgrade or even relocate a server while its clients remain both unaware and unaffected.
• Since data storage is centralized, updates to the data are far easier for the administrator to apply.
• Many mature client-server technologies are already available that were designed to ensure security, user-interface friendliness and ease of use.
• It functions with multiple different clients with different capabilities.
7.
a) Simplify the following query using idempotency rules:

SELECT ENO
FROM ASG
WHERE (NOT (TITLE = "Programmer")
AND (TITLE = "Programmer")
OR (TITLE = "Elect.Engg")
AND NOT (TITLE = "Elect.Engg"))
OR ENAME = "J.Das"

Ans.
Let P1 be <TITLE = "Programmer">,
P2 be <TITLE = "Elect.Engg">,
P3 be <ENAME = "J.Das">.
Then the query predicate becomes (¬P1 ∧ (P1 ∨ P2) ∧ ¬P2) ∨ P3.
Its disjunctive normal form is (¬P1 ∧ ((P1 ∧ ¬P2) ∨ (P2 ∧ ¬P2))) ∨ P3,
obtained by applying the distributivity rule P1 ∧ (P2 ∨ P3) <=> (P1 ∧ P2) ∨ (P1 ∧ P3),
followed by the associativity rule P1 ∧ (P2 ∧ P3) <=> (P1 ∧ P2) ∧ P3.
The query then reduces as follows:
(¬P1 ∧ P1 ∧ ¬P2) ∨ (¬P1 ∧ P2 ∧ ¬P2) ∨ P3
=> (false ∧ ¬P2) ∨ (¬P1 ∧ false) ∨ P3
=> (false ∨ false) ∨ P3
=> P3
Therefore the simplified query is P3, i.e. ENAME = "J.Das".
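As a quick sanity check of the simplification (an addition to the original answer), the Python snippet below enumerates all truth assignments and confirms that the original predicate is equivalent to P3 alone:

```python
from itertools import product

# Verify (¬P1 ∧ (P1 ∨ P2) ∧ ¬P2) ∨ P3  <=>  P3 over all truth assignments.
def original(p1, p2, p3):
    return ((not p1) and (p1 or p2) and (not p2)) or p3

assert all(original(p1, p2, p3) == p3
           for p1, p2, p3 in product([False, True], repeat=3))
print("original predicate is equivalent to P3")
```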

b) Describe the two-phase commit protocol. What are the demerits of this protocol?
Ans.
We list below all the test cases for the two-phase commit protocol (and the one-phase commit protocol). To execute and test them, the participating site program is run once for the participants, and the coordinator site is then run nine times, once per test case. A simulation function in the participating site demonstrates the different test cases automatically: it changes the state of the transaction to Precommit or Abort and sets the flag for deferred constraint checks to pass or fail. Based on the requirements of the different test cases, different values are set for these variables, and the protocol then shows the corresponding behaviour. Correctness can be verified from the logs and the debug messages left for demonstration.

1. The transaction is successfully executed. It reaches the precommit state and votes YES. No delays take place, so no site fails. The output is a PREPARE T and a COMMIT record in the coordinator site's log file, and READY and COMMIT messages in the logs of the participating sites.

2. The transaction is successfully executed. It reaches the precommit state and votes YES. During precommit a delay takes place at the participating sites, so a participating site fails. The output is a PREPARE T and an ABORT record in the coordinator site's log file, and READY and ABORT messages in the logs of the participating sites.

3. The transaction is successfully executed. It reaches the precommit state and votes YES. During commit delays take place, so a participating site fails. The output is a PREPARE T and a COMMIT record in the coordinator site's log file, and READY and COMMIT messages in the logs of the participating sites, except for the sites where the commit was delayed.

4. The transaction is successfully executed. It reaches the precommit state, the deferred constraints fail and it votes NO. No delays take place, so no site fails. The output is a PREPARE T record in the coordinator site's log file, and READY and ABORT messages in the logs of the participating sites.

5. The transaction is successfully executed and reaches the precommit state. The deferred constraints fail and the site votes NO. During precommit a delay takes place at the participating sites, so a participating site fails. The output is a PREPARE T record in the coordinator site's log file, and READY and ABORT messages in the logs of the participating sites.

6. The transaction is successfully executed and reaches the precommit state. The deferred constraints fail and the site votes NO. During abort delays take place, so a participating site fails. The output is a PREPARE T record in the coordinator site's log file, and READY and ABORT messages in the logs of the participating sites, except for the sites where the abort was delayed.

7. The transaction fails and goes to the abort state; the sites vote NO. No delays take place, so no site fails. The output is a PREPARE T record in the coordinator site's log file, and READY and ABORT messages in the logs of the participating sites.

8. The transaction fails and goes to the abort state; the sites vote NO. During precommit a delay takes place at the participating sites, so a participating site fails. The output is a PREPARE T record in the coordinator site's log file, and READY and ABORT messages in the logs of the participating sites.

9. The transaction fails, goes to the abort state and votes NO. During abort delays take place, so a participating site fails. The output is a PREPARE T record in the coordinator site's log file, and READY and ABORT messages in the logs of the participating sites.

Two-phase commit is the most widely used atomic commit protocol and proceeds in two phases:
• Voting, to ensure that all sites are ready to commit;
• Decision, to ensure a uniform abort or commit at all sites.
2PC is used in virtually all distributed systems.

1. Advantages
• It ensures atomicity even in the presence of deferred constraints.
• It ensures independent recovery of all sites.
• Since it takes place in two phases, it can handle network failures and disconnections and still assure atomicity in their presence.

2. Disadvantages
• It involves a great deal of message complexity.
• It has greater communication overheads compared to simple optimistic protocols.
• Site nodes block in case of failure of the coordinator.
• Multiple forced writes of the log increase latency.
• Its performance is a trade-off, especially for short-lived transactions such as Internet applications.
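As an illustrative sketch only (not the simulation code described above), the Python fragment below captures the coordinator's side of the two phases; site objects, vote values and log records are hypothetical:

```python
# Sketch of a 2PC coordinator. Messaging and timeouts are stubbed out;
# the participant objects, vote values and log format are hypothetical.

def two_phase_commit(participants, log):
    # Phase 1 (voting): ask every participant whether it can commit.
    log.append("PREPARE T")
    votes = [p.prepare() for p in participants]   # each returns "YES" or "NO"

    # Phase 2 (decision): commit only if every site voted YES.
    if all(v == "YES" for v in votes):
        log.append("COMMIT T")
        for p in participants:
            p.commit()
        return "committed"
    log.append("ABORT T")
    for p in participants:
        p.abort()
    return "aborted"

class Participant:
    """Toy participant that always votes YES and just records its actions."""
    def __init__(self):
        self.log = []
    def prepare(self):
        self.log.append("READY")
        return "YES"
    def commit(self):
        self.log.append("COMMIT")
    def abort(self):
        self.log.append("ABORT")

sites = [Participant(), Participant()]
coordinator_log = []
print(two_phase_commit(sites, coordinator_log))   # committed
```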
8.
a) Describe how deadlock is detected in a distributed system.
b) What is a false deadlock?
c) What is the difference between reliability and availability?

Ans.

a)
An algorithm for detecting deadlocks in a distributed system was proposed by Chandy, Misra and Haas in 1983. It allows processes to request multiple resources at once (this speeds up the growing phase). Some processes may wait for resources, either local or remote. Cross-machine arcs make looking for cycles (detecting deadlock) hard.
The algorithm works this way: when a process has to wait for a resource, a probe message is sent to the process holding the resource. The probe message contains three components: the process that blocked, the process that is sending the message, and the destination. Initially, the first two components are the same. When a process receives the probe, if it is itself waiting on a resource, it updates the sender and destination fields of the message and forwards it to the resource holder; if it is waiting on multiple resources, a message is sent to each process holding one of the resources. This continues as long as processes are waiting for resources. If the originator gets a message and sees its own process number in the blocked field, it knows that a cycle has been traversed and a deadlock exists. In this case some process (transaction) will have to die; the sender may choose to commit suicide, or a ring election algorithm may be used to determine an alternate victim.
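A minimal sketch of the probe (edge-chasing) mechanism just described; the process identifiers and the wait-for relationships are invented for illustration:

```python
# Sketch of Chandy-Misra-Haas edge-chasing deadlock detection.
# waits_for maps each process id to the set of processes it is blocked on;
# the process ids and the wait-for graph are hypothetical.

waits_for = {1: {2}, 2: {3}, 3: {1}, 4: set()}   # 1 -> 2 -> 3 -> 1 is a cycle

def detects_deadlock(initiator):
    """Return True if a probe started by `initiator` comes back to it."""
    # A probe is (blocked_process, sender, destination); initially
    # blocked_process == sender == initiator.
    queue = [(initiator, initiator, holder) for holder in waits_for[initiator]]
    seen = set()
    while queue:
        blocked, sender, dest = queue.pop()
        if dest == initiator:
            return True                 # probe returned: cycle found
        if (sender, dest) in seen:
            continue                    # do not forward the same arc twice
        seen.add((sender, dest))
        # The receiver forwards the probe to every process it waits on.
        for holder in waits_for[dest]:
            queue.append((blocked, dest, holder))
    return False

print(detects_deadlock(1))   # True: processes 1, 2 and 3 are deadlocked
print(detects_deadlock(4))   # False: process 4 is not waiting on anyone
```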

b)

False deadlocks: one possible way to prevent false deadlock is to use Lamport's algorithm to provide a global ordering of events in the distributed system. When the coordinator gets a message that leads it to suspect a deadlock, it sends everybody a message saying, "I just received a message with a timestamp T which leads to deadlock. If anyone has a message for me with an earlier timestamp, please send it immediately." When every machine has replied, positively or negatively, the coordinator can see whether the deadlock has really occurred or not.

c)

Difference between Availability and Reliability

Availability is defined as the probability that the system is operating properly when it is requested for use. In other words, availability is the probability that a system is not failed or undergoing a repair action when it needs to be used. At first glance it might seem that if a system has high availability then it should also have high reliability; however, this is not necessarily the case, as the relationship between the two notions shows.

Availability and Reliability


Reliability represents the probability of components, parts and systems performing their required functions for a desired period of time, without failure, in specified environments and with a desired confidence. Reliability in itself does not account for any repair actions that may take place: it accounts for the time it will take the component, part or system to fail while it is operating, but does not reflect how long it will take to get the failed unit back into working condition.

As stated earlier, availability represents the probability that the system is capable of conducting its required function when it is called upon, given that it is not failed or undergoing a repair action. Therefore availability is a function not only of reliability but also of maintainability, where an increase in maintainability implies a decrease in the time it takes to perform maintenance actions.
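To make the relationship concrete, the standard steady-state availability formula (an addition, not part of the original answer) expresses availability through a reliability measure (MTBF, mean time between failures) and a maintainability measure (MTTR, mean time to repair); the numbers below are illustrative:

```python
# Steady-state availability from mean time between failures (reliability)
# and mean time to repair (maintainability). The numbers are illustrative.
MTBF = 1000.0   # hours of operation between failures
MTTR = 10.0     # hours needed to repair after a failure

availability = MTBF / (MTBF + MTTR)
print(f"availability = {availability:.3f}")   # 0.990: up ~99% of the time
```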
9. Short notes:
a. Primary copy locking.
b. Fragmentation transparency.
c. Network partitioning.
d. Pessimistic approach of concurrency control algorithm.
e. Checkpoints.

Ans:-
a. Primary copy locking:
Primary copy locking is one type of pessimistic concurrency control algorithm. In primary copy locking, one of the copies (if there are multiple copies) of each lock unit is designated as the primary copy, and it is this copy that has to be locked for the purpose of accessing that particular unit.
For example, if lock unit x is replicated at sites 1, 2 and 3, one of these sites (say, 1) is selected as the primary site for x: all transactions desiring access to x obtain their lock at site 1 before they can access a copy of x. If the database is not replicated (i.e., there is only one copy of each lock unit), primary copy locking distributes the lock management responsibility among a number of sites.
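A small illustrative sketch of the routing rule above; the lock units, site ids and the primary-copy assignment are hypothetical:

```python
# Sketch: every lock request for a unit is routed to that unit's primary site.
# Lock units, site ids and the primary-copy map are hypothetical.

primary_site = {"x": 1, "y": 2}        # primary copy of x at site 1, y at site 2
lock_tables = {1: {}, 2: {}, 3: {}}    # per-site lock tables

def acquire_lock(unit, transaction):
    site = primary_site[unit]          # only the primary copy is ever locked
    table = lock_tables[site]
    if unit in table:
        return False                   # already held: the request must wait
    table[unit] = transaction
    return True

print(acquire_lock("x", "T1"))   # True:  T1 holds x's primary copy at site 1
print(acquire_lock("x", "T2"))   # False: T2 must wait for T1
```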

b. Fragmentation transparency:
The final form of transparency that needs to be addressed within the context of a distributed database system is fragmentation transparency, which is the highest degree of transparency. It consists of the fact that the user or the application programmer works on global relations.
It is commonly desirable to divide each database relation into smaller fragments and treat each fragment as a separate database object (i.e., another relation). This is commonly done for reasons of performance, availability and reliability. Furthermore, fragmentation can reduce the negative effects of replication: each replica is not the full relation but only a subset of it, so less space is required and fewer data items need to be managed.
c. Network partitioning:
Network partitions are due to communication line failures and may cause the loss of messages, depending on the implementation of the communication subnet. A partitioning is called a simple partitioning if the network is divided into only two components; otherwise it is called a multiple partitioning.
The termination protocols for network partitioning address the termination of the transactions that were active in each partition at the time of partitioning. If one could develop non-blocking protocols to terminate these transactions, it would be possible for the sites in each partition to reach a termination decision (for a given transaction) consistent with the sites in the other partitions, which would imply that the sites in each partition could continue executing transactions despite the partitioning. Unfortunately, it is not in general possible to find non-blocking termination protocols in the presence of network partitions. This means that if network partitioning occurs, we cannot continue normal operations in all partitions, which limits the availability of the entire distributed database system. It is, however, possible to design non-blocking atomic commit protocols that are resilient to simple partitions; if multiple partitions occur, it is not possible to design such protocols.

d. Pessimistic approach of concurrency control algorithms:

The pessimistic approach to concurrency control synchronizes the concurrent execution of transactions early in their execution life cycle.
Pessimistic concurrency control algorithms are classified as follows:
• Locking: Centralized, Primary copy, Distributed
• Timestamp ordering: Basic, Multiversion, Conservative
• Hybrid

1. Locking:
In the locking-based approach, the synchronization of transactions is achieved by employing physical or logical locks on some portion of the database.
i. Centralized: One of the sites in the network is designated as the primary site, where the lock table for the entire database is stored, and it is charged with the responsibility of granting locks to transactions.
ii. Primary copy: One of the copies of each lock unit is designated as the primary copy, and it is this copy that has to be locked for the purpose of accessing that particular unit.
iii. Distributed: The lock management duty is shared by all the sites of the network. The execution of a transaction involves the participation and coordination of schedulers at more than one site. Each lock scheduler is responsible for the lock units local to its site.

2. Timestamp ordering:
i. Basic: The coordinating transaction manager assigns a timestamp to each transaction, determines the sites where each data item is stored, and sends the relevant operations to those sites (see the sketch after this list).
ii. Multiversion: Updates do not modify the database; each write operation creates a new version of the data item, marked with the timestamp of the transaction that creates it.
iii. Conservative: The operations of each transaction are buffered until an ordering can be established so that rejections are not possible, and they are executed in that order.

3. Hybrid:
In some locking-based algorithms, timestamps are also used, primarily to improve efficiency and the level of concurrency. Such algorithms are called hybrid.
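As a hedged illustration of the basic timestamp-ordering rule (the standard textbook accept/reject test over read and write timestamps; the item and transaction timestamps below are invented):

```python
# Sketch of the basic timestamp-ordering rule for a single data item x.
# rts/wts are the largest read and write timestamps seen so far; the
# transaction timestamps used in the demo are hypothetical.

rts, wts = 0, 0   # read / write timestamps of data item x

def read(ts):
    global rts
    if ts < wts:
        return "reject"          # x was already written by a younger transaction
    rts = max(rts, ts)
    return "accept"

def write(ts):
    global wts
    if ts < rts or ts < wts:
        return "reject"          # a younger transaction already read or wrote x
    wts = ts
    return "accept"

print(write(5))   # accept: wts becomes 5
print(read(3))    # reject: transaction 3 is older than the last writer
print(read(7))    # accept: rts becomes 7
print(write(6))   # reject: transaction 7 has already read x
```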

e. Checkpoints:
In most local reliability protocols, the execution of the recovery action requires searching the entire log. This is a significant overhead, because the LRM is trying to find all the transactions that need to be undone and redone. The overhead can be reduced if it is possible to build a wall which signifies that the database at that point is up-to-date and consistent. In that case the redo only has to start from that point on, and the undo only has to go back to that point. This process of building the wall is called checkpointing.
Checkpointing is achieved in three steps:
1. Write a begin_checkpoint record into the log.
2. Collect the checkpoint data into stable storage.
3. Write an end_checkpoint record into the log.
The 1st and 3rd steps enforce the atomicity of the checkpointing operation: if a system failure occurs during checkpointing, the recovery process will not find an end_checkpoint record and will consider the checkpoint incomplete.
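A minimal sketch of the three steps and of the recovery-time completeness check, assuming a simple append-only list as the log (the log format and stable-storage stub are hypothetical):

```python
# Sketch of the three-step checkpointing protocol over an append-only log.
# The log format and the stable-storage stub are hypothetical.

log = []

def take_checkpoint(dirty_pages, stable_storage):
    log.append("begin_checkpoint")        # step 1
    stable_storage.extend(dirty_pages)    # step 2: flush checkpoint data
    log.append("end_checkpoint")          # step 3

def checkpoint_completed():
    """At recovery: the checkpoint counts only if its end record was written."""
    return "begin_checkpoint" in log and "end_checkpoint" in log

stable = []
take_checkpoint(["page 7", "page 12"], stable)
print(checkpoint_completed())   # True: both records are in the log
```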

10. Write short notes on any three of the following:


a) ODBC connectivity
b) Loosely and Tightly coupled system
c) Distributed data dictionary management
d) Three phase commit protocol
Ans. a) ODBC connectivity:
ODBC stands for Open Database Connectivity. ODBC is an open standard API for communicating with databases, and it is Microsoft's strategic interface for accessing data in a heterogeneous environment of relational and non-relational database management systems. Based on the Call Level Interface specification of the SQL Access Group, ODBC provides an open, vendor-neutral way of accessing data stored in a variety of proprietary personal computer, minicomputer and mainframe databases. ODBC alleviates the need for independent software vendors and corporate developers to learn multiple application programming interfaces, and it now provides a universal data access interface. With ODBC, application developers can allow an application to concurrently access, view and modify data from multiple, diverse databases. ODBC is a core component of the Microsoft Windows Open Services Architecture, and Apple endorsed ODBC as a key enabling technology by announcing future support in System 7. With growing industry support, ODBC emerged as an important industry standard for data access for both Windows and Macintosh applications.
The JDBC-ODBC bridge driver enables a Java application to use any database that supplies an ODBC driver. A Java application cannot interact directly with the ODBC driver, so to use the JDBC-ODBC bridge driver we need the ODBC driver installed on the client computer (for example, the ODBC driver for a Microsoft Access database). Note that we need to set up an ODBC Data Source Name (DSN) on the computer we use to connect to the database. Some ODBC native code, and in many cases native database client code, must be loaded on each machine that uses this type of driver; hence this type of driver is generally most appropriate when automatic installation and download of a Java technology application is not important.
To set up an ODBC Data Source Name in Windows XP, follow these steps:
1. Open the Data Source Administrator dialog box from the Control Panel.
2. Click the User DSN tab.
3. Click the Add button.
4. Select a driver name from the list of ODBC drivers.
5. Enter the needed information in the setup dialog box that appears.
6. Click OK to close the dialog box.
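For illustration, once a DSN exists it can be used from any ODBC-capable client. The snippet below uses the third-party Python pyodbc module (an assumption: pyodbc must be installed, and "MyAccessDSN" and the EMPLOYEE table are hypothetical):

```python
# Connecting through an ODBC Data Source Name with the pyodbc module.
# "MyAccessDSN" and the EMPLOYEE table are hypothetical examples.
import pyodbc

conn = pyodbc.connect("DSN=MyAccessDSN")   # resolves the DSN set up above
cursor = conn.cursor()
for row in cursor.execute("SELECT ENO, ENAME FROM EMPLOYEE"):
    print(row.ENO, row.ENAME)
conn.close()
```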

b) Loosely and tightly coupled systems:

A loosely-coupled system is based on multiple standalone single- or dual-processor machines interconnected via a high-speed communication system. Loosely-coupled systems are physically larger than tightly-coupled systems. Nodes in a loosely-coupled system are usually inexpensive commodity computers and can be recycled as independent machines upon retirement from the cluster. Loosely-coupled systems tend to be much less energy-efficient than tightly-coupled systems, and they require the component databases to construct their own federated schema.
A tightly-coupled architecture contains multiple CPUs that are connected at the bus level. These CPUs may have access to a central shared memory, or may participate in a memory hierarchy with both local and shared memory. Tightly-coupled systems perform better and are physically smaller than loosely-coupled systems, but have historically required greater initial investment and may depreciate rapidly. They tend to be much more energy-efficient than loosely-coupled systems, and they consist of component systems that use independent processes to construct and publicize an integrated federated schema.

c) Distributed data dictionary management:

This involves the distribution of data and work among more than one machine in the network. Distributed computing is broader than canonical client/server computing, in that many machines may be processing work on behalf of a single client. It is generally perceived as a natural extension of the client/server model; the only conceptual difference is that the host server (coordinator) orchestrates the distributed computation with its "peers" rather than doing everything on its own. Any host server can act as a coordinator, depending on where the request originated.

Operation:
1. The user requests data from the local host.
2. The request goes out over the network to a remote host.
3. The remote host processes the request and sends the data or the results back to the local host.
4. The local host hands the reply to the client, which is unaware that the request was executed by multiple servers.

Benefits:
1. Placement of data closer to its source.
2. Automatic movement of data to where it is most needed.
3. Placement of data closer to the users (through replication).
4. Higher data availability through data replication.
5. Higher fault tolerance through elimination of a single point of failure.
6. Potentially more efficient data access (higher throughput and greater potential for parallelism).
7. Better scalability with respect to the application and the users' needs.

d) Three-phase commit protocol:

In computer networking and databases, the three-phase commit protocol (3PC) is a distributed algorithm which lets all nodes in a distributed system agree to commit a transaction. Unlike the two-phase commit protocol (2PC), however, 3PC is non-blocking. Specifically, 3PC places an upper bound on the amount of time required before a transaction either commits or aborts. This property ensures that if a given transaction is attempting to commit via 3PC and holds some resource locks, it will release the locks after the timeout. The basic observation is that in 2PC, while one site is in the "prepared to commit" state, another may be in either the "commit" or the "abort" state; 3PC is designed to avoid such states and is thus resilient to such failures.

Protocol description:

In describing the protocol, we use terminology similar to that used in the two-phase commit protocol. Thus
we have a single coordinator site leading the transaction and a set of one or more cohorts being directed
by the coordinator.
Coordinator:
1. The coordinator receives a transaction request. If there is a failure at this point, the coordinator aborts the
transaction (i.e. upon recovery, it will consider the transaction aborted). Otherwise, the coordinator sends
a canCommit? message to the cohorts and moves to the waiting state.
2. If there is a failure, timeout, or if the coordinator receives a No message in the waiting state, the coordinator
aborts the transaction and sends an abort message to all cohorts. Otherwise the coordinator will
receive Yes messages from all cohorts within the time window, so it sends preCommit messages to all cohorts
and moves to the prepared state.
3. If the coordinator fails in the prepared state, it will move to the commit state. However if the coordinator
times out while waiting for an acknowledgement from a cohort, it will abort the transaction. In the case where
all acknowledgements are received, the coordinator moves to the commit state as well.

Cohort:
1. The cohort receives a canCommit? message from the coordinator. If the cohort agrees it sends
a Yes message to the coordinator and moves to the prepared state. Otherwise it sends
a No message and aborts. If there is a failure, it moves to the abort state.
2. In the prepared state, if the cohort receives an abort message from the coordinator, fails, or times
out waiting for a commit, it aborts. If the cohort receives a preCommit message, it sends
an ACK message back and commits.
Disadvantages:
The main disadvantage of this algorithm is that it cannot recover if the network is segmented in any manner. The original 3PC algorithm assumes a fail-stop model, where processes fail by crashing and crashes can be accurately detected, and it does not work with network partitions or asynchronous communication.
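As a hedged sketch of the state machines just described (message passing and timeouts are stubbed out; only the legal state transitions from the description above are encoded, and the table encoding itself is our own illustration):

```python
# Sketch: legal state transitions for the 3PC coordinator and cohort.
# Events and state names follow the description above; the transition-table
# encoding is a hypothetical illustration, not a full implementation.

COORDINATOR = {
    ("initial",  "request received"): "waiting",
    ("waiting",  "all voted Yes"):    "prepared",   # send preCommit
    ("waiting",  "No vote/timeout"):  "aborted",    # send abort
    ("prepared", "all ACKs"):         "committed",
    ("prepared", "ACK timeout"):      "aborted",
}

COHORT = {
    ("initial",  "canCommit? agreed"):  "prepared",  # send Yes
    ("initial",  "canCommit? refused"): "aborted",   # send No
    ("prepared", "preCommit received"): "committed", # send ACK
    ("prepared", "abort/timeout"):      "aborted",
}

def step(table, state, event):
    return table.get((state, event), state)   # unknown events leave the state alone

s = "initial"
for e in ["request received", "all voted Yes", "all ACKs"]:
    s = step(COORDINATOR, s, e)
print(s)   # committed
```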

11. Write the advantages of a Distributed DBMS over a Centralized DBMS. Give an example of a bank application accessing a database which is distributed over the branches of the bank, in which the relevant predicates for data distribution are not in the text of the application program. Discuss the different levels of distribution transparency.

Ans. Advantages of a Distributed DBMS:

1. Organizational and economic reasons: Many organizations are decentralized, and a distributed database approach fits the structure of the organization more naturally. The problems of a distributed organizational structure and of the corresponding information system are the subject of several books and papers. With recent developments in computer technology, the economy-of-scale motivation for having large, centralized computer centres is becoming questionable. The organizational and economic motivations are probably the most important reason for developing distributed databases.

2. Interconnection of existing databases: Distributed databases are the natural solution when several databases already exist in an organization and the necessity of performing global applications arises. In this case the distributed database is created bottom-up from the preexisting local databases. This process may require a certain degree of local restructuring; however, the effort required by this restructuring is much less than that needed for the creation of a completely new centralized database.

3. Incremental growth: If an organization grows by adding new, relatively autonomous organizational units (new branches, new warehouses, etc.), then the distributed database approach supports smooth incremental growth with a minimum degree of impact on the already existing units. With a centralized approach, either the initial dimensions of the system must take care of future expansion, which is difficult to foresee and expensive to implement, or the growth has a major impact not only on the new applications but also on the existing ones.

4. Reduced communication overhead: In a geographically distributed database, the fact that many applications are local clearly reduces the communication overhead with respect to a centralized database. Therefore, maximizing the locality of applications is one of the primary objectives of distributed database design.

5. Performance considerations: The existence of several autonomous processors results in increased performance through a high degree of parallelism. This consideration applies to any multiprocessor system, not only to distributed databases. However, distributed databases have the advantage that the decomposition of data reflects application-dependent criteria which maximize application locality; in this way the mutual interference between different processors is minimized. The load is shared between the different processors, and critical bottlenecks, such as the communication network itself or common services of the whole system, are avoided. This effect is a consequence of the autonomous processing capability required for local applications in the definition of distributed databases.
6. Reliability and availability: The distributed database approach, especially with redundant data, can also be used to obtain higher reliability and availability. However, obtaining this goal is not straightforward and requires the use of techniques which are still not completely understood. The autonomous processing capability of the different sites does not by itself guarantee a higher overall reliability of the system, but it ensures a graceful degradation property: failures in a distributed database can be more frequent than in a centralized one because of the greater number of components, but the effect of each failure is confined to those applications which use the data of the failed site, and a complete system crash is rare.

Bank example: The data of the different branches are distributed on three "backend" computers, which perform the database management functions. The application programs are executed by a different computer, which requests database access services from the backends when necessary. Because the fragmentation of account data by branch is handled by the backends, the predicates that govern the data distribution do not appear in the text of the application program.

Different levels of distribution transparency:

There are five levels of distribution transparency:
i. Fragmentation transparency: This is the highest degree of transparency and consists of the fact that the user or application programmer works on global relations.
ii. Location transparency: This is a lower degree of transparency and requires the user or application programmer to work on fragments; however, the locations of the fragments need not be known.
iii. Local mapping transparency: This allows us to study several problems of distributed database management without having to take into account the specific data models of the local DBMSs.
iv. Replication transparency: The user is unaware of the replication of fragments. It is implied by location transparency.
v. Naming transparency: A naming system provides naming transparency when the result of mapping a name to an object is independent of the host from which the name is uttered.
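For illustration, the hypothetical queries below (Python strings over an invented ACCOUNT relation, fragmented horizontally by branch into ACCOUNT_KOL and ACCOUNT_DEL) show how the text a programmer writes changes with the transparency level; the AT-SITE qualifier is pseudo-SQL used only for the example:

```python
# Hypothetical queries at different transparency levels for an ACCOUNT
# relation fragmented by branch; table and fragment names are invented.

# Fragmentation transparency: the program sees only the global relation.
q_fragmentation = "SELECT BALANCE FROM ACCOUNT WHERE ACCNO = 1234"

# Location transparency: the program names a fragment, but not its site;
# the distribution predicate (branch) no longer appears in the query text.
q_location = "SELECT BALANCE FROM ACCOUNT_KOL WHERE ACCNO = 1234"

# Local mapping transparency: the program names both the fragment and its
# site (pseudo-SQL site qualifier, for illustration only).
q_local_mapping = "SELECT BALANCE FROM ACCOUNT_KOL AT SITE Kolkata WHERE ACCNO = 1234"

for q in (q_fragmentation, q_location, q_local_mapping):
    print(q)
```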
