Unit-4 Ch-12
however, other factors such as performance and failure tolerance often require the
use of data replication techniques similar to those in distributed databases.
• The increased focus on mobile business intelligence. More and more companies are
adopting mobile technologies within their business plans. As companies use social
networks to get closer to customers, the need for on-the-spot decision making
increases. Although a data warehouse is not usually a distributed database, it does
rely on techniques such as data replication and distributed queries that facilitate data
extraction and integration.
At this point, the long-term impact of the Internet and the mobile revolution on
distributed database design and management is just starting to be felt. Perhaps the
success of the Internet and mobile technologies will foster the use of distributed
databases as bandwidth becomes a less troublesome bottleneck.
In any case, distributed database concepts and components are likely to find a place
in future database development, particularly for specialized mobile and location-
aware applications.
The distributed database is especially desirable because centralized database
management is subject to problems such as:
• Performance degradation because of a growing number of remote locations over
greater distances.
• High costs associated with maintaining and operating large central (mainframe)
database systems and physical infrastructure.
• Reliability problems created by dependence on a central site (single point of failure
syndrome) and the need for data replication.
• Scalability problems associated with the physical limits imposed by a single
location, such as physical space, temperature conditioning, and power consumption.
• Organizational rigidity imposed by the database, which means it might not support
the flexibility and agility required by modern global organizations.
Transaction Transparency
Transaction transparency is a DDBMS property that ensures database
transactions will maintain the distributed database’s integrity and consistency.
Remember that a DDBMS database transaction can update data stored in many
different computers connected in a network. Transaction transparency ensures
that the transaction will be completed only when all database sites involved in
the transaction complete their part of the transaction.
Distributed database systems require complex mechanisms to manage transactions
and ensure the database’s consistency and integrity.
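One widely used mechanism for this all-or-nothing behavior is a two-phase commit. The following minimal Python sketch illustrates the idea under simplifying assumptions: the Site class, its prepare/commit/rollback methods, and the non-negative balance rule are hypothetical stand-ins for real DP sites, and the sketch ignores site failures, logging, and timeouts.

```python
class Site:
    """Hypothetical DP site that stores a single account balance."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self._pending = None  # tentative new balance, not yet visible

    def prepare(self, delta):
        # Phase 1: vote "yes" only if the local update is valid.
        if self.balance + delta < 0:
            return False
        self._pending = self.balance + delta
        return True

    def commit(self):
        # Phase 2: make the tentative update permanent.
        if self._pending is not None:
            self.balance = self._pending
            self._pending = None

    def rollback(self):
        # Discard the tentative update.
        self._pending = None


def run_distributed_transaction(updates):
    """updates is a list of (site, delta); commit at every site or at none."""
    prepared = []
    for site, delta in updates:
        if site.prepare(delta):
            prepared.append(site)
        else:
            # One "no" vote aborts the work already prepared elsewhere.
            for p in prepared:
                p.rollback()
            return False
    for site in prepared:
        site.commit()
    return True


site_a = Site("A", 100)
site_b = Site("B", 50)
ok = run_distributed_transaction([(site_a, -30), (site_b, 30)])
failed = run_distributed_transaction([(site_a, 10), (site_b, -500)])
```

In this sketch the first transfer succeeds at both sites, while the second is rejected by site B, so site A's tentative update is discarded and neither balance changes.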
Distributed Requests and Distributed Transactions
Whether or not a transaction is distributed, it is formed by one or more database
requests. The basic difference between a nondistributed transaction and a
distributed transaction is that the latter can update or request data from several
different remote sites on a network.
A distributed transaction can reference several different local or remote DP sites.
Although each single request can reference only one local or remote DP site, the
transaction as a whole can reference multiple DP sites because each request can
reference a different site.
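As a rough illustration of a distributed transaction, the sketch below uses Python's sqlite3 module, with two in-memory databases standing in for two DP sites. That setup, the site_b name, and the PRODUCT and INVOICE tables with their columns are assumptions made for the example. Each SQL statement touches only one site, but the transaction as a whole spans both.

```python
import sqlite3

# "main" plays the local DP site; "site_b" plays a remote DP site (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS site_b")
conn.execute("CREATE TABLE main.product (prod_num TEXT PRIMARY KEY, prod_qoh INTEGER)")
conn.execute("CREATE TABLE site_b.invoice (inv_num INTEGER, prod_num TEXT, qty INTEGER)")
conn.execute("INSERT INTO main.product VALUES ('231785', 10)")
conn.commit()

# One transaction, two requests, each request referencing a single site.
with conn:  # commits on success, rolls back if a statement raises
    conn.execute(
        "UPDATE main.product SET prod_qoh = prod_qoh - 1 WHERE prod_num = '231785'"
    )
    conn.execute("INSERT INTO site_b.invoice VALUES (1, '231785', 1)")

qoh = conn.execute("SELECT prod_qoh FROM main.product").fetchone()[0]
invoice_count = conn.execute("SELECT COUNT(*) FROM site_b.invoice").fetchone()[0]
```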
A distributed request lets a single SQL statement reference data located at several
different local or remote DP sites. Because each request (SQL statement) can access
data from more than one local or remote DP site, a transaction can access several
sites. The ability to execute a distributed request provides fully distributed database
processing because you can:
• Partition a database table into several fragments.
• Reference one or more of those fragments with only one request. In other words,
there is fragmentation transparency.
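A distributed request over a fragmented table can be sketched in the same spirit. In the Python example below, two in-memory SQLite databases stand in for two DP sites, each holding one horizontal fragment of a hypothetical CUSTOMER table; the site names, fragment names, columns, and sample rows are all assumptions. A single SELECT statement references both fragments. A DDBMS that offers fragmentation transparency would perform this expansion itself, so the user would simply query CUSTOMER.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS site_b")
conn.execute("ATTACH DATABASE ':memory:' AS site_c")

# Horizontal fragments of CUSTOMER, split by state (assumed fragmentation rule).
columns = "(cus_num INTEGER, cus_name TEXT, cus_state TEXT, cus_balance REAL)"
conn.execute("CREATE TABLE site_b.customer_f1 " + columns)
conn.execute("CREATE TABLE site_c.customer_f2 " + columns)
conn.executemany(
    "INSERT INTO site_b.customer_f1 VALUES (?, ?, ?, ?)",
    [(10, "Sinex", "TN", 1200.0), (11, "Martin", "TN", 0.0)],
)
conn.executemany(
    "INSERT INTO site_c.customer_f2 VALUES (?, ?, ?, ?)",
    [(12, "Chen", "FL", 340.0), (13, "Mining", "FL", 100.0)],
)
conn.commit()

# One request (a single SQL statement) that reads data from both sites.
distributed_request = """
    SELECT cus_name FROM site_b.customer_f1 WHERE cus_balance > 250
    UNION ALL
    SELECT cus_name FROM site_c.customer_f2 WHERE cus_balance > 250
    ORDER BY cus_name
"""
names = [row[0] for row in conn.execute(distributed_request)]
```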
• CPU time cost associated with the processing overhead of managing distributed
transactions.
Although costs are often classified either as communication or processing costs, it is
difficult to separate the two. Not all query optimization algorithms use the same
parameters, and not all algorithms assign the same weight to each parameter. For
example, some algorithms minimize total time, others minimize the communication
time, and still others do not factor in the CPU time, considering its cost insignificant
relative to other costs.
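The effect of weighting cost parameters differently can be shown with a small Python sketch. The two candidate plans, their cost figures, the separate I/O component, and the weight names are invented for illustration only; they are not taken from any particular optimizer.

```python
def plan_cost(plan, w_comm=1.0, w_io=1.0, w_cpu=1.0):
    """Weighted sum of a plan's communication, I/O, and CPU time estimates."""
    return w_comm * plan["comm"] + w_io * plan["io"] + w_cpu * plan["cpu"]


def cheapest(plans, **weights):
    """Return the name of the plan with the lowest weighted cost."""
    return min(plans, key=lambda name: plan_cost(plans[name], **weights))


# Hypothetical time estimates for two ways of answering the same query.
plans = {
    "ship_whole_table": {"comm": 8.0, "io": 1.0, "cpu": 0.5},
    "filter_at_remote_site": {"comm": 2.0, "io": 2.0, "cpu": 6.0},
}

by_total_time = cheapest(plans)                       # all parameters weighted equally
by_comm_only = cheapest(plans, w_io=0.0, w_cpu=0.0)   # communication time only
ignoring_cpu = cheapest(plans, w_cpu=0.0)             # CPU cost treated as insignificant
```

With these made-up figures, minimizing total time favors shipping the whole table, while the two other weightings favor filtering at the remote site, so the choice of parameters and weights can change which access plan is selected.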
A centralized database evaluates every data request to find the most efficient way to
access the data. This is a reasonable requirement, considering that all data are locally
stored and all active transactions working on the data are known to the central DBMS.
In contrast, in a DDBMS, transactions are distributed among multiple nodes;
therefore, determining what data are being used becomes more complex. Hence,
resolving data requests in a distributed data environment must take the
following points into consideration:
• Data distribution. In a DDBMS, query translation is more complicated because the
DDBMS must decide which fragment to access. (Distribution transparency was
explained earlier in this chapter.) In this case, a TP executing a query must choose
what fragments to access, create multiple data requests to the chosen remote DPs,
combine the DP responses, and present the data to the application.
• Data replication. In addition, the data may also be replicated at several different sites.
Data replication makes the access problem even more complex because the
database must ensure that all copies of the data are consistent. Therefore, an
important characteristic of query optimization in distributed database systems is that
it must provide replica transparency. Replica transparency refers to the DDBMS’s
ability to hide multiple copies of data from the user. This ability is particularly