
Databases in A Distributed Environment

This document discusses databases in a distributed environment. It describes centralized databases where remote users access a central database. It then covers issues with data currency in a distributed system and how database locks can help achieve data currency. Distributed databases can be partitioned, with segments distributed to primary users, or replicated, with the entire database replicated at each site. The document discusses challenges like deadlocks that can occur in distributed systems and methods for resolving them like terminating transactions. It also addresses ensuring data concurrency across replicated databases.

Uploaded by

levi ackerman
Copyright
© All Rights Reserved

Databases in a Distributed Environment

Chapter 1 introduced the concept of distributed data processing (DDP) as an alternative to the centralized
approach. Most modern organizations use some form of distributed processing and networking to process their
transactions. Some companies process all of their transactions in this way. An important consideration in
planning a distributed system is the location of the organization’s database. In addressing this issue, the planner
has two basic options: databases can be centralized, or they can be distributed. Distributed databases fall into
two categories: partitioned and replicated databases. This section examines issues, features, and trade-offs
that should be carefully evaluated in deciding how databases should be distributed.

CENTRALIZED DATABASES
Under the centralized database approach, remote users send requests via terminals for data to the central site,
which processes the requests and transmits the data back to the user. The central site performs the functions of a
file manager that services the data needs of the remote users. The centralized database approach is illustrated in
Figure 9-24.
Earlier in the chapter, three primary advantages of the database approach were presented: the reduction of data
storage costs, the elimination of multiple update procedures, and the establishment of data currency (that
is, the firm’s data files reflect accurately the effects of its transactions). Achieving data currency is critical to
database integrity and reliability. However, in the DDP environment, this can be a challenging task.
Data Currency in a DDP Environment
During data processing, account balances pass through a state of temporary inconsistency, in which their values
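are incorrectly stated. This occurs during the execution of any accounting transaction. To illustrate, consider the
computer logic for recording the credit sale of $2,000 to customer Jones, which runs as four numbered instructions:
(1) read the customer’s AR-Sub account, (2) read the AR-Control account, (3) write the updated AR-Sub balance,
and (4) write the updated AR-Control balance.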
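
The four-instruction logic can be sketched in a few lines of Python. This is a minimal illustration, not the text's own listing: the instruction order is inferred from the discussion that follows, the dictionary keys are hypothetical names, and the $1,500 starting balance for Jones is an assumed value (only the $10,000 AR-Control balance and the $2,000 sale come from the text).

```python
# Assumed starting balances; 1,500 for Jones is an illustrative value only.
db = {"ar_sub_jones": 1_500, "ar_control": 10_000}
sale = 2_000

# Instruction 1: read the customer's subsidiary (AR-Sub) account.
sub_balance = db["ar_sub_jones"]
# Instruction 2: read the AR-Control account.
control_balance = db["ar_control"]
# Instruction 3: write the updated subsidiary balance.
db["ar_sub_jones"] = sub_balance + sale

# Between instructions 3 and 4 the control account has not yet caught up
# with the subsidiary ledger, so the two disagree by the sale amount.
gap_after_3 = (db["ar_sub_jones"] - sub_balance) - (db["ar_control"] - control_balance)

# Instruction 4: write the updated control balance; the gap closes.
db["ar_control"] = control_balance + sale
gap_after_4 = (db["ar_sub_jones"] - sub_balance) - (db["ar_control"] - control_balance)

print(gap_after_3, gap_after_4, db["ar_control"])
```

The gap is the full sale amount after the third instruction and zero after the fourth, which is the temporary inconsistency the passage describes.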

Immediately after the execution of Instruction Number 3, and before the execution of Instruction Number 4,
the AR-Control account value is temporarily inconsistent by the sum of $2,000. This inconsistency is resolved
only after the completion of the entire transaction. In a DDP environment, such temporary inconsistencies can
result in the permanent corruption of the database. To illustrate the potential for damage, look at a slightly more
complicated example. Using the same computer logic as before, consider the processing of two separate
transactions from two remote sites: Transaction 1 (T1) is the sale of $2,000 on account to customer Jones from
Site A; Transaction 2 (T2) is the sale of $1,000 on account to customer Smith from Site B. The following logic
shows the possible interweaving of the two processing tasks and the effect on data currency.
Notice that Site B seized the AR-Control data value of $10,000 when it was in an inconsistent state. By using
this value to process its transaction, Site B effectively destroyed the record of Transaction T1 that had been
processed by Site A. Therefore, instead of $13,000, the new AR-Control balance is misstated at $11,000.
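
The lost update can be reproduced with a small simulation. The sketch below assumes one plausible interleaving (both sites read AR-Control before either writes); the `run` function and the schedule format are hypothetical helpers for illustration, not part of any DBMS.

```python
def run(schedule, ar_control=10_000):
    """Replay (site, action, amount) steps against one shared AR-Control value."""
    db = {"ar_control": ar_control}
    local = {}  # each site's private copy of the value it last read
    for site, action, amount in schedule:
        if action == "read":
            local[site] = db["ar_control"]
        elif action == "write":
            # The site writes back its own stale copy plus its sale amount.
            db["ar_control"] = local[site] + amount
    return db["ar_control"]

# Assumed interleaving: Site B reads before Site A has written.
interleaved = [
    ("A", "read", 0),       # T1 reads 10,000
    ("B", "read", 0),       # T2 also reads 10,000
    ("A", "write", 2_000),  # T1 writes 12,000
    ("B", "write", 1_000),  # T2 writes 11,000, overwriting T1's update
]

# The same two transactions run one after the other.
serial = [
    ("A", "read", 0),
    ("A", "write", 2_000),
    ("B", "read", 0),
    ("B", "write", 1_000),
]

print(run(interleaved))  # misstated balance
print(run(serial))       # correct balance
```

Run serially, the two sales produce $13,000; interleaved as assumed here, the final balance is $11,000 and the record of T1 is lost.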
Database Lockout
To achieve data currency, simultaneous access to individual data elements by multiple sites needs to be
prevented. The solution to this problem is to use a database lockout, which is a software control (usually a
function of the DBMS) that prevents multiple simultaneous accesses to data. The previous example can be used
to illustrate this technique: immediately upon receiving the access request from Site A for AR-Control
(T1, Instruction Number 2), the central site DBMS places a lock on AR-Control to prevent access from other
sites until Transaction T1 is complete. Thus, when Site B requests AR-Control (T2, Instruction Number 2), it is
placed on wait status until the lock is removed. Only then can Site B access AR-Control and complete
Transaction T2.
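
A lockout behaves much like a mutual-exclusion lock in ordinary programming. The sketch below uses Python's standard `threading.Lock` to stand in for the DBMS lock on AR-Control; the two threads play the roles of Site A and Site B, and the function and variable names are assumptions made for illustration.

```python
import threading

ar_control = 10_000
ar_control_lock = threading.Lock()  # stands in for the DBMS lock on AR-Control

def post_sale(amount):
    """Read and update AR-Control while holding the lock."""
    global ar_control
    with ar_control_lock:               # a second site waits here until release
        balance = ar_control            # read
        ar_control = balance + amount   # write

# Site A posts $2,000 (T1) and Site B posts $1,000 (T2) concurrently.
sites = [threading.Thread(target=post_sale, args=(amount,)) for amount in (2_000, 1_000)]
for t in sites:
    t.start()
for t in sites:
    t.join()

print(ar_control)
```

Because the read and the write happen inside the locked section, neither site can act on a value the other is in the middle of changing, and the balance ends at $13,000 whichever site acquires the lock first.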

DISTRIBUTED DATABASES
Distributed databases can be either partitioned or replicated.
Partitioned Databases
The partitioned database approach splits the central database into segments or partitions that are distributed to
their primary users. The advantages of this approach are:
• Storing data at local sites increases users’ control.
• Permitting local access to data and reducing the volume of data that must be transmitted between sites
improves transaction processing response time.
• Partitioned databases can reduce the potential effects of a disaster. With data located at several sites, the
loss of a single site cannot terminate all data processing by the organization.

The partitioned approach, which is illustrated in Figure 9-25, works best for organizations that require
minimal data sharing among users at remote sites. To the extent that remote users share common data, the
problems associated with the centralized approach still apply. The primary user must now manage
requests for data from other sites. Selecting the optimum host location for the partitions will minimize data
access problems. This requires an in-depth analysis of end-user data needs.

THE DEADLOCK PHENOMENON. In a distributed environment, it is possible that multiple sites will lock out
each other, thus preventing each from processing its transactions. For example, Figure 9-26
illustrates three sites and their mutual data needs. Notice that Site 1 has requested (and locked) Data A and is
waiting for the removal of the lock on Data C to complete its transaction. Site 2 has a lock on C and is waiting
for E. Finally, Site 3 has a lock on E and is waiting for A. A deadlock occurs here because each site holds a
lock on data that another site needs (mutual exclusion), and every transaction remains in a wait state until the
locks are removed. This can result in incompletely processed transactions and corruption of the database. A deadlock is a
permanent condition that must be resolved by special software that analyzes each deadlock condition to
determine the best solution. Because of the implications for transaction processing, accountants should be
aware of the issues pertaining to deadlock resolutions.
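
One common way to detect this condition is to follow the chain of "who is waiting for whom" and look for a cycle. The sketch below encodes the three-site example in two dictionaries and walks that chain; the function name and data layout are assumptions for illustration, and real deadlock detection software is considerably more involved.

```python
# The three-site example: what each site has locked and what it is waiting for.
holds = {"Site 1": "A", "Site 2": "C", "Site 3": "E"}
waits_for = {"Site 1": "C", "Site 2": "E", "Site 3": "A"}

def find_deadlock(holds, waits_for):
    """Return the sites forming a wait cycle, or an empty list if none exists."""
    owner = {item: site for site, item in holds.items()}
    for start in holds:
        path, site = [], start
        # Follow: site -> data it waits for -> site holding that data.
        while site is not None and site not in path:
            path.append(site)
            site = owner.get(waits_for.get(site))
        if site is not None:  # we came back to a site already on the path
            return path[path.index(site):]
    return []

print(find_deadlock(holds, waits_for))
```

For the example data the walk returns all three sites, since each one waits on data locked by the next. If Site 3 were not waiting for A, the chain would end and no deadlock would be reported.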

DEADLOCK RESOLUTION. Resolving a deadlock usually involves sacrificing one or more transactions. These must
be terminated to complete the processing of the other transactions in the deadlock. The
preempted transactions must then be reinitiated. In preempting transactions, the deadlock resolution software
attempts to minimize the total cost of breaking the deadlock. Although this is not an easy task to automate, some
of the factors that influence the decision are as follows:
1. The resources currently invested in the transaction. This may be measured by the number of updates that the
transaction has already performed and that must be repeated if the transaction is terminated.
2. The transaction’s stage of completion. In general, deadlock resolution software will avoid terminating
transactions that are close to completion.
3. The number of deadlocks associated with the transaction. Because terminating the transaction breaks all
deadlock involvement, the software should attempt to terminate transactions that are part of more
than one deadlock.
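
The three factors above can be combined into a single cost score, with the lowest-cost transaction chosen as the one to terminate. The sketch below is one hypothetical way to do that: the `Txn` fields, the weights, and the sample transactions are all invented for illustration and do not come from the text or from any particular DBMS.

```python
from dataclasses import dataclass

@dataclass
class Txn:
    name: str
    updates_done: int    # factor 1: work that would have to be repeated
    pct_complete: float  # factor 2: stage of completion, 0.0 to 1.0
    deadlocks: int       # factor 3: number of deadlocks this transaction is part of

def termination_cost(t, w_updates=1.0, w_stage=10.0, w_deadlocks=5.0):
    """Illustrative cost of terminating t; the weights are arbitrary assumptions.

    More completed work and a later stage raise the cost; involvement in
    several deadlocks lowers it, because one termination breaks them all.
    """
    return w_updates * t.updates_done + w_stage * t.pct_complete - w_deadlocks * t.deadlocks

def choose_victim(txns):
    """Pick the transaction that is cheapest to terminate."""
    return min(txns, key=termination_cost)

candidates = [
    Txn("T1", updates_done=8, pct_complete=0.9, deadlocks=1),
    Txn("T2", updates_done=2, pct_complete=0.2, deadlocks=2),
    Txn("T3", updates_done=5, pct_complete=0.5, deadlocks=1),
]
victim = choose_victim(candidates)
print(victim.name)
```

With these sample values T2 is selected: it has done the least work, is furthest from completion, and is involved in two deadlocks. Different weights would change the choice, which is why the text describes the decision as hard to automate.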

Replicated Databases
In some organizations, the entire database is replicated at each site. Replicated databases are effective in
companies in which there exists a high degree of data sharing but no primary user. Because common data are
replicated at each site, the data traffic between sites is reduced considerably. Figure 9-27 illustrates the
replicated database model.
The primary justification for a replicated database is to support read-only queries. With data replicated
at every site, data access for query purposes is ensured, and lockouts and delays because of network traffic are
minimized. A problem arises, however, when local sites also need to update the replicated database with
transactions.
Because each site processes only its local transactions, the common data attributes replicated at each site
are updated by different transactions, and each site therefore holds a different value after its
updates. Using the data from the earlier example, Figure 9-28 illustrates the effect of processing
credit sales for Jones at Site A and Smith at Site B. After the transactions are processed, the
value shown for the common AR-Control account is inconsistent ($12,000 at Site A and $11,000 at Site
B) and incorrect at both sites.
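
The arithmetic behind this divergence is simple enough to sketch. The snippet below assumes each replica starts with the $10,000 AR-Control balance from the earlier example and applies only its own site's sale; the variable names are illustrative.

```python
# Each site starts with its own copy of AR-Control.
replicas = {"Site A": 10_000, "Site B": 10_000}
# Each site processes only its local credit sale (Jones at A, Smith at B).
local_sales = {"Site A": 2_000, "Site B": 1_000}

for site, amount in local_sales.items():
    replicas[site] += amount

# The balance every replica should show once both sales are reflected.
correct = 10_000 + sum(local_sales.values())

print(replicas, correct)
```

Site A ends at $12,000 and Site B at $11,000, while the correct balance is $13,000, so the replicas disagree with each other and neither is right.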

Concurrency Control
Database concurrency is the presence of complete and accurate data at all remote sites. System designers need
to employ methods to ensure that transactions processed at each site are accurately reflected in the databases at
all other sites. This task, while problematic, has implications for accounting records and is a matter of concern
for accountants.
A commonly used method for concurrency control is to serialize transactions. This involves labeling each
transaction by two criteria. First, special software groups transactions into classes to identify potential conflicts.
For example, read-only (query) transactions do not conflict with other classes of transactions.
Similarly, AP and AR transactions are not likely to use the same data and do not conflict. However, multiple
sales order transactions involving both read and write operations will potentially conflict.

The second part of the control process is to time stamp each transaction. A system-wide clock is used to keep all
sites, some of which may be in different time zones, on the same logical time. Each time stamp
is made unique by incorporating the site’s identification number. When transactions are received at each
site, they are examined first for potential conflicts. If conflicts exist, the transactions are entered into a
serialization schedule. An algorithm is used to schedule updates to the database based on the transaction time
stamp and class. This method permits multiple interleaved transactions to be processed at each site
as if they were serial events.
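
The two-part scheme described above can be sketched as follows. This is a simplified illustration under stated assumptions: the class names, the set of conflicting classes, the clock values, and the site numbers are hypothetical, and a tuple of (clock, site ID) stands in for the unique time stamp.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StampedTxn:
    clock: int      # reading of the shared logical clock
    site_id: int    # makes the stamp unique when clock values tie
    txn_class: str  # class assigned by the grouping software
    amount: int     # change to AR-Control (0 for a query)

    @property
    def stamp(self):
        return (self.clock, self.site_id)

# Assumed classification: only sales updates can conflict; queries never do.
CONFLICTING_CLASSES = {"sales"}

def serialize(txns):
    """Order potentially conflicting transactions by time stamp."""
    conflicting = [t for t in txns if t.txn_class in CONFLICTING_CLASSES]
    return sorted(conflicting, key=lambda t: t.stamp)

def post(balance, schedule):
    """Apply the scheduled updates one at a time, as if they were serial."""
    for t in schedule:
        balance += t.amount
    return balance

incoming = [
    StampedTxn(clock=105, site_id=2, txn_class="sales", amount=1_000),
    StampedTxn(clock=105, site_id=1, txn_class="sales", amount=2_000),
    StampedTxn(clock=101, site_id=2, txn_class="query", amount=0),
]
schedule = serialize(incoming)
balance = post(10_000, schedule)
print([t.stamp for t in schedule], balance)
```

Both sales carry the same clock value, so the site number breaks the tie and every site that receives these transactions builds the same schedule. Applying that schedule to a $10,000 balance gives $13,000 at each site.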
Distributed Databases and the Accountant
The decision to distribute databases is one that should be entered into thoughtfully. There are many issues and
trade-offs to consider. Some of the most basic questions to be addressed are:
• Should the organization’s data be centralized or distributed?
• If data distribution is desirable, should the databases be replicated or partitioned?
• If replicated, should the databases be totally replicated or partially replicated?
• If the database is to be partitioned, how should the data segments be allocated among the sites?
The choices involved in each of these questions impact the organization’s ability to maintain database integrity.
The preservation of audit trails and the accuracy of accounting records are key concerns. Clearly, these are
decisions that the modern accountant should understand and influence intelligently.
