Advantages and Disadvantages of Replication
INTRODUCTION
Data replication is the process of storing copies of the same data at more than one site or
node. Its main purpose is to improve the availability of data.
Availability
If one of the sites containing relation R fails, relation R can still be obtained from
another site. Queries involving R can therefore continue to be processed in spite of
the failure of one site.
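The failover behaviour described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the `Site` class, the `SiteDown` exception and the `read_with_failover` function are hypothetical names, and the sites are in-memory objects rather than real database servers.

```python
class SiteDown(Exception):
    """Raised when a site cannot be reached."""


class Site:
    def __init__(self, name, relations, up=True):
        self.name = name
        self.relations = relations  # relation name -> list of rows
        self.up = up

    def read(self, relation):
        if not self.up:
            raise SiteDown(self.name)
        return self.relations[relation]


def read_with_failover(sites, relation):
    """Return (site name, rows) from the first reachable site holding the relation."""
    for site in sites:
        try:
            return site.name, site.read(relation)
        except SiteDown:
            continue  # this copy is unavailable; try the next replica
    raise SiteDown(f"no reachable replica of {relation}")


rows = [(1, "alice"), (2, "bob")]
# Site A has failed; site B holds another copy of relation R.
sites = [Site("A", {"R": list(rows)}, up=False), Site("B", {"R": list(rows)})]
served_by, result = read_with_failover(sites, "R")
print(served_by, result)
```

The query is still answered, by site B, even though site A is down. With a single copy of R the same failure would have made the relation unavailable.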
Increased parallelism
The sites containing relation R can process queries involving R in parallel.
This leads to faster query execution.
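As a rough sketch of this idea, the Python fragment below spreads four read queries over two copies of R in round-robin order. The relation, the site names and the `run_query` function are made up for illustration, and threads merely stand in for separate sites; real parallel speed-up comes from the sites being separate machines.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle

R = [(1, "alice", 30), (2, "bob", 45), (3, "carol", 52)]
replicas = {"A": list(R), "B": list(R)}  # hypothetical sites, each with a full copy of R


def run_query(site, min_age):
    """Evaluate a simple selection on the copy of R held at `site`."""
    return site, [row for row in replicas[site] if row[2] >= min_age]


queries = [25, 40, 50, 60]
# Round-robin assignment of queries to sites: A, B, A, B.
assignment = list(zip(cycle(replicas), queries))

with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
    results = list(pool.map(lambda pair: run_query(*pair), assignment))

for site, rows in results:
    print(site, rows)
```

Each site answers only half of the queries, which is the source of the speed-up when the copies live on different servers.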
Copies of the same data are held in several different locations (usually different
geographical locations), so the failure of a single site (server) need not stop transactions
that can be served from another copy. Queries on replicated data are usually faster,
especially read queries. A distributed database already places data where it is needed most,
and replication goes one step further: the complete table is stored locally, so read queries
can be answered quickly at the site where they are initiated.
Less communication overhead
When a site generates a large number of read queries, all of them can be answered
locally. Only queries that involve a table not stored locally, or that write data, need
to use the communication links to contact other sites.
The more replicas of a relation there are, the greater the chance that the required
data is found at the site where the transaction is executing. Hence, data replication reduces
the movement of data among sites and increases the speed of processing.
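A small worked example makes the saving concrete. The sketch below counts how many operations in a made-up workload have to contact another site, first without and then with a local replica of R. The function name and the workload are assumptions for illustration, and the model is deliberately crude: local reads cost nothing, and every write or non-local read costs one remote request.

```python
def count_remote_requests(local_relations, workload):
    """Count operations that must contact another site.

    Assumes a read of a locally stored relation is answered locally,
    and every write has to be sent to other sites.
    """
    remote = 0
    for op, relation in workload:
        if op == "write" or relation not in local_relations:
            remote += 1
    return remote


workload = [
    ("read", "R"),
    ("read", "R"),
    ("read", "S"),
    ("write", "R"),
    ("read", "R"),
]

without_replica = count_remote_requests(set(), workload)
with_replica = count_remote_requests({"R"}, workload)
print(without_replica, with_replica)
```

With R replicated locally, only the read of S and the write to R still use the communication links.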
Replication offers various benefits depending on the type of replication and the
options one chooses, but the common benefit of replication is the availability of
data when and where it is needed.
Allowing multiple sites to keep copies of the same data. This is useful when
multiple sites need to read the same data or need separate servers for reporting
applications.
Allowing greater autonomy. Users can work with copies of data while
disconnected and then propagate changes they make to other databases when they
are connected.
Bringing data closer to individuals or groups. This helps to reduce conflicts based
on multiple user data modifications and queries because data can be distributed
throughout the network, and one can partition data based on the needs of different
business units or users.
Replication is not the only way to protect against server failure. SQL Server 2000,
for example, also offers log shipping and failover clustering.
Replication also serves as a backup for disaster recovery. If the primary site is hit
by a natural disaster, power outage, fire, etc., the replicated database in a secondary
location can be used to limit system downtime.
More storage space
Storing replicas of the same data at different sites consumes more disk space.
Full replication means duplicating every table and storing a copy at every site, so
each site needs more space.
Costly updates
When an update is required, the database system must ensure that all replicas are
updated. The more copies of the same data are held at different sites, the more replicas
must be changed for every write. Hence, write operations are costly.
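The asymmetry between reads and writes can be shown with a toy model in Python. The strategy sketched here, reading from one copy and writing to all of them, is often called read-one-write-all; the text above does not name a particular protocol, so this choice is an assumption. The class name is hypothetical, and the copies are plain dictionaries rather than real databases.

```python
class ReplicatedRelation:
    """Read-one-write-all over in-memory copies; a toy model only."""

    def __init__(self, site_names):
        self.copies = {name: {} for name in site_names}
        self.messages = 0  # one message per copy touched

    def read(self, site, key):
        self.messages += 1  # a read touches a single copy
        return self.copies[site].get(key)

    def write(self, key, value):
        for copy in self.copies.values():  # a write touches every copy
            copy[key] = value
            self.messages += 1


rel = ReplicatedRelation(["A", "B", "C"])
rel.write(1, "alice")
write_cost = rel.messages
rel.read("B", 1)
read_cost = rel.messages - write_cost
print(write_cost, read_cost)
```

With three replicas the write costs three messages and the read costs one. Adding a fourth replica would make every write more expensive while leaving the read cost unchanged.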
Expensive
Concurrency control and recovery techniques are more advanced and hence more
expensive. In general, replication enhances the performance of read operations and increases
the availability of data to read-only transactions. However, update transactions incur greater
overhead. Controlling concurrent updates by several transactions to replicated data is more
complex than using the centralized approach to concurrency control.
PROCESS
If the two sites update completely different tables, then a file synchronization service
such as Dropbox might be enough. Dropbox does not merge the contents of files, however, so if
both site A and site B updated the same file, one would have to write the code to merge the
changes. Advantage Database Server has replication support built in natively, so that would
likely be the simplest solution. Advantage replication is performed on a record-by-record
basis and is handled asynchronously. If the target database cannot be reached, the
updates are stored in a queue and processed periodically.
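The queue-and-retry behaviour described above can be sketched as follows. This is not Advantage's implementation or API; `Target`, `Publisher` and `flush` are hypothetical names, and the databases are in-memory dictionaries. The sketch only shows the general pattern of committing locally, queueing record-level updates, and retrying later when the target is unreachable.

```python
from collections import deque


class Target:
    """Hypothetical target database that may be unreachable."""

    def __init__(self):
        self.records = {}
        self.reachable = True

    def apply(self, key, value):
        if not self.reachable:
            raise ConnectionError("target unreachable")
        self.records[key] = value


class Publisher:
    def __init__(self, target):
        self.records = {}
        self.target = target
        self.queue = deque()  # pending record-level updates, oldest first

    def update(self, key, value):
        self.records[key] = value         # commit locally first
        self.queue.append((key, value))   # replicate later (asynchronously)

    def flush(self):
        """Send queued updates in order; stop at the first failure and keep the rest."""
        sent = 0
        while self.queue:
            key, value = self.queue[0]
            try:
                self.target.apply(key, value)
            except ConnectionError:
                break
            self.queue.popleft()
            sent += 1
        return sent


target = Target()
pub = Publisher(target)

target.reachable = False
pub.update(1, "alice")
pub.update(2, "bob")
first_try = pub.flush()    # target is down, so nothing is sent

target.reachable = True
second_try = pub.flush()   # a later, periodic retry drains the queue
print(first_try, second_try, target.records)
```

Local updates never wait for the target, which is what makes the scheme asynchronous. The price is that the target lags behind the source until the queue has been drained.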
If the connection between the two sites is constantly available, the lag between
the source update and the replicated update is typically small, though it depends on the
network bandwidth and latency. One could use a VPN for the connection between the two
sites, but it is not required. Without a VPN, one should make sure the communication
between the two sites is encrypted (this is an option when setting up the subscriptions).
For the communication itself, all one needs is normal network connectivity. The primary
issue is dealing with things like firewalls and NAT.
With Advantage, one defines which port it uses. With a TCP/IP connection, one
needs to make sure the configured port allows inbound connections to the ads.exe
process. One can use UDP as well, but when firewalls are involved it is probably going to
be simpler with TCP. Duplicate keys also need attention. If both sites either
add a record with the same primary key or update the same record concurrently, the result
is a conflict. There is an option to simply ignore conflicts, in which case the last update wins.
More realistically, one would want to write an ON CONFLICT trigger to handle the conflicts.
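To make the "last update wins" rule concrete, here is a minimal Python sketch of merging two copies of a table. It is not Advantage's conflict handling or trigger syntax. The function name is hypothetical, and it assumes every record carries a timestamp that is comparable across sites, which a real system would have to guarantee separately.

```python
def merge_last_update_wins(site_a, site_b):
    """Merge two copies keyed by primary key; each value is (timestamp, data)."""
    merged = dict(site_a)
    conflicts = []
    for key, (ts_b, data_b) in site_b.items():
        if key in merged and merged[key] != (ts_b, data_b):
            conflicts.append(key)  # same key, different contents
            ts_a, _ = merged[key]
            if ts_b <= ts_a:
                continue  # keep A's version; ties favour A (an arbitrary choice)
        merged[key] = (ts_b, data_b)
    return merged, conflicts


site_a = {1: (10, "alice"), 2: (12, "bob")}
site_b = {2: (15, "robert"), 3: (11, "carol")}
merged, conflicts = merge_last_update_wins(site_a, site_b)
print(merged)
print(conflicts)
```

Record 2 was changed at both sites, so it is reported as a conflict and the later version from site B is kept. The earlier change is silently lost, which is why a dedicated conflict handler is usually the more realistic choice.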
CONCLUSION
There can be full replication, in which a copy of the whole database is stored at every
site. There can also be partial replication, in which some fragments of the database (the
important, frequently used ones) are replicated and others are not. Replication has
a number of advantages and disadvantages. Data replication is the process
in which a relation (a table) or a portion of a relation (a fragment of a table) is duplicated
and the duplicated copies are stored at multiple sites (servers) to increase the availability of
data.
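The difference between full and partial replication, and its effect on storage, can be illustrated with a small Python sketch. The fragment names, sizes, site names and the choice of which fragment is "hot" are all invented for the example.

```python
fragment_sizes_mb = {"F1": 400, "F2": 250, "F3": 50}  # hypothetical fragments
sites = ["A", "B", "C"]


def full_replication(fragments, sites):
    """Every site stores every fragment."""
    return {site: set(fragments) for site in sites}


def partial_replication(fragments, sites, hot, home):
    """Hot fragments go everywhere; each other fragment lives only at its home site."""
    placement = {site: set(hot) for site in sites}
    for fragment in fragments:
        if fragment not in hot:
            placement[home[fragment]].add(fragment)
    return placement


def total_storage_mb(placement, sizes):
    """Sum the sizes of all stored copies across all sites."""
    return sum(sizes[f] for stored in placement.values() for f in stored)


full = full_replication(fragment_sizes_mb, sites)
partial = partial_replication(
    fragment_sizes_mb, sites, hot={"F3"}, home={"F1": "A", "F2": "B"}
)

print(total_storage_mb(full, fragment_sizes_mb))
print(total_storage_mb(partial, fragment_sizes_mb))
```

Under these assumptions full replication stores 2100 MB in total, while replicating only the small, frequently used fragment F3 needs 800 MB. The trade-off is that queries on F1 or F2 can be answered locally at one site only.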
REFERENCES
http://ecomputernotes.com/database-system/adv-database/data-replication
http://exploredatabase.blogspot.in/2014/08/advantages-and-disadvantages-of-data-replication-in-distributed-databases.html
https://answers.yahoo.com/question/index?qid=20061117132416AAGSyKo
http://stackoverflow.com/questions/4698943/advantage-database-replication