Basics of Replication SQL Server 2000
Table of Contents:
Replication: SQL Server 2000 - Part 1
Replication Benefits
SQL Server Platform for Replication
Entities for the SQL Server Replication Model
Entities Further Explained...
Implementing Replication
Implementing Replication, Cont'd
Here we illustrate what Replication is, its features and benefits, and how you can put
it to use.
Database management systems are among the most important software systems
driving the information age. In many Internet applications, a large number of users
who are geographically dispersed may routinely query and update the same
database. In this environment, the location of the data can have a significant impact
on application response time and availability. A centralized approach manages only
one copy of the database. The centralized approach suffers from two major
drawbacks:
Users working in different geographic locations can work with their local copy of the
data, which allows greater autonomy.
Microsoft SQL Server uses a publishing industry model to represent the components
and processes in its replication architecture. The publishing industry publishes
magazines and books; distributors and agents carry these publications to the
subscribers. Subscribers obtain copies of a publication and read the articles that
interest them. The SQL Server replication model works the same way. Figure 1
depicts the typical publishing industry flow.
Figure 1
Based on the above model we can identify the following Entities for the SQL Server
replication model.
• Publisher
• Distributor
• Agent
• Subscriber
• Articles
• Publications
• Subscriptions
Publisher
The Publisher is a server that makes data available for subscription to other servers.
In addition to making data available for replication, the Publisher also identifies which
data has changed at the Subscriber during synchronization. Depending on
the type of replication, changed data is identified at different points in time. We will learn
more about replication types in the Replication Types section.
Distributor
The Distributor maintains the distribution database. The role of the Distributor varies
depending on the type of replication. There are two types of Distributor:
remote and local. A remote Distributor is a server separate from the
Publisher that is configured as a Distributor for replication. A local Distributor is a
server that is configured as both a Publisher and a Distributor.
Agents
Agents are the processes responsible for copying and distributing data
between the Publisher and the Subscriber. Different types of agents support
different replication types.
Subscriber
The Subscriber is a server that receives and maintains the published data. Modifications
to the data at the Subscriber can be propagated back to the Publisher; in some
cases a Subscriber may republish the data to other Subscribers.
Articles
An article can be any of the following database objects: tables (column-filtered or
row-filtered), views, indexed views, stored procedures, and user-defined functions.
Publication
Subscriptions
Subscription Types
With a push subscription, the Publisher is responsible for synchronizing all changes
to the Subscriber without the Subscriber asking for those changes.
With a pull subscription, the Subscriber initiates the replication instead of the Publisher.
Replication Types
• Snapshot Replication
• Transactional Replication
• Merge Replication
Snapshot Replication
• Changes to data at the publisher are not propagated to the subscriber
continuously
• Subscribers are updated with the complete modified data set and not by individual
transactions
• Propagating changes to the subscribers takes more time, as it is a one-time
or scheduled process.
Following are some of the scenarios where snapshot replication fits in ideally:
Transactional Replication
Transactional replication is also known as dynamic replication. In transactional
replication, modifications to the publication at the publisher are propagated to the
subscriber incrementally.
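
The contrast with snapshot replication can be sketched with a toy in-memory model. This is only an illustration, not how SQL Server implements either type: the class names, the `log` list, and the two `apply_*` functions are all hypothetical. A snapshot copies the whole table, while the transactional path replays only the changes the subscriber has not yet seen.

```python
class Publisher:
    """Toy publisher: a table plus an ordered log of changes (hypothetical model)."""

    def __init__(self):
        self.table = {}
        self.log = []  # entries are (operation, key, value)

    def upsert(self, key, value):
        self.table[key] = value
        self.log.append(("upsert", key, value))

    def delete(self, key):
        del self.table[key]
        self.log.append(("delete", key, None))


class Subscriber:
    """Toy subscriber: its own copy of the table and a position in the log."""

    def __init__(self):
        self.table = {}
        self.applied = 0  # number of publisher log entries already applied


def apply_snapshot(publisher, subscriber):
    """Snapshot style: replace the subscriber's copy with the complete data set."""
    subscriber.table = dict(publisher.table)
    subscriber.applied = len(publisher.log)
    return len(publisher.table)  # rows transferred


def apply_transactions(publisher, subscriber):
    """Transactional style: replay only the changes made since the last sync."""
    pending = publisher.log[subscriber.applied:]
    for operation, key, value in pending:
        if operation == "upsert":
            subscriber.table[key] = value
        else:
            subscriber.table.pop(key, None)
    subscriber.applied = len(publisher.log)
    return len(pending)  # changes transferred


publisher = Publisher()
for key, value in [(1, "a"), (2, "b"), (3, "c")]:
    publisher.upsert(key, value)

subscriber = Subscriber()
print("snapshot rows:", apply_snapshot(publisher, subscriber))

publisher.upsert(2, "B")
publisher.delete(3)
print("incremental changes:", apply_transactions(publisher, subscriber))
print("in sync:", subscriber.table == publisher.table)
```

In this sketch the initial snapshot moves every row, while the later synchronization moves only the two changes made after it.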
Merge Replication
• Updates to the data are made independently at more than one server.
• Data is merged on a scheduled basis or on demand.
• Allows users to work online/offline and synchronize the publisher and
subscriber on a scheduled basis or on demand.
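
The idea of independent updates that are merged later can be sketched as follows. This is a toy model under stated assumptions, not SQL Server's merge process: the `Replica` class and `merge` function are hypothetical, and the rule that the publisher's change wins when both sites changed the same row is an assumption made only for this sketch.

```python
class Replica:
    """Toy replica that records which rows changed since the last merge."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})
        self.pending = {}  # key -> value changed locally since the last merge

    def update(self, key, value):
        self.rows[key] = value
        self.pending[key] = value


def merge(publisher, subscriber):
    """Exchange pending changes in both directions and report conflicting keys.

    Assumption for this sketch: when both sites changed the same key,
    the publisher's value is kept.
    """
    conflicts = sorted(set(publisher.pending) & set(subscriber.pending))
    # Subscriber changes flow to the publisher unless the publisher also changed the row.
    for key, value in subscriber.pending.items():
        if key not in publisher.pending:
            publisher.rows[key] = value
    # Publisher changes flow to the subscriber, overwriting conflicting rows.
    for key, value in publisher.pending.items():
        subscriber.rows[key] = value
    publisher.pending.clear()
    subscriber.pending.clear()
    return conflicts


initial = {1: "a", 2: "b", 3: "c"}
publisher = Replica(initial)
subscriber = Replica(initial)

# Both sites work independently, as if offline.
publisher.update(1, "A")
subscriber.update(2, "B")
subscriber.update(1, "z")  # same row as the publisher's change

# Synchronize on demand.
print("conflicting keys:", merge(publisher, subscriber))
print("in sync:", publisher.rows == subscriber.rows)
```

After the merge both replicas hold the same rows; which change survives a conflict depends entirely on the resolution rule chosen.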
With this basic knowledge we can now look at how replication is
implemented. There are different ways to implement
and monitor replication, depending on the replication type, but in general
replication involves the following steps:
• Configuring replication
• Generating and applying initial snapshot
• Modifying replicated data
• Synchronizing and propagating data
Configuring Replication:
1. Configure the publisher and distributor. The distributor can be on the same server
as the publisher or on a different server
2. Create publications based on data, subsets of data and database objects
3. Determine the type of replication to use, the subscriber database and location
of the snapshot file
4. Configure when the synchronization will occur and options that will be used
with publications
5. Create push and/or pull subscriptions at either the publisher or the subscriber
and configure your replication schedule and options
SQL Server 2000 creates a snapshot of the data and schema and saves it in the snapshot
file location. After the subscription is created, the snapshot is applied according to
the schedule configured when the publication was created; a snapshot can also be applied
manually. The Snapshot Agent is responsible for creating the snapshot files and
storing them in the snapshot file location.
Depending on the type of replication and replication options, the subscriber will be
able to modify the data after the snapshot has been applied and propagate the
changes back to the publisher or other subscribers.
Special consideration should be taken for some of the data types and properties
during replication. These Data types and properties are:
For example, you can set the Identity range of 1 to 500 for Publication ‘A’ at
Publisher and 501 to 1000 for the same publication at Subscriber, with a threshold of
80%.
In this case, a newly inserted row at publisher will have Identity from 1 to 500 and a
newly inserted row at subscriber will have identity from 501 to 1000.
When 80% of a range has been used, a new identity range is used for the next
inserts. In this case, if the identity value reaches 400 at the Publisher, any new
inserts after that will use the new identity range from 1001 to 1500. Similarly, if the
identity value at the Subscriber reaches 900 (80% of its 501 to 1000 range), any new
inserts after that will use the identity range from 1501 to 2000.
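
The arithmetic of this example can be checked with a small simulation. It is a sketch under stated assumptions, not SQL Server's actual range management: `RangeAllocator` and `Site` are hypothetical names, ranges are assumed to be handed out in consecutive blocks of 500, and the threshold is assumed to be measured as a percentage of each range's size. Under that assumption the Publisher's 1 to 500 range crosses the threshold at 400, and the same calculation puts the Subscriber's 501 to 1000 range at 900.

```python
from typing import Tuple


class RangeAllocator:
    """Hands out consecutive, non-overlapping identity ranges (hypothetical helper)."""

    def __init__(self, range_size: int, first_value: int = 1) -> None:
        self.range_size = range_size
        self._next_start = first_value

    def allocate(self) -> Tuple[int, int]:
        start = self._next_start
        end = start + self.range_size - 1
        self._next_start = end + 1
        return start, end


class Site:
    """A publisher or subscriber that draws identity values from its current range."""

    def __init__(self, name: str, allocator: RangeAllocator, threshold_percent: int) -> None:
        self.name = name
        self.allocator = allocator
        self.threshold_percent = threshold_percent
        self.start, self.end = allocator.allocate()
        self._next_value = self.start

    def threshold_value(self) -> int:
        """Last identity value used before the site moves to a new range."""
        size = self.end - self.start + 1
        return self.start + size * self.threshold_percent // 100 - 1

    def insert(self) -> int:
        """Return the identity for a new row; switch ranges once the threshold is hit."""
        value = self._next_value
        if value >= self.threshold_value():
            # Threshold reached: later inserts draw from a freshly allocated range.
            self.start, self.end = self.allocator.allocate()
            self._next_value = self.start
        else:
            self._next_value += 1
        return value


allocator = RangeAllocator(range_size=500)
publisher = Site("Publisher", allocator, threshold_percent=80)    # gets 1 to 500
subscriber = Site("Subscriber", allocator, threshold_percent=80)  # gets 501 to 1000

publisher_ids = [publisher.insert() for _ in range(401)]
subscriber_ids = [subscriber.insert() for _ in range(401)]

print(publisher_ids[399], publisher_ids[400])    # 400 1001
print(subscriber_ids[399], subscriber_ids[400])  # 900 1501
```

Note that the values 401 to 500 and 901 to 1000 are never assigned in this run, which is the unused-identity cost of switching ranges before they are exhausted.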
The threshold value should be set carefully by evaluating the frequency of updates at
the subscriber and synchronization schedule. Setting the threshold to a lower value
will result in many unused Identity values.
The following system stored procedures can also be used to set the identity range
explicitly:
o sp_adjustpublisheridentityrange
o sp_addmergearticle
• Use NOT FOR REPLICATION option when defining Identity columns
Identity ranges can also be managed by defining a check constraint and the NOT FOR
REPLICATION option on the identity column. When an identity column is specified as
NOT FOR REPLICATION, its range must be provided programmatically. When
this option is set, SQL Server retains the original values in rows inserted by the
replication agent, without reseeding the identity value, and continues to increment the
identity column in the normal way for other inserts.