SQL Server HADR Important Points - (Legacy)

Uploaded by

masheed ullah
Copyright
© All Rights Reserved

Important points on DB Log shipping in SQL Server:

Here are some important points on Database Log Shipping in SQL Server, covering the essential setup, operation,
and best practices for this high-availability feature:
1. Purpose of Log Shipping
Log shipping is used for disaster recovery by maintaining a standby copy of a primary (source) database on one or
more secondary (destination) servers.
It’s a simple, cost-effective high-availability solution that’s relatively easy to set up and manage.

2. Log Shipping Components


Primary Database: The source database where transactions are recorded.
Secondary Database(s): One or more databases that receive transaction log backups from the primary server.
Monitor Server (Optional): A separate SQL Server instance that monitors the status of the log shipping process and
raises alerts in case of failure.

3. How Log Shipping Works


Backup Job: Backs up the transaction log on the primary database at regular intervals.
Copy Job: Copies the transaction log backup file(s) from the primary server to the secondary server(s).
Restore Job: Restores the copied transaction log backup file(s) on the secondary database to keep it synchronized
with the primary.

4. Secondary Database Modes


Restoring Mode: The secondary database is in a continuous restoring state and isn’t accessible for reading. This
mode is commonly used for disaster recovery.
Standby Mode: The secondary database is accessible for read-only purposes, allowing reporting or read-only queries.
However, it becomes inaccessible when a new transaction log is being restored.

5. Log Shipping Schedules


Log shipping jobs (backup, copy, and restore) can be scheduled independently, offering flexibility in how often data
is transferred and restored.
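The frequency of these jobs affects the Recovery Point Objective (RPO). For example, with log backups every 15
minutes, up to about 15 minutes of transactions can be lost if the primary fails and the tail of its log cannot be
backed up, plus any backups that had not yet been copied to the secondary.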
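
As a rough illustration of how the schedules bound data loss, the sketch below adds up the intervals under one
pessimistic assumption: the primary and its backup share are both lost, so only files already copied to the
secondary survive. The function name and the simple additive model are illustrative assumptions, not part of SQL
Server.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta,
                         copy_interval: timedelta,
                         copy_duration: timedelta = timedelta(0)) -> timedelta:
    """Rough upper bound on lost work when only copied backups survive.

    Assumption: the newest surviving backup was taken up to one backup
    interval before the last completed copy started, and that copy started
    up to one copy interval (plus its duration) before the failure.
    """
    return backup_interval + copy_interval + copy_duration


# Backup and copy jobs both scheduled every 15 minutes; copies take 1 minute.
bound = worst_case_data_loss(timedelta(minutes=15),
                             timedelta(minutes=15),
                             timedelta(minutes=1))
print(bound)  # 0:31:00
```

The restore schedule does not appear in the bound because copied backups that are still pending can be applied at
failover time; it influences RTO rather than RPO.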

6. Setting Up Log Shipping


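Enable log shipping through SSMS or T-SQL. The primary database must use the full or bulk-logged recovery model.
The steps involve:
1. Taking a full backup of the primary database and restoring it on the secondary server WITH NORECOVERY or
WITH STANDBY.
2. Setting up transaction log backups, specifying the backup location (typically a network share).
3. Configuring the copy and restore jobs on the secondary server.
SQL Server creates the corresponding SQL Server Agent jobs on the primary and secondary servers based on your
settings.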
7. Failover Process
Manual Failover: Log shipping doesn’t support automatic failover. To fail over to the secondary server, you must:
1. Stop (disable) the log shipping jobs and, if the primary is still accessible, back up the tail of its log.
2. Copy and apply any remaining transaction log backups on the secondary WITH NORECOVERY.
3. Bring the secondary database online using the WITH RECOVERY option.

Role Reversal: After a failover, you can reconfigure log shipping to make the former secondary the primary, and the
former primary can become a new secondary if needed.
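
The manual failover steps can be sketched as a small script generator. This is a minimal sketch, assuming the
remaining log backups have already been copied to the secondary and are passed in oldest-first; the function name,
database name, and file paths are hypothetical.

```python
from __future__ import annotations


def build_failover_script(database: str, pending_log_backups: list[str]) -> list[str]:
    """Return T-SQL statements that bring a log shipping secondary online.

    pending_log_backups: paths of copied but unrestored log backups,
    already sorted from oldest to newest (the caller's responsibility).
    """
    db = database.replace("]", "]]")  # escape for a bracketed identifier
    statements = []
    for path in pending_log_backups:
        disk = path.replace("'", "''")  # escape for a T-SQL string literal
        statements.append(
            f"RESTORE LOG [{db}] FROM DISK = N'{disk}' WITH NORECOVERY;"
        )
    # The final step recovers the database and makes it read-write.
    statements.append(f"RESTORE DATABASE [{db}] WITH RECOVERY;")
    return statements


for statement in build_failover_script(
        "Sales", [r"D:\LogShip\Sales_1.trn", r"D:\LogShip\Sales_2.trn"]):
    print(statement)
```

Every log backup is restored WITH NORECOVERY so that later backups can still be applied; only the last statement
recovers the database.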

8. Monitoring Log Shipping


SQL Server provides a Log Shipping Monitor to track the health of the log shipping setup.
Alerts can be set up for common issues, such as when a job fails, or a log backup is delayed beyond a specified
threshold.
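
The threshold idea behind these alerts can be illustrated with a short check. This is only a sketch of the
comparison, not the monitor's actual implementation; the function name, timestamps, and threshold values are
assumptions chosen for the example.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def log_shipping_alerts(now: datetime,
                        last_backup: datetime,
                        last_restore: datetime,
                        backup_threshold: timedelta,
                        restore_threshold: timedelta) -> list[str]:
    """Return an alert message for every threshold that has been exceeded."""
    alerts = []
    if now - last_backup > backup_threshold:
        alerts.append("backup threshold exceeded")
    if now - last_restore > restore_threshold:
        alerts.append("restore threshold exceeded")
    return alerts


now = datetime(2024, 1, 1, 12, 0)
alerts = log_shipping_alerts(
    now,
    last_backup=datetime(2024, 1, 1, 11, 50),   # 10 minutes ago
    last_restore=datetime(2024, 1, 1, 10, 30),  # 90 minutes ago
    backup_threshold=timedelta(minutes=60),
    restore_threshold=timedelta(minutes=45),
)
print(alerts)  # ['restore threshold exceeded']
```
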
9. Backup Retention and Cleanup
Define a retention period for transaction log backup files to prevent unnecessary storage use.
Configure automatic cleanup of old log backups on both the primary and secondary servers.

10. Security and Permissions


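Ensure that the SQL Server Agent accounts (or proxy accounts) have the required permissions: the backup job needs
read and write access to the backup folder, and the copy job needs read access to the backup share and write access
to the copy destination folder.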
Use a secure network for copying log backups to ensure data integrity and security.

11. Impact on Performance


The backup job on the primary can affect performance, particularly if log backups are frequent.
The restore job on the secondary server can cause a delay in read-only availability if the database is in standby
mode.

12. Network and Latency Considerations


Log shipping requires a reliable network connection between the primary and secondary servers, especially when log
backups are frequent.
Network latency can affect the time it takes to copy and restore logs on the secondary server, impacting RPO and
Recovery Time Objective (RTO).
13. Secondary Server Readability for Reporting
If reporting queries are run on the secondary database, use Standby Mode. However, be aware that each restore either
disconnects users or, if you choose not to disconnect them, is postponed until their connections are closed.
Using log shipping for reporting purposes can reduce the load on the primary server.
14. Limitations of Log Shipping
Single Database Only: Each log shipping configuration only works on a single database; you must set up log shipping
separately for each database.
Manual Failover: Log shipping doesn’t support automatic failover, which is a drawback compared to AlwaysOn
Availability Groups.
Latency: There’s an inherent delay between the primary and secondary database, depending on log backup
frequency and network speed.

15. Testing and Maintenance


Regularly test the failover process to ensure it works smoothly in an actual disaster scenario.
Monitor the backup, copy, and restore jobs to ensure they’re running on schedule. Address any delays immediately
to avoid data loss in the event of a failover.
Log shipping is a robust, reliable solution for disaster recovery in SQL Server, especially when combined with other
high-availability methods.
Important Points on DB Mirroring:
Here are some key points to keep in mind when working with database mirroring in SQL Server:

1. Mirroring Modes
High Safety Mode (Synchronous): A transaction is committed on the principal only after its log records are hardened
on the mirror. This provides zero data loss while the databases are synchronized, but it can increase transaction
latency.
High Performance Mode (Asynchronous): Transactions are committed on the principal immediately without waiting
for the mirror, resulting in minimal latency but with potential data loss if a failover occurs.
High Safety Mode with Automatic Failover: Requires a third server configured as a witness to enable automatic
failover in High Safety mode. The witness allows SQL Server to detect when the principal server fails and
automatically switches roles with the mirror.

2. Principal, Mirror, and Witness Roles


Principal: The primary server where all database updates occur.
Mirror: A standby server that keeps a copy of the principal database.
Witness: An optional server that enables automatic failover in High Safety mode by helping determine which server
should become principal.

3. Transaction Safety Levels


FULL: Used in High Safety mode, ensuring all transactions are committed on both servers synchronously.
OFF: Used in High Performance mode, where transactions are committed asynchronously.

4. Endpoints and Security


Database mirroring requires configuring database mirroring endpoints on both principal and mirror servers. These
endpoints allow secure communication between the servers.
Endpoints authenticate each other using Windows Authentication or certificates, and traffic between them is encrypted
by default.

5. Automatic and Manual Failover


Automatic Failover: Only available in High Safety mode with a witness. If the principal fails, the mirror automatically
becomes the new principal.
Manual Failover: Initiated by an administrator. It is supported only in High Safety mode while the databases are
synchronized. In High Performance mode, the only option is forced service, which may lose data.

6. Monitoring and Alerts


Regularly monitor mirroring status using SQL Server Management Studio (SSMS), system views like
sys.database_mirroring, or the Database Mirroring Monitor.

Set up alerts for events such as synchronization state changes and failures to ensure quick response times.
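
A minimal monitoring sketch built on sys.database_mirroring is shown below. The query uses columns of that catalog
view; how the rows are fetched (driver, connection) is left out, and the choice of which states count as unhealthy,
as well as the sample rows, are assumptions for illustration.

```python
# Query text to run against the principal or mirror instance.
MIRRORING_STATUS_QUERY = """
SELECT DB_NAME(database_id) AS database_name,
       mirroring_role_desc,
       mirroring_state_desc,
       mirroring_safety_level_desc
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL;
"""

# Assumption: these two states are the ones worth alerting on.
UNHEALTHY_STATES = {"SUSPENDED", "DISCONNECTED"}


def databases_needing_attention(rows):
    """rows: (database_name, role, state, safety_level) tuples from the query."""
    return [name for name, _role, state, _safety in rows
            if state in UNHEALTHY_STATES]


# Hypothetical result set, as a driver might return it.
sample_rows = [
    ("Sales", "PRINCIPAL", "SYNCHRONIZED", "FULL"),
    ("Orders", "PRINCIPAL", "DISCONNECTED", "FULL"),
]
print(databases_needing_attention(sample_rows))  # ['Orders']
```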

7. Impact on Performance
Mirroring, especially in synchronous mode, can impact performance due to the overhead of ensuring data
consistency between the principal and mirror.
Synchronous mirroring is suited for databases with lower transaction rates or for databases that can tolerate slight
delays.

8. Backup and Restore Implications


Only the principal database can be backed up in a mirroring setup. The mirror database is in a restoring state and
cannot be directly backed up.
If mirroring is removed, log backups taken on the principal afterward can be restored on the former mirror database
WITH NORECOVERY so that mirroring can be re-established without a new full backup.

9. Witness Server Placement


The witness should ideally be on a separate server, preferably in the same network as the principal and mirror, for
optimal failover performance.
The witness does not need to be a high-spec server since its role is only to facilitate quorum for automatic failover.

10. Quorum Requirement


In a mirroring session with a witness, a quorum is required for automatic failover. This means at least two of the
three servers (principal, mirror, and witness) need to communicate to avoid a “split-brain” scenario, where both
servers act as principal.
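
The two-of-three rule can be sketched as boolean logic. This is a simplified model of a High Safety session with a
witness, written for illustration; the function and parameter names are assumptions, and real sessions involve
timeouts and state transitions that are not modeled here.

```python
def principal_can_serve(sees_mirror: bool, sees_witness: bool) -> bool:
    """The principal keeps quorum while it can reach at least one partner."""
    return sees_mirror or sees_witness


def mirror_fails_over_automatically(mirror_sees_principal: bool,
                                    mirror_sees_witness: bool,
                                    witness_sees_principal: bool,
                                    was_synchronized: bool) -> bool:
    """Automatic failover needs mirror and witness to agree the principal is
    gone, and the database must have been synchronized when it was lost."""
    return (not mirror_sees_principal
            and mirror_sees_witness
            and not witness_sees_principal
            and was_synchronized)


# Principal server crashes: mirror and witness still see each other.
print(mirror_fails_over_automatically(False, True, False, True))   # True
# Mirror is isolated from both partners: it must not promote itself.
print(mirror_fails_over_automatically(False, False, False, True))  # False
# An isolated principal loses quorum and stops serving the database.
print(principal_can_serve(False, False))                           # False
```

Because an isolated server can never form a majority on its own, the principal and the mirror cannot both act as
principal at the same time.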

11. Application Connectivity


Use the Failover Partner keyword in connection strings so that applications can automatically connect to the former
mirror, now acting as principal, after a failover.
This ensures high availability from the application’s perspective without manual intervention.
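
A connection string using this keyword might be assembled as follows. This is a sketch in ADO.NET-style keyword
syntax; the helper name, server names, and database name are hypothetical, and other drivers may spell the keyword
differently.

```python
def mirroring_connection_string(principal: str, mirror: str, database: str) -> str:
    """Build an ADO.NET-style connection string with a failover partner.

    The database name is included because the failover partner only applies
    to a specific mirrored database.
    """
    parts = {
        "Server": principal,
        "Failover Partner": mirror,
        "Database": database,
        "Integrated Security": "True",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


print(mirroring_connection_string("SQLPRIMARY", "SQLMIRROR", "Sales"))
# Server=SQLPRIMARY;Failover Partner=SQLMIRROR;Database=Sales;Integrated Security=True
```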

12. Limitations of Database Mirroring


One Database at a Time: Each mirroring session handles only one database, unlike Availability Groups, which can
handle multiple databases.
Deprecated in SQL Server 2012 and Beyond: Microsoft has deprecated database mirroring in favor of AlwaysOn
Availability Groups, so consider this for future planning.

13. Testing and Maintenance


Test failover and failback periodically to ensure that all components are working correctly.
Regularly update and maintain all servers in the mirroring setup to prevent compatibility issues or unexpected
downtime.

Database mirroring remains a powerful feature for high availability, though it is increasingly being replaced by
Availability Groups in newer SQL Server implementations.

Important Points on DB Replication:


Here are important points on SQL Server Database Replication, covering types, use cases, configuration, and best
practices:
1. Purpose of Replication
SQL Server replication is used to copy and distribute data and database objects from one database to another and
synchronize between databases to maintain consistency.
Common use cases include load balancing, reporting, offline processing, and synchronizing data across
geographically distributed locations.

2. Replication Types
Transactional Replication: Ensures near real-time data synchronization with low latency, ideal for scenarios where
data changes frequently and consistency is critical.
Merge Replication: Allows bidirectional data exchange between databases, used where multiple nodes are allowed
to update data independently and then synchronize.
Snapshot Replication: Takes a full copy of the published objects and data at a point in time and applies it to the
subscriber. This is useful for infrequently changing data or when latency isn’t a priority.

3. Replication Components
Publisher: The primary server that owns the data, where changes originate.
Subscriber: The server that receives replicated data from the publisher.
Distributor: Manages the replication process, including storing metadata, tracking changes, and managing replication
agents.
Replication Agents: Includes the Snapshot Agent, Log Reader Agent, Distribution Agent, and Merge Agent, which
perform various tasks like generating snapshots, distributing changes, and merging updates.

4. Replication Models
One-Way (Unidirectional): Data flows from the publisher to one or more subscribers.
Peer-to-Peer: A transactional replication variant where all nodes can update data independently, with changes being
synchronized across nodes. Useful for load balancing and high availability.
Bi-Directional: Allows data changes on both publisher and subscriber but with limitations to avoid conflicts.

5. Latency and Performance


Transactional Replication: Near real-time, low latency.
Merge Replication: Higher latency due to conflict resolution and synchronization.
Snapshot Replication: High latency, as it takes a full snapshot on each synchronization.

6. Conflict Resolution
Merge replication includes conflict resolution mechanisms since changes can be made on multiple nodes. SQL Server
automatically handles conflicts based on predefined rules.
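Standard transactional replication does not require conflict resolution because subscribers are treated as read-only
and only the publisher modifies data; peer-to-peer and updatable-subscription variants are the exceptions.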
7. Security and Permissions
SQL Server replication requires proper permissions for agents to read and write data at the publisher, distributor,
and subscriber.
Configure Replication Security Settings carefully to avoid unauthorized access and ensure data integrity.

8. Network and Connectivity Requirements


Replication depends on network reliability and speed. Poor network performance can lead to latency issues,
particularly in transactional replication.
Consider network bandwidth when planning replication topology, especially for large or frequently changing
datasets.

9. Replication Topologies
Central Publisher with Multiple Subscribers: Common for distributing data from a central server to multiple
locations.
Central Subscriber with Multiple Publishers: Used to consolidate data from multiple publishers to a central location.
Peer-to-Peer: All nodes act as both publisher and subscriber, suitable for high-availability setups.

10. Replication Agents and Jobs


Snapshot Agent: Generates a snapshot of the publication and schema, used in all replication types.
Log Reader Agent: Reads changes in the transaction log for transactional replication.
Distribution Agent: Moves changes from the distributor to the subscriber.
Merge Agent: Synchronizes changes in merge replication, resolving conflicts if necessary.

11. Subscriber Types


Push Subscription: The agent runs at the distributor and pushes changes to the subscriber, which keeps administration
centralized and reduces load on subscribers but increases load on the distributor.
Pull Subscription: The agent runs at the subscriber and pulls changes from the distributor, reducing distributor load
but requiring each subscriber to run and manage its own agent.
12. Schema and Data Considerations
Replication requires compatible schema structures between publisher and subscriber.
Some data types and features (e.g., identity columns, certain constraints) may require special configuration or be
unsupported.

13. Data Consistency and Concurrency Control


Transactional replication preserves transactional consistency: changes are applied at the subscriber in the order
they were committed at the publisher.
Merge replication uses conflict resolution to manage concurrent updates.
Snapshot replication does not maintain continuous consistency, as it’s only as current as the last snapshot.
14. Monitoring and Troubleshooting
Monitor replication agents using Replication Monitor in SQL Server Management Studio to identify issues with
latency, job failures, and connectivity.
Use replication system tables and views such as MSdistribution_status, MSmerge_conflicts_info, and
MSlogreader_history for insights into replication health and conflicts.

15. Backup and Restore Implications


Replication settings are not fully backed up with the database. Ensure replication is set up properly after database
restores.
Be cautious with replication in disaster recovery, especially if the distributor or publisher needs to be restored.

16. Maintenance and Cleanup


Periodically clean up the distribution database to avoid it growing excessively.
Monitor agent job history and distribution retention settings to keep replication running smoothly.

17. Failover and Recovery


Replication does not automatically handle failover. If a publisher or distributor fails, manual intervention is needed.
For disaster recovery, ensure replication settings are documented, and test failover scenarios.

18. Replication Limitations

Replication is not intended for automatic failover or high availability; instead, it is best suited for data distribution
and load balancing.
Certain objects, such as system tables and some types of triggers, cannot be replicated directly.

19. Best Practices


Limit Replicated Data: Only replicate necessary tables and columns to reduce overhead.
Avoid Conflicts: In merge replication, minimize the risk of conflicts by segmenting data so that each site primarily
updates its own subset.
Stagger Large Snapshots: For snapshot replication, schedule snapshots during low-usage times to minimize impact.

20. SQL Server Version Compatibility


Ensure all instances involved in replication are compatible in terms of SQL Server versions. Transactional replication
allows for different SQL Server versions but has limitations for backward compatibility.

Replication in SQL Server provides a powerful way to distribute and synchronize data, but it requires careful
planning, configuration, and monitoring to ensure smooth operation and data integrity.
Important points on SQL Server Failover Cluster:
Here are essential points for understanding and configuring SQL Server Failover Cluster Instances (FCI), a high-
availability solution for SQL Server:
1. Overview of Failover Clustering
SQL Server Failover Clustering provides high availability at the instance level by allowing a SQL Server instance to
fail over automatically to another server (node) in the event of a failure.
FCI requires a Windows Server Failover Cluster (WSFC), with two or more nodes configured to operate as a single
system.

2. Shared Storage Requirement


FCI requires shared storage (e.g., SAN or SMB share) accessible by all nodes in the cluster, where the SQL Server data
and log files are stored.
Only the active node can access the storage at any time. During failover, storage is released by the failing node and
taken over by the new active node.

3. Cluster Resources and SQL Server Role


The SQL Server FCI includes resources like IP address, network name, and shared storage that are managed by WSFC.
When failover occurs, the SQL Server role moves to the new active node, bringing online all necessary resources to
continue operations.

4. Quorum Configuration
The quorum is essential for cluster availability: more than half of the voting members (nodes plus any witness) must
be online and able to communicate for the cluster to operate.
Quorum types include Node Majority, Node and Disk Majority, Node and File Share Majority, and Disk Only. Proper
quorum configuration prevents “split-brain” scenarios.
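
The majority rule can be illustrated with a small vote count. This sketch assumes a static model in which every
node and the witness hold exactly one vote; features that adjust votes at run time are deliberately ignored, and
the function name is an assumption.

```python
def has_quorum(votes_online: int, total_votes: int) -> bool:
    """True when strictly more than half of all configured votes are online."""
    if not 0 <= votes_online <= total_votes:
        raise ValueError("votes_online must be between 0 and total_votes")
    return 2 * votes_online > total_votes


# Two nodes plus a witness (3 votes): losing one node still leaves a majority.
print(has_quorum(votes_online=2, total_votes=3))  # True
# Two nodes without a witness (2 votes): losing one node loses quorum.
print(has_quorum(votes_online=1, total_votes=2))  # False
```

This is why a witness is commonly added to clusters with an even number of nodes: it makes the total vote count odd,
so a clean half-and-half split cannot occur.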

5. Automatic and Manual Failover


FCI supports automatic failover when a node fails, with minimal disruption to clients.
Manual failover can also be initiated for planned maintenance or testing by moving the SQL Server role to another
node.

6. Network and Virtual Name


FCI uses a virtual network name and IP address for client connections. Clients connect to the virtual name, and the
connection is automatically redirected to the current active node.
This virtual name remains the same during failover, so no reconfiguration is needed on the client side.

7. SQL Server Licensing


In an active-passive cluster, only the active SQL Server instance normally needs to be licensed; whether the passive
node is covered depends on Microsoft’s current licensing terms (for example, Software Assurance). If nodes are
configured as active-active (where each node hosts a different SQL Server instance), both nodes may require licenses.
8. Operating System and Version Compatibility
SQL Server FCI requires a Windows Server Failover Cluster, supported on certain editions of SQL Server (e.g.,
Standard and Enterprise).
All nodes in the cluster should have the same SQL Server version, edition, and patch level to avoid compatibility
issues.

9. Monitoring and Health Checks


WSFC continuously monitors the health of SQL Server and its resources. If the SQL Server service on the active node
fails, WSFC initiates a failover.
Regularly test failover and monitor the Windows Failover Cluster logs and SQL Server logs to ensure high availability.

10. Storage Considerations


FCI requires high-speed, reliable storage since all SQL Server data and log files must be on shared storage (tempdb
is the exception in versions that allow it on local storage).
Implement RAID configurations for storage redundancy and performance, as storage failure would cause the entire
cluster to be unavailable.

11. Instance-Level High Availability


FCI protects the entire SQL Server instance, including databases, logins, agent jobs, and configurations. In contrast,
solutions like AlwaysOn Availability Groups provide database-level protection.
FCI is suited for environments where instance-level high availability is essential and where storage can be shared
across nodes.

12. Limitations of FCI


FCI does not protect against storage failures because shared storage is a single point of failure. Implementing storage
redundancy is critical.
There is no automatic recovery from storage issues; if storage fails, the entire FCI will be down until storage is
restored.
13. Data Backup and Recovery
FCI ensures high availability but doesn’t eliminate the need for regular backups. Implement a backup strategy for all
databases in the cluster.
Backups should be stored in a location accessible by all nodes or on a network share to ensure they can be accessed
during failover.

14. Testing Failover and Recovery Procedures


Regularly test failovers to ensure all resources, including network configurations and storage connections, are
working correctly.
Simulate failover scenarios to verify that both the failover and failback processes are seamless.

15. SQL Server Agent Jobs and Alerts


SQL Server Agent jobs are part of the instance and fail over along with it, ensuring they remain operational on the
new active node.
Configure alerts and notifications to report on failovers and other critical events to facilitate proactive management.

16. Maintenance and Patching


Use rolling updates for SQL Server FCI: patch a passive node while the active node stays online, minimizing
downtime.
During maintenance, move the SQL Server role to an already patched node, patch the former active node, and then fail
back if desired. Repeat this for each node until all are up to date.

17. Client Connection and Timeout Configuration


FCI failover causes a brief disruption in client connections. Configure client applications with retry logic or adjust the
connection timeout to handle short failover delays.
Clients should connect using the virtual name to avoid reconfigurations post-failover.
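
Client-side retry logic for the brief outage during failover can be sketched as below. This is a generic pattern,
not a SQL Server API: the function name, the retry counts, and the use of ConnectionError as the transient error
type are assumptions, and `connect` stands in for whatever call the application's driver provides.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(connect: Callable[[], T],
                       attempts: int = 5,
                       initial_delay: float = 1.0,
                       backoff: float = 2.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call connect() until it succeeds, waiting longer after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            sleep(delay)
            delay *= backoff
    raise AssertionError("unreachable")


# Simulated failover: the first two attempts fail, the third succeeds.
outcomes = iter([ConnectionError("failing over"), ConnectionError("failing over"), "connected"])


def fake_connect():
    result = next(outcomes)
    if isinstance(result, Exception):
        raise result
    return result


waits = []
print(connect_with_retry(fake_connect, sleep=waits.append))  # connected
print(waits)  # [1.0, 2.0]
```

Passing `sleep` as a parameter keeps the example testable without real delays; production code would use the
default.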

18. Logging and Troubleshooting


Check logs for both SQL Server and Windows Failover Cluster Manager when troubleshooting issues. Cluster logs
provide insights into failover events and resource health.
Use tools like Cluster Validation Wizard to verify FCI setup and ensure it meets best practices for reliability and
performance.

19. SQL Server TempDB Configuration


Starting with SQL Server 2012, tempdb can be placed on local storage for performance reasons, rather than shared
storage, since tempdb is recreated each time the instance starts. The same local path must exist on every node.
20. Integration with Disaster Recovery Solutions
Combine FCI with AlwaysOn Availability Groups for disaster recovery across data centers. This setup provides high
availability at the instance level within a data center (FCI) and disaster recovery at the database level across data
centers (AG).
Other options include Database Mirroring or Log Shipping to replicate individual databases hosted on the FCI to an
offsite location.
SQL Server Failover Cluster Instances provide robust instance-level high availability within a single data center but
require careful planning, testing, and ongoing monitoring to ensure resilience and minimize downtime.

Important points on SQL Server AlwaysOn:


SQL Server AlwaysOn offers high availability and disaster recovery solutions that enhance database availability across
multiple servers and locations. Here are the key points regarding SQL Server AlwaysOn:
1. AlwaysOn Availability Groups (AG) Overview
AlwaysOn Availability Groups provide database-level high availability and disaster recovery for a set of user
databases (an availability group, or AG) that fail over together.
It allows for multiple secondary replicas that can be in the same data center or distributed across different
geographic locations.
Unlike Failover Cluster Instances (FCI), AGs work at the database level rather than the instance level, providing
flexibility in protecting specific databases.

2. Components of AlwaysOn Availability Groups


Primary Replica: The main instance where the data is read-write and client applications connect for write operations.
Secondary Replica(s): Additional copies of the primary database; can be configured as synchronous (for automatic
failover) or asynchronous (for disaster recovery).
Listener: A virtual network name that enables seamless client redirection to the current primary replica.

3. Synchronous and Asynchronous Modes


Synchronous Mode: The primary waits for the secondary to harden the log before confirming a commit, so a synchronized
secondary has no data loss. Suitable for high-availability scenarios where network latency between replicas is low.
Asynchronous Mode: The primary commits without waiting for the secondary, so the secondary may lag behind and some
data can be lost on failover. This mode is best for disaster recovery across geographically distant locations.
4. Automatic and Manual Failover
Automatic Failover: Available only in synchronous mode with at least one secondary replica configured for automatic
failover. Provides minimal downtime.
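Manual Failover: Can be initiated by a DBA, allowing for controlled switchover during maintenance or testing. A
planned manual failover without data loss requires a synchronized synchronous-commit secondary; an asynchronous
secondary supports only a forced failover, which may lose data.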

5. Read-Only Secondary Replicas


Secondary replicas can be configured as read-only, allowing for reporting, analytics, and backup operations off the
primary, reducing its workload.
Read-intent routing can be configured so that specific read-only queries are automatically redirected to the
secondary replica(s).

6. Automatic Page Repair


AlwaysOn AG can automatically repair certain types of corrupted pages: a replica that cannot read a page requests a
clean copy from another replica (the primary from a secondary, or a secondary from the primary), enhancing data
integrity and resilience.

7. Backup on Secondary Replicas


Backups can be taken from secondary replicas to offload the primary, which improves performance on the primary
instance.
Secondary replicas support copy-only full backups and regular transaction log backups; differential backups must be
taken on the primary.

8. Quorum Configuration in WSFC


AlwaysOn AG relies on Windows Server Failover Clustering (WSFC) to monitor cluster health and manage failover.
Quorum is critical in WSFC, ensuring that the cluster has a consensus on which node should be active. Configure the
quorum mode based on the number of nodes and network topology.

9. Availability Group Listener


The AG listener is a virtual network name that allows clients to connect to the current primary replica without
changing connection strings.
Listeners enable seamless failover by redirecting clients to the new primary replica after failover.

10. Multi-Subnet Support for DR


AlwaysOn AG supports multi-subnet configurations, enabling disaster recovery across geographically separate data
centers.
This configuration provides cross-site failover capabilities by adjusting client and DNS settings for quick redirection
during failover.
11. Failover Policy and Health Detection
The failover policy can be customized to handle health detection of replicas, such as session timeouts, database
health, and WSFC node health.
SQL Server checks replica health periodically and triggers failover based on the configured policy.

12. Database Requirements for AGs


All databases in an AG must be in the full recovery model, because replicas are kept in sync by sending transaction
log records from the primary to the secondary replicas.
The database names and structures must match on all replicas to ensure consistency.

13. Supported SQL Server Editions


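AlwaysOn AGs are available in SQL Server Enterprise Edition with up to eight secondary replicas (SQL Server 2014 and
later). Standard Edition (SQL Server 2016 and later) offers Basic Availability Groups, limited to two replicas and
one database per group, with no readable secondary.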

14. Monitoring and Alerts


Use the AlwaysOn Dashboard in SQL Server Management Studio to monitor the health and status of AGs.
SQL Server provides DMVs (Dynamic Management Views) to monitor replica health, synchronization status, failover
events, and backup preferences.
15. Client Connection and Connection Resiliency
Use the MultiSubnetFailover=True parameter in connection strings for multi-subnet deployments to enable faster
reconnections.
Clients must implement connection retry logic to handle transient connection issues during failover events.
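
The connection-string options for listener connections can be sketched as follows. The keywords
MultiSubnetFailover and ApplicationIntent are shown in ADO.NET-style syntax; the helper name, listener name, and
database name are hypothetical, and read-only routing must also be configured on the AG for ApplicationIntent to
redirect queries.

```python
def ag_connection_string(listener: str,
                         database: str,
                         read_only: bool = False,
                         multi_subnet: bool = False) -> str:
    """Build an ADO.NET-style connection string that targets an AG listener."""
    parts = [
        f"Server={listener}",
        f"Database={database}",
        "Integrated Security=True",
    ]
    if read_only:
        # Signals read intent so the listener may route to a readable secondary.
        parts.append("ApplicationIntent=ReadOnly")
    if multi_subnet:
        # Speeds up reconnection when the listener has IPs in several subnets.
        parts.append("MultiSubnetFailover=True")
    return ";".join(parts)


print(ag_connection_string("ag-listener", "Sales", read_only=True, multi_subnet=True))
# Server=ag-listener;Database=Sales;Integrated Security=True;ApplicationIntent=ReadOnly;MultiSubnetFailover=True
```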

16. Data Synchronization Impact on Performance


Synchronous replicas can add latency due to the overhead of transaction synchronization; configure asynchronous
replicas for disaster recovery to avoid performance impact.
Consider the network bandwidth and latency when setting up replicas in distant locations, especially for
synchronous configurations.

17. Maintenance and Patching


Use rolling upgrades to patch replicas individually without taking the entire AG offline: patch the secondary
replicas first, fail over to a patched, synchronized secondary, and then patch the former primary.
Test failover before and after patching to verify that failover functionality remains intact.

18. Licensing Considerations


Each server hosting a replica requires a SQL Server license. Secondary replicas used solely for high availability may be
eligible for license discounts, depending on Microsoft licensing terms.
19. Failover Groups for Geo-Replication
For cross-region DR, consider Failover Groups (available in Azure SQL Database and Azure SQL Managed Instance) to
manage groups of databases with automatic failover across regions.

20. Limitations of AlwaysOn AGs


Instance-Level Configurations like logins, linked servers, and SQL Agent jobs aren’t replicated. These must be
manually created or synchronized on all replicas.
Certain features, such as FILESTREAM and cross-database transactions, have limited support in AGs.

SQL Server AlwaysOn Availability Groups are a powerful solution for high availability and disaster recovery, providing
granular control and flexibility for managing critical databases across nodes and locations.
