Alwayson WRT HA and DR in SQL Server
Alwayson WRT HA and DR in SQL Server
com
Praveen Madupu - +91 98661 30093
Sr SQL Server DBA, Dubai
[email protected]
AlwaysOn Availability Groups (AG) in SQL Server are a robust feature for achieving High Availability (HA) and Disaster
Recovery (DR). They provide database-level protection by grouping a set of databases into an availability group and
replicating them to one or more secondary replicas. These replicas can be used for failover in the event of a failure, ensuring
the availability of your critical data.
Let's go into detail on how AlwaysOn fits into HA and DR concepts, covering key features, configurations, and best practices.
In a High Availability setup, AlwaysOn leverages Synchronous Commit mode to ensure that data is committed to both the
Primary Replica and the Secondary Replica at the same time. This guarantees zero data loss because transactions are
committed on both replicas before being confirmed to the client.
● Zero Data Loss: Ensures that no transactions are lost in the event of a primary server failure.
https://www.sqldbachamps.com
●
●
Automatic Failover: If the primary replica fails, the system can automatically failover to the synchronous secondary
replica without manual intervention.
Minimal Downtime: The failover process is near-instantaneous, with minimal impact on availability.
HA Use Case:
Synchronous commit mode is ideal for mission-critical applications where data loss is unacceptable, such as financial
systems or e-commerce websites.
2. Automatic Failover
Automatic failover is the key to high availability. It is enabled only when the secondary replica is configured in Synchronous
Commit mode. During failover, the secondary replica assumes the role of the primary replica without manual intervention.
Failover Types:
● Planned Failover: Manually initiated by an administrator to switch roles between replicas, typically for
maintenance.
● Unplanned Failover: Triggered automatically when the primary replica goes offline unexpectedly due to hardware
failure, software issues, or network outages.
An AlwaysOn Availability Group can be deployed on top of a Windows Server Failover Cluster (WSFC), where quorum is
a critical component to ensure that the system can reliably determine the majority of functioning nodes. Quorum ensures that
only a majority of nodes are allowed to act as the primary replica.
Quorum Models:
● Node Majority: The quorum is based on the number of nodes in the cluster.
● Node and Disk Majority: Involves using a shared disk witness to provide quorum.
● Node and File Share Majority: Uses a file share as a quorum witness, useful for multi-site clusters where disk
sharing is not possible.
https://www.sqldbachamps.com
1. Asynchronous Data Replication
For Disaster Recovery, AlwaysOn supports Asynchronous Commit mode, where data is written to the primary replica
without waiting for acknowledgment from the secondary replica. This configuration prioritizes performance over data
consistency, which can lead to a slight risk of data loss, but it is suitable for DR across geographically distant locations.
● Minimal Latency: The primary replica does not wait for the secondary replica to acknowledge commits, reducing
the overhead on performance.
● Geographically Distributed DR: The secondary replica can be located far away, even in a different region or data
center.
● Potential for Data Loss: There is a possibility of data loss if the primary replica fails before the secondary replica
has synchronized the data.
DR Use Case:
Asynchronous commit mode is useful in environments where some data loss is acceptable but high performance and
geographic distribution are priorities. Examples include global e-commerce platforms, where different data centers are set up
across the world.
https://www.sqldbachamps.com
Praveen Madupu - +91 98661 30093
Sr SQL Server DBA, Dubai
[email protected]
2. Manual Failover for DR
Unlike high availability scenarios, manual failover is typically used in disaster recovery. When the primary replica is
unavailable due to a disaster, administrators can initiate a manual failover to promote the secondary replica, which may be
located at a remote site.
● Controlled Recovery: Allows administrators to carefully assess the situation before initiating failover, ensuring the
right course of action.
● Protection Against Site Failures: With secondary replicas located in remote data centers, disaster recovery can
ensure business continuity even when a complete site goes down.
In a disaster recovery scenario, secondary replicas are often located in different regions or data centers, sometimes in
different countries. This setup is ideal for protection against regional or site-level disasters. Replication across geographically
distant replicas can be done using asynchronous commit mode to avoid latency issues.
4. Backup Offloading
AlwaysOn allows backup offloading to secondary replicas, ensuring that backup operations do not interfere with the primary
https://www.sqldbachamps.com
replica's performance. This feature can be used both for high availability and disaster recovery configurations.
● Reduced Load on Primary: Frees up the primary replica to handle critical read-write operations.
● DR-Ready Backups: Backups can be stored in secondary locations, providing redundancy in case of disaster.
https://www.sqldbachamps.com
Praveen Madupu - +91 98661 30093
Sr SQL Server DBA, Dubai
[email protected]
1. Primary Replica
● The replica that actively handles all read-write transactions. It continuously replicates changes to one or more
secondary replicas.
2. Secondary Replicas
● Synchronous Secondary Replica: Used for high availability. It mirrors the primary in real-time, ensuring data
consistency and zero data loss.
● Asynchronous Secondary Replica: Used for disaster recovery. This replica might have slight delays in data
synchronization, but it provides geographical redundancy.
The Failover Cluster Manager is responsible for detecting failures in the AlwaysOn architecture and initiating automatic
failovers if necessary. It handles the coordination between replicas and ensures that services remain online.
https://www.sqldbachamps.com
4. SQL Server Browser
For client connectivity in AlwaysOn, the SQL Server Browser helps route incoming client requests to the correct SQL
Server instance, whether it’s the primary replica or a readable secondary replica.
3. Geo-Distributed DR Only
https://www.sqldbachamps.com
4. Regularly Test Failover: Simulate both automatic and manual failovers periodically to ensure the system is
prepared for actual failure events.
5. Monitor Quorum and Voting: Ensure that quorum is properly configured, especially in multi-site setups, to avoid
split-brain scenarios.
6. Secure Data Transmission: Use encryption and secure protocols to protect data replication traffic between primary
and secondary replicas, especially in geographically dispersed setups.
Summary:
SQL Server AlwaysOn Availability Groups provide a versatile solution for achieving High Availability and Disaster
Recovery. By leveraging synchronous replication, automatic failover, and geographically distributed replicas, businesses can
ensure data availability, protect against disasters, and optimize resource usage for read and backup operations. Correctly
configuring AlwaysOn for both HA and DR needs is essential to maintaining service continuity and protecting critical data.