0% found this document useful (0 votes)
77 views34 pages

Coeo SQL Server 2008 High Availability

The document discusses various business continuity solutions for SQL Server including failover clustering, database mirroring, log shipping, and replication. It provides details on the fundamentals, automatic vs manual failover capabilities, data loss potential, and scope of each solution. Combining solutions like database mirroring with failover clustering can provide a fault-tolerant configuration for critical database systems.

Uploaded by

deepakshiva
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
77 views34 pages

Coeo SQL Server 2008 High Availability

The document discusses various business continuity solutions for SQL Server including failover clustering, database mirroring, log shipping, and replication. It provides details on the fundamentals, automatic vs manual failover capabilities, data loss potential, and scope of each solution. Combining solutions like database mirroring with failover clustering can provide a fault-tolerant configuration for critical database systems.

Uploaded by

deepakshiva
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 34

Justin Langford Principal Consultant

The business continuity problem Cold/ Warm/ Hot Standby Solutions Failover Clustering Database Mirroring Combining Availability Technologies

Fundamentals No failover, potential data loss


Backup and Restore Third party solutions: storage snapshots, VDI Detach/ Attach

Distributed Data Manual failover, possible data loss


Log Shipping Peer-to-peer replication Database Mirroring (High Performance mode) Data Mirroring (third party hardware/ software mirroring

High Availability Automatic failover, no data loss


Database Mirroring (High Availability mode) Failover clustering Multi-site (Geo) clusters

Operator

Majority of downtime attributed to System Admins Application Software can improve fault tolerance System Software can help at all levels

Application Software System Software Hardware

Commodity hardware now very reliable

Evaluating business continuity solutions


Automatic or Manual Detection Automatic or Manual Failover Failover time Number of failures survived Data Currency / Loss Data Consistency Granularity: Instance, Database, Table, Page, Row Cost - hardware, network, additional management Complexity Recovery Point Interval/ Recovery Time Interval
we need zero downtime, zero data loss

Transparency to clients

The business continuity problem Cold/ Warm/ Hot Standby Solutions Failover Clustering Database Mirroring Combining Availability Technologies

Backup / Restore

Backup/ Restore & Detach/ Attach

Manual detection and failover Potential for data loss Clients must be redirected Slowest failover most downtime Backup / Restore Log backups allow point in time restore
Detach / Copy / Attach

Detach / Copy / Attach Copies entire files No rolling forward logs

Replication

Replication and Log Shipping


Manual detection and failover Replication since SQL Server 6.0
Primarily used where availability is required in conjunction with scale out of read activity Failover requires custom solution Could define subset of source database Latency between source and copy can be seconds

Log Shipping Log Shipping

Basic idea: Backup, Copy, Restore Log will always be supported Database scope Database accessible but read-only Users must exit for next log to be applied

Peer-to-Peer Transactional Replication

Peer-to-Peer Transactional Replication

SQL Server 2005 introduced bi-directional transactional replication


Enables scale-out and improved availability Warm / hot standby
Possibility of data loss on failure

SQL Server 2008 improvements


Conflict detection
Each row has a hidden column listing originating peer node ID for change Distribution Agent on each node detects conflicts by comparing hidden column

New Topology Wizard makes setup/ changes easier

Failover Cluster

Failover clustering and Database Mirroring

Both solutions provide


Automatic detection Automatic, fast failover Manual failover Transparent client redirection Zero work loss Site redundancy

Database Mirroring

The business continuity problem Cold/ Warm/ Hot Standby Solutions Failover Clustering Database Mirroring Combining Availability Technologies

Failover Cluster

* Inst1
SRV2 SRV1 Hot Standby Automatic detection and failover Multiple nodes provide availability Failover transparent to client Windows Server 2008 EE supports 16 node clusters Windows 2008 clusters must pass validation tests (no HCL) Supports many scenarios: Multiple Active Instances, N+1, N+I

Multiple Active Instances

N+1: N Active, 1 Inactive Instances

N+I: N Active, I Inactive Instances

* Inst1 Inst3 * Inst2 *

* Inst1 Inst2 *

Failover Cluster

* Inst1

Node Majority Each node has a vote Cluster functions with majority of votes
Node and Disk Majority Each node plus the disk witness can vote Cluster functions with majority of votes Node and File Share Majority Each node plus the file share witness can vote Cluster functions with majority of votes

Majority means more than half

No Majority: Disk Only Cluster functions if one node is available with quorum disk presented Nodes in communication with quorum disk are cluster members

Failover Cluster

* Inst1

Choosing a Quorum Model


Description of Cluster Odd number of nodes Even number of nodes (single site) Even nodes, multi-site cluster Quorum Recommendation Node Majority Node and Disk Majority Node and File Share Majority

Failover Cluster

* Inst1

Zero work loss, minimal impact on throughput Instance Failover entire instance works as a unit Single copy of databases Available since SQL Server 7.0 No standby reporting, testing etc. Single cluster can support multiple instances Can provide site fault-tolerance

Failover Cluster

New setup for SQL Server 2008


No remote execution on cluster nodes Run SQL Server setup to Install/ uninstall / upgrade nodes Option 1: Integrated installation with Add Node Configure single-node cluster instance Run SQL Server Setup with Add Node functionality to add each node Option2: Advanced/Enterprise installation Run Setup with Prepare Failover Cluster on each node Run Setup on node owning shared disks to Complete Failover Cluster

Site redundancy
SQL Server requires single-subnet Stretch VLAN between sites

Requires synchronous storage replication Performance dependent on network RTT


Configure Heartbeat timeout Increase to extend clusters over further distances Decrease to increase failure sensitivity for faster failover

Two Quorum configurations suitable for multi-site failover clusters

Node and File Share Majority


Site C
\\SERV1\CLUS1
File Share Witness

Site A

Site B

SAN

SAN

STORAGE REPLICATION

Node Majority

Site A

Site B

SAN

SAN

STORAGE REPLICATION

The business continuity problem Cold/ Warm/ Hot Standby Solutions Failover Clustering Database Mirroring Combining Availability Technologies

Database Mirroring

Hot Standby Provides a fault-tolerant database Database Failover


Very fast failover
<5 secs (most cases)

Zero data loss

No special server, storage or network requirements Automatic or manual failover


Automatic re-sync after failover

Automatic, transparent client redirect

Application

Witness Mirror

Commit

Mirror always redoing log to remain current

Principal

2
SQL Server

SQL Server

2
Log

>2
Data

3
Log

>3
Data

Trade off: Performance vs. Safety High Availability - synchronous mirroring with a witness
Automatic detection/failover No data loss

High Protection - synchronous mirroring without a witness


Manual failover No data loss, downtime possible

High Performance - asynchronous mirroring


Manual failover Data loss possible

Available in SQL Server 2008 Enterprise Edition Automatic Page Repair


Auto-resolve some errors that prevent reading a page Retrieve fresh copy from partner Unreadable page is replaced by the copy

Repairs are asynchronous


Query on principal accessing corrupt page will fail If mirror accesses corrupt page - mirroring session is suspended

Not all errors can be repaired


823, 824 errors - Windows returns a CRC error when reading page 829 errors restore pending

Pages that cannot be repaired include:


file header page database boot page allocation bitmap pages (GAM, SGAM, PFS)

New DMV sys.dm_db_mirroring_auto_page_repair Repair logged to SQL Server Error log


... Error: 824, Severity: 24, State: 2. ... SQL Server detected a logical consistency-based I/O error: incorrect checksum ... ... Database mirroring is attempting to repair physical page (4:4256) in database MyDB" by requesting a copy from the partner. ... Database mirroring successfully repaired physical page (4:4256) in database MyDB" by obtaining a copy from the partner.

Log compression
DBM performance affected by network performance Useful low-bandwidth scenarios

Outgoing log stream - principal to mirror compressed


Requires CPU to compress/ decompress Can be disabled

High-bandwidth network performance


Heavy duty, fully-logged maintenance operations when mirroring is involved (e.g. index rebuild)

Log Compression Transactions/ second

Greatest improvement for low bandwidth networks

CPU Usage with Log Compression


CPU usage sometimes x2 Compression overhead Increased transaction throughput

The business continuity problem Cold/ Warm/ Hot Standby Solutions Failover Clustering Database Mirroring Combining Availability Technologies

Replication

Maximize availability for


Scale out
Offload primary data platform

Heavy reporting Mobile/disconnected users Autonomous business units that share data

Failover Solutions

Maximize availability of critical systems


Designed for failover
Fast, automatic

Zero data loss Transactionally current Masks planned and unplanned downtime

Failover Solution + Replication Fault-Tolerant Publisher and Distributor


Distributor Subscribers

Publisher

Principal Server can be a Failover Cluster


Failover to mirror will occur before failover within the cluster
Principal will assume role of Mirror

Mirror can be a Failover Cluster as well


Failover Cluster Failover Cluster

Principal

Mirror

Attribute Automatic failover Perceived downtime Potential data loss

Failover Cluster Yes 30 secs + recovery Single copy of data

Log Shipping No n/a Yes, most recent transaction log

Database Mirroring Yes, in High Safety < 5 secs Yes, in some configurations

Masking storage failure


Special hardware Scope

No
Must pass validation System and user databases

Yes
No User databases

Yes
No User databases

You might also like