O31 SQL Server High Availability: Mike Shelton
Mike Shelton
IBM xSeries
Aug. 9 - 13, 2004
Technical Conference
Chicago, IL
Overview
• Defining High Availability
• Setting High Availability Goals and Identifying
Barriers
• SQL Server 2000 HA Technology
– Failover Clustering
– Log Backup Shipping
– Replication
• High Availability Operations and Support
Defining Five Nines
• What Is 99.999%?
– Target:
• Online and available to users 24 hours a day,
365 days a year
• Total outages less than 5.26 minutes per year
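The 5.26-minute figure follows directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365-day year as on the slide (the function name `downtime_budget_minutes` is illustrative, not from the presentation):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years are ignored


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the yearly downtime, in minutes, allowed by an availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be a percentage between 0 and 100")
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    # Compare common availability targets, from "two nines" to "five nines".
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = downtime_budget_minutes(target)
        print(f"{target}% -> {minutes:.2f} minutes of downtime per year")
```

For 99.999% this gives roughly 5.26 minutes per year, matching the target above; each additional nine cuts the budget by a factor of ten.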
• Well-managed Nodes
– Masks Some Hardware Failures
• Well-managed Packs and Clones
– Masks Hardware Failures
– Masks Operations Tasks (e.g. Software Upgrades)
– Masks Some Software Failures
• Well-managed Geoplex
– Masks Site Failures (Power, Network, Fire, Move…)
– Masks Some Operations Failures
Business Makes High Availability Necessary
• Reliance on technology
– Hospital has high cost for downtime (lives!)
• Product or service availability
• Continuous improvement of products, services,
and processes
– Example: It would be a failure if “the person to call”
left the company two years ago and nobody can
currently offer expertise.
• DBA interest in continuous improvement of
products, services, and processes
What High Availability Is Not
• A Technology Solution From a Vendor
• A Scalability Solution
• An IT Decision Without Business Knowledge
• A Business Decision Isolated From the Cost of
Downtime
• It is:
– A solution involving people and process, and, very
likely, technology
High Availability Framework
[Diagram labels: Business Availability Goals; Other relevant factors; Database Size; Barriers to Availability; Throughput Requirements; 4: Development; 5: Telecommunication Fees; 6: Downtime]
Microsoft SQL Server 2000 High Availability Technology
• SQL Server 2000 Editions Suitable for High
Availability
• Comparison of Standby Options
• High Availability Features
• Failover Clustering
• Log Shipping
• Transactional Replication
• Using Combinations of Technologies
SQL Server 2000 Editions Suitable for High Availability
• Enterprise Edition
– Most Scalable and Highly Available
– Includes Failover Clustering
– Includes Log Shipping Features
– Suitable for Production
• Developer Edition
– Full Featured (Same As Enterprise Edition)
– Suitable for Development and Testing
Comparison of Standby Options
• Hot Standby
• Warm Standby
• Cold Standby
High Availability Features
• Standby Type
• Failure Detection
• Automatic Failover
• Masks Disk Failure
• Masks SQL Process Failure
• Masks Other Process Failure
• Meta Data Support
• Transactionally Consistent
• Transactionally Current
• Perceived Downtime
• Transparent to Client
• Special Hardware Needed
• Distance Limit
• Complexity
• Standby Accessible
• Impact on Performance
• Impact on Backup Strategy
Failover Clustering
• Types of Clusters
• Windows Clustering
• SQL Server 2000 Failover Clustering
• How Failover Clustering Works
• Enhancements to Failover Clustering
• High Availability Features in Failover
Clustering
Types of Clusters
• Windows Cluster
• Failover Cluster
• Federated Cluster
• Network Load Balancing Cluster
Failover Clustering
Windows Clustering
• Hardware Components
– Cluster node
– Heartbeat
– External network
– Shared cluster disk array
– Quorum drive
• Software Components
– Cluster name
– Cluster IP address
– Cluster administrator account
– Cluster resource
– Cluster group
• Virtual Server
SQL Server 2000 Failover Clustering
[Diagram: 4-node failover cluster attached to the Public Network]
High Availability Features in Failover Clustering (1 of 2)
Availability Feature          Failover Clustering
Standby Type                  Hot
Failure Detection             Yes
Automatic Failover            Yes
Masks Disk Failure            No; Shared Disk
Masks SQL Process Failure     Yes
Masks Other Process Failure   Yes
Meta Data Support             All System and Database
Transactionally Consistent    Yes
Transactionally Current       Yes, Always Up to Date
High Availability Features in Failover Clustering (2 of 2)
Availability Feature          Failover Clustering
Perceived Downtime            30 Seconds + DB Recovery
Transparent to Client         Yes, Reconnect to Same IP
Special Hardware Needed       Specialized Hardware from Cluster HCL
Distance Limit                100 Miles
Complexity                    More
Standby Accessible            Standby never accessible
Impact on Performance         No Impact
Impact on Backup Strategy     Must be able to back up from any node
SQL Server Failover Clustering
• Hot Standby Solution
• Best High Availability Configuration
– Redundant System
– Shared Access to the Database Files
– Recovery in Seconds
– Automatic Failure Detection
– Automatic Failover
– Minimal Client Application Awareness
• Built on Microsoft Cluster Server
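"Minimal Client Application Awareness" still implies some awareness: during a failover the connection drops, and the client has to reconnect to the same virtual server address once the database has recovered. A minimal retry sketch follows. The `connect` callable, the use of `ConnectionError`, and the default timings are assumptions for illustration; a real database driver raises its own exception types.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 10,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or the attempts are used up.

    The address never changes: after a failover the same virtual server
    name/IP is served by a different node, so the client simply retries.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError as error:  # assumption: driver errors map to this type
            last_error = error
            if attempt < attempts:
                sleep(delay_seconds)  # wait out failover and database recovery
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

The total wait (`attempts` times `delay_seconds`) should comfortably exceed the expected failover time, which the feature table puts at about 30 seconds plus database recovery.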
Log Shipping
• Warm Standby Solution
• Applies Transaction Log From Primary Server
(Primary) to Warm Standby (Secondary)
• Attributes of Log Shipping
– Warm Standby Available for Limited Read-Only Use
– All Logged Schema and Data Changes Applied
– Cannot Filter Changes for Partitioning or Subsets
• Manual Failure Detection; Manual Failover
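Monitoring Server
[Diagram: log shipping flow from the primary server to the warm standby]
1. BACKUP transaction log (transaction-log dump on the primary)
2. Log COPY (“Pulled”)
3. RESTORE transaction log WITH STANDBY (transaction-log dump on the standby)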
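The three numbered steps of the log shipping cycle (backup, copy, restore) can be sketched as follows. This is an illustration only: it builds the T-SQL text for steps 1 and 3 and copies the file for step 2, but it does not connect to any server. The class and function names, paths, and database name are hypothetical, and the string formatting does no escaping, so it is not safe for untrusted input.

```python
import shutil
from dataclasses import dataclass


@dataclass
class LogShippingPaths:
    """Hypothetical locations used by one backup/copy/restore cycle."""
    database: str
    primary_backup_file: str  # step 1: where the primary writes the log backup
    standby_backup_file: str  # step 2: where the copy lands on the standby
    standby_undo_file: str    # step 3: undo file required by WITH STANDBY


def backup_log_statement(p: LogShippingPaths) -> str:
    """Step 1: T-SQL to back up the transaction log on the primary."""
    return f"BACKUP LOG [{p.database}] TO DISK = N'{p.primary_backup_file}'"


def copy_log_backup(p: LogShippingPaths) -> None:
    """Step 2: copy the backup file to the standby (run there, so it is 'pulled')."""
    shutil.copyfile(p.primary_backup_file, p.standby_backup_file)


def restore_log_statement(p: LogShippingPaths) -> str:
    """Step 3: T-SQL to apply the log on the standby, leaving it readable."""
    return (
        f"RESTORE LOG [{p.database}] FROM DISK = N'{p.standby_backup_file}' "
        f"WITH STANDBY = N'{p.standby_undo_file}'"
    )


if __name__ == "__main__":
    paths = LogShippingPaths(
        database="Sales",
        primary_backup_file=r"\\primary\logship\sales_log.trn",
        standby_backup_file=r"E:\logship\sales_log.trn",
        standby_undo_file=r"E:\logship\sales_undo.dat",
    )
    print(backup_log_statement(paths))
    print(restore_log_statement(paths))
```

WITH STANDBY is what keeps the secondary available for limited read-only use between restores, at the cost of disconnecting readers each time a new log is applied.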
Availability Feature        Failover Clustering                      Log Shipping
Transactionally Current     Yes, Always Up to Date                   No, Since Last Log Backup
Transparent to Client       Yes, Reconnect to Same IP                No, App must know standby
Special Hardware Needed     Specialized Hardware from Cluster HCL    No; Duplicate system needed
Distance Limit              100 Miles                                Dispersed
Complexity                  More                                     Some
Standby Accessible          Standby never accessible                 Yes, Multiple Copies, Read-only; % depends on update frequency
Impact on Backup Strategy   Must be able to back up from any node    Minimal – many small backups
High Availability Uses of Log Shipping
• Shorter failover time
• If there is a high incidence of user error and a
need to recover data frequently without
recovering the whole database
– Allows you time-delay possibilities
• 5 hours behind
• 8 hours behind
• Increase data redundancy
• Less complex hardware – no HCL
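The time-delay idea works by copying log backups promptly but restoring each one only after it has aged past the chosen delay, so a user error on the primary can be caught before it reaches the standby. A minimal sketch of that selection rule, with hypothetical names and timestamps (the presentation does not show how the delay is implemented):

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def backups_due_for_restore(
    backups: List[Tuple[str, datetime]],
    now: datetime,
    delay: timedelta,
) -> List[str]:
    """Return the log backups old enough to apply, oldest first.

    Each entry is (file name, time the backup was taken). Log backups must
    be restored in sequence, so the result is ordered by backup time.
    """
    cutoff = now - delay
    due = sorted((taken_at, name) for name, taken_at in backups if taken_at <= cutoff)
    return [name for _, name in due]


if __name__ == "__main__":
    now = datetime(2004, 8, 9, 12, 0)
    backups = [
        ("log_0900.trn", datetime(2004, 8, 9, 9, 0)),
        ("log_0600.trn", datetime(2004, 8, 9, 6, 0)),
        ("log_0300.trn", datetime(2004, 8, 9, 3, 0)),
    ]
    # A standby kept 5 hours behind applies more backups than one kept 8 hours behind.
    print(backups_due_for_restore(backups, now, timedelta(hours=5)))
    print(backups_due_for_restore(backups, now, timedelta(hours=8)))
```

Running several standbys with different delays, such as the 5-hour and 8-hour examples above, trades recovery time against the window available to notice a mistake.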
When to Consider Using Replication for HA
• After Considering Failover Clustering
• After Considering Log Shipping
• System and Some User Metadata is Not Replicated
• Failure Detection and Failover Are Not Automatic
– Standby Server is Not Identical to the Primary
• Not Guaranteed to be Transactionally Current
– Merge Replication is not Transactionally Consistent
• Replication Uniquely Allows:
– Partitioning of Data on the Standby Server
(however, the standby server is then not identical to the primary server)
– Offline Access to the Data without Periodic Termination
Transactional Replication
• Warm Standby Solution
• Propagates Transactions From Primary
Server (Publisher) to Warm Spare
(Subscriber)
• Use Replication to Create
– A Read-Only Spare
– A Scale Out Solution
– A Partitioned Solution
• Manual Failure Detection; Manual
Failover
Comparing Clustering, Log Shipping, and Transactional Replication (1 of 2)
Availability Feature        Failover Clustering   Log Backup Shipping   Transactional Replication
Standby Type                Hot                   Warm                  Warm
Failure Detection           Yes                   No, NLB helps         No
Automatic Failover          Yes                   No, NLB helps         No, NLB helps
Masks Disk Failure          No; Shared Disk       Yes                   Yes
Masks SQL Process Failure   Yes                   Yes                   Yes
Questions?