0% found this document useful (0 votes)

4K views44 pages

Troubleshooting SQL Server AlwaysOn

Uploaded by

Alejandro Valenzuela

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

4K views44 pages

Troubleshooting SQL Server AlwaysOn

Uploaded by

Alejandro Valenzuela

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 44

Troubleshooting SQL Server AlwaysOn

Vijay Rodrigues

Summary: SQL Server AlwaysOn is the latest High Availability (HADR) offering in Microsoft SQL
Server. SQL Server AlwaysOn has been introduced in SQL 2012. This document is meant as a
quick reference. This document has common troubleshooting information that may be have
been encountered either by me or by my colleagues, with troubleshooting steps/commands
that are publicly available in SQL Server Books Online (BOL) on MSDN. Rather than having this
information in multiple blogs posts (there are already quite a few on the internet), I just felt a
combined document may make this information more readable to a user, as a quick reference
guide.
Category: Quick Reference
Applies to: SQL Server 2012, SQL Server 2014
E-book publication date: February 2014
For more titles, visit the E-Book Gallery for Microsoft Technologies.

Copyright 2014 by Microsoft Corporation

All rights reserved. No part of the contents of this book may be reproduced or transmitted in any form or by any means without
the written permission of the publisher.

Microsoft and the trademarks listed at http://www.microsoft.com/about/legal/en/us/IntellectualProperty/Trademarks/EN-US.aspx

are trademarks of the Microsoft group of companies. All other marks are property of their respective owners.
The example companies, organizations, products, domain names, email addresses, logos, people, places, and events depicted
herein are fictitious. No association with any real company, organization, product, domain name, email address, logo, person,
place, or event is intended or should be inferred.
This book expresses the authors views and opinions. The information contained in this book is provided without any express,
statutory, or implied warranties. Neither the authors, Microsoft Corporation, nor its resellers, or distributors will be held liable
for any damages caused or alleged to be caused either directly or indirectly by this book.

Introduction
SQL Server AlwaysOn is the latest High Availability (HADR) offering in Microsoft SQL Server. SQL Server AlwaysOn
has been introduced in SQL 2012. This document is meant as a quick reference. This document has common
troubleshooting information that may be have been encountered either by me or by my colleagues, with
troubleshooting steps/commands that are publicly available in SQL Server Books Online (BOL) on MSDN. Rather
than having this information in multiple blog posts (there are already quite a few on the internet), I just felt a
combined document may make this information more readable to a user, as a quick reference guide. This is an
evolving product, so hopefully I will also have future versions of this document. This document is for
troubleshooting issues related to SQL Server AlwaysOn. For benefits, pre-requisites, and configuration, please
refer below documents. Ideally the latest SQL 2012 SP/CU should be ensured after appropriate testing, since these
may have fixes for known issues mentioned later in this document:

http://technet.microsoft.com/en-us/library/hh510230.aspx#Benefits (Benefits)
http://technet.microsoft.com/en-us/library/ff878487.aspx (Prerequisites, Restrictions, and
Recommendations for AlwaysOn Availability Groups (SQL Server))
http://technet.microsoft.com/en-us/library/ff878265.aspx (Creation and Configuration of Availability
Groups (SQL Server))

Tips to search this document: Try searching on error number, or on part of error message, or on performance
issue like hang, wait type like HADR_SYNC_COMMIT, or on database state like RECOVERY_PENDING or
RESOLVING (without quotes).
Disclaimer: This document is provided AS IS with no warranties, and confers no rights. This is purely for
informational purposes. The purpose is merely to provide the basic knowledge for own personal and noncommercial use, and is not meant for advice. Use with appropriate testing.

Section a - Troubleshooting Applications

Application connectivity introduction.

Applications should use MultiSubnetFailover as indicated
http://support.microsoft.com/kb/2792139 (Time-out error
connect to a SQL Server 2012 AlwaysOn availability group
subnet environment). Additional application connectivity
reference.

in
and you cannot
listener in a multilinks mentioned for

http://technet.microsoft.com/en-us/library/gg471494.aspx
Client Support for High Availability, Disaster Recovery)

(SQL Server Native

http://technet.microsoft.com/en-us/library/hh213417.aspx (Availability Group

Listeners, Client Connectivity, and Application Failover (SQL Server))
http://technet.microsoft.com/en-us/library/gg558121.aspx
Support for High Availability, Disaster Recovery)

(JDBC Driver

Application connection string.

Connection string should use ODBC or SQL OLE DB in SNAC (application
intent=readonly is optional and depends on whether read only connections are
supported):
To use SQL Native Client SQL OLEDB, change your connection string to this
(this one is using integrated security):
provider=sqlncli11;data source= tcp:AGListener,1633;database=ag;integrated
security=sspi;application intent=readonly;MultiSubnetFailover=True
To use .Net SQLClient:
data source= tcp:
AGListener,1633;database=ag;user=sa;password=Password2;applicationintent=read
only
Finally, to use SQL Native Client ODBC, connection string can be like below:
driver={SQL Server Native Client 11.0};server= tcp: AGListener,1633;database=
CGFData;trusted_connection=yes;applicationintent=readonly;MultiSubnetFailover
=True

Application reconnects takes 1mins and 30 seconds, even though Failover of database takes 6 to 10
seconds.
Filter driver (like anti-virus etc.) may be causing the slowness in
connection. Check if anti-virus is up to date, and if SQL files are excluded
as indicated in http://support.microsoft.com/kb/309422.

Application/osql using AlwaysOn database gets disconnected when executing a failover of the Availability
Group.

This is expected. Application should have connection retry logic.

Application hang after AlwaysOn group failover.

If Java based application, then Java does not have command timeout (be
default, it is not limited). But .net has 30 sec default command timeout,
thats why .NET has no issue. Set commandtimeout in Java.

Application connects to primary replica every time even when the parameter Connection Intent =Read
Only is specified in the connection string.
Check if Routing URL is defined for each server, and if Routing List was also
not defined.
If the database part is omitted (in connection string that has AG listener),
readonly routing does not work

Intermittent timeout only for some applications. These applications are hosted on Linux/Unix.
Install SQL Server ODBC driver for Linux and check if issue reproduces with a
string like SQLCMD SAGListenerFQDN.com M with appropriate credentials.
This driver is available at http://www.microsoft.com/enus/download/details.aspx?id=28160 .
If issue does not occur with SQLCMD, then an option is to create a sample
application (so that production apps are not impacted by this data capture
attempt, since they can keep connecting directly to the SQL instance as
currently done) on any one app server (or in different server in same
subnet/datacenter) that attempts connection AGL say every 30 seconds. If this
too encounters the issue, once every about 15 minutes, then we may have a
repro. This too should not impact production.
If sample application too does not reproduce intermittent issue, then only
other way to reproduce issue is to point application to AGL.
If we have a repro, then a simultaneous Wireshark capture can be made from
application box and SQL active node so as to capture trace during issue
occurrence (and noting issue time). The capture should be saved in .cap
format to aid analysis. Based on this, additional traces may be required.
Wireshark is third-party and runs on Windows/Linux/Unix as documented in
their site http://www.wireshark.org/faq.html#q1.1 .

Always on Availability Group. Application is using SQL Login to access the databases. The application is
not able access the database after the database is failed over to the secondary.
Security identifier (SID) of login may be different for the user in both
instances. So a login with same SID (same as on primary) has to be created on
secondary.
TSQL: SELECT name, sid FROM sys.database_principals;

TSQL: CREATE LOGIN [LLL] WITH PASSWORD='dddd', DEFAULT_DATABASE=[master],

DEFAULT_LANGUAGE=[us_english], CHECK_EXPIRATION=OFF,
CHECK_POLICY=OFF,SID=0xABC;

Application encounters ODBC error after AlwaysOn group is failed over to secondary. Works fine when
AlwaysOn group is on primary. [Microsoft][ODBC SQL Server Driver][SQL Server]The EXECUTE permission
was denied on the object 'FN_ADJUSTED_DATE', database 'MyDB', schema 'dbo'.
SQL Native Client 11.x does support the new connection parameters. Older
versions of SQL Native Client do NOT support ApplicationIntent parameter.
Upgrade/install SQL Native client on the client application server. This will
upgrade ODBC etc. components on application server.

After fail over of Availability Group from one subnet to another, the ping command (to listener) from the
remote client is not resolving to newly current active IP. DNS entry for the Listener network name shows
IPs of both subnets.
If value of RegisterAllProvidersIP is set to 1 (default) for the listener on
cluster nodes, then change to 0. Value change requires the cluster service to
be cycled or for the listener network name (client access point, CAP)
resources to be restarted. This generally occurs when CAP/listener is created
using Failover Cluster Manager (FCM), rather than from SSMS (suggested).
Powershell:
Import-Module FailoverClusters
Get-ClusterResource yourListenerName|Set-ClusterParameter
RegisterAllProvidersIP 0
Cluster.exe:
cluster /cluster:<ClusterName> res <NetworkNameResource> /priv
RegisterAllProvidersIP=0

HostRecordTTL is set to 60, RegisterAllProvidersIP is set to 0, but ping to listener is still returning wrong IP
(after Availability Group failover to different subnet) for over a minute.
From client/application system, open administrator command prompt and try
ipconfig /flushdns.

Section b - Troubleshooting Network

Error TCP Provider, error: 0 - An operation on a socket could not be performed because the system
lacked sufficient buffer space or because a queue was full.
NetStat output may show hundreds of entries in TIME_WAIT state leading to
buffer/port exhaustion
Add registry setting for MaxUserPort http://support.microsoft.com/kb/196271
Add registry setting for TCPTIMEWAITDELAY.
App/IIS restart, machine reboot are additional options.

Section c - Troubleshooting Performance

Frequently Asked Questions (FAQs).

Please refer AlwaysOn Availability Groups - FAQ Part 1 and Part 2 site
links mentioned below, since they have a lot of good questions. Please note
this is a third party site and not an MS site. As indicated in the links,
these are questions and answers when discussing with an MS Program Manager of
AlwaysOn.
http://communities.quest.com/community/dataprotection/blog/2012/08/23/alwayson-availability-groups--common-questions-faq
http://communities.quest.com/community/dataprotection/blog/2012/11/21/alwayson-availability-groups--faq-part-2

Want to increase number of SQLs default health monitor files. This is useful for maintaining history,
especially where multiple failovers may be involved.
-- Below to change size/number of default system_health sessions
(\LOG\system_health_*.xel). -- Applicable for all systems, including
standalone.
-- Does not require session to be stopped.
ALTER EVENT SESSION [system_health] ON SERVER DROP TARGET package0.event_file
ALTER EVENT SESSION [system_health] ON SERVER ADD TARGET
package0.event_file(SET
filename=N'system_health.xel',max_file_size=(5),max_rollover_files=(4))
-- Below to change size/number of default AlwaysOn session
(\LOG\AlwaysOn_health_*.xel).
-- Application for system that have SQL availability groups.
-- Does not require session to be stopped.
ALTER EVENT SESSION [AlwaysOn_health] ON SERVER DROP TARGET
package0.event_file;
ALTER EVENT SESSION [AlwaysOn_health] ON SERVER ADD TARGET
package0.event_file (SET
filename=N'AlwaysOn_health.xel',max_file_size=(5),max_rollover_files=(4));
-- Below to change size/number
-- Applicable only for SQL FCI
ALTER SERVER CONFIGURATION SET
ALTER SERVER CONFIGURATION SET

of default FCI logs (\LOG\_SQLDIAG_.xel).

(Failover Cluster Instance).
DIAGNOSTICS LOG MAX_SIZE = 10 MB;
DIAGNOSTICS LOG MAX_FILES = DEFAULT;

-- Below to change number of ERRORLOG files.

-- Applicable for all systems.
-- SSMS > Management > right click SQL Server Logs > Configure > Check box
"Limit the number of error log files before they are recycled" > increase the
number from 6 to 99, or to an appropriate number.

Disks are not detected if there is only one node at the secondary site.

This is
Windows
it will
default

a limitation of Windows 2008 R2 cluster. This issue does not occur in

2012. PowerShell can be used in Windows 2008 R2 to add the disk and
work. Will need to modify the Possible Owners for the resource, as by
it will have all nodes checked.

Add-ClusterResource -Group "Available Storage" - Cluster "myclustername" Name "diskname" -ResourceType "Physical Disk"
Get-ClusterResource "diskname" -Cluster "myclustername" | setclusterparameter DiskPath "F:"
# In above, F: is the drive letter assigned in disk management for the disk.

Slow synchronization. Waittime for HADR_SYNC_COMMIT grows anywhere from 500ms to 900ms
(compared to less than 15-20 ms).
If KB2723814 not applied, then try the KB workaround of suspend secondary
replica and then resume, so that AlwaysOn knows that availability mode has
changed back to synchronous commit.

SQL Server Agent Jobs do not automatically failover, when participating in AlwaysOn.
This is by design. Suggestion is to create the job at both Primary and
Secondary and enable them. Include logic in job step that checks the
role_desc in sys.dm_hadr_availability_replica_states of the database. If
role_desc is primary then execute the job, and if role_desc is secondary then
exit the job.
TSQL: select role_desc from sys.dm_hadr_availability_replica_states where
is_local=1 and role=1;

Reason why the secondary replica becomes unavailable when SQL Server service is stopped on primary
node.
OR AlwaysOn failback is not working.
This is as expected. Increment "the maximum number of failures during this
period" count. Its default value on n node cluster is n-1.
The secondary connects to the Primary and not the other way around. If
secondary is trying to connect to the primary, and primary is down, the state
will be RESOLVING. For example, if the SAN was taken offline that hosted the
AlwaysOn database on the primary, the secondary was no longer able to connect
to that database, so was not synchronized and could not come online. This is
an expected behavior (By Design).

Primary replica database becoming unresponsive

While checking root cause, ensure latest SQL/Windows fixes and can set the
following on availability groups so they are not adversely effected by any
non-yielding events temporarily.
Set the availability group FAILURE_CONDITION_LEVEL to 1 which will reduce the
SQL Server symptoms that can result in health detection failure alert.

To specifically address the lease timeout, increase the availability group

HEALTH_CHECK_TIMEOUT setting from default 30 seconds to a higher setting
(e.g. 90 seconds). The signal interval for the lease is 1/3 the
HEALTH_CHECK_TIMEOUT, so a 10 sec gap can result in lease expiration.

Questions on secondary replicas. The function sys.fn_hadr_backup_is_preferred_replica returns 0 no matter

what the backup preference is set to.
Check if @@SERVERNAME returns correct value on that server.
TSQL: SELECT @@SERVERNAME;

Linked server configuration with AlwaysOn listener.

TSQL: EXEC master.dbo.sp_addlinkedserver @server = N'MYLISTENER',
@srvproduct=N'SQL2012', @provider=N'SQLNCLI11', @datasrc=N'MYLISTENER',
@provstr=N'Provider=SQLNCLI11.1; Data
Source=myListener;ApplicationIntent=READONLY', @catalog=N'DB1'
As in the previous version in order to use the SQL Server Native Client
(11.x) in SQL Server 2012, Feature Pack 2012 must be installed. Microsoft
SQL Server 2012 Feature Pack is available at http://www.microsoft.com/enus/download/details.aspx?id=29065.
SQL Native Client (SNAC) can also be installed on application server by
running SQL Setup and installing Client Tools Connectivity. Client Tools
includes components for communication between clients and servers, including
network libraries for DB-Library, OLEDB for OLAP, ODBC, ADODB, and ADOMD+.
To verify, SQL Native Client DLL is available
"C:\Windows\System32\sqlncli11.dll"

SQL ERRORLOG indicates stack dump with expression pcbActualData <= cbRemainingBuffer.
***Stack Dump being sent to E:\MSSQL11.MSSQLSERVER\MSSQL\LOG\SQLDump0007.txt
* BEGIN STACK DUMP:
* Location: HadrAvailabilityGroupReplica.cpp:943
* Expression:
*pcbActualData <= cbRemainingBuffer
Check if memory messages in SQL ERRORLOG which indicates possible memory
pressure. If such messages exist, then ensure sp_configure max server
memory is set. Point to note is that CLR is part of SQL buffer pool memory
in SQL 2012. If youre using CLR, it should be accommodated within Max Server
Memory. If memory messages present and if LogPool memory appears to be high,
are the replicas connected through a fast network? Also, it is possible that
it is so heavily transactional that the number of log records generated is
high and with the amount of databases it pushes this memory above the roof.
As such, you may want to consider increasing the memory/RAM on the box.

FAIL_PAGE_ALLOCATION 1 in SQL ERRORLOG when using AlwaysOn Functionality.

Check the max and min server memory correctly.

TSQL: EXEC sp_configure 'max server memory';

Slow commit performance problem for replica in synchronous commit mode.

Check perfmon counters Replica:Transaction Delay counter, Replica:Log Send
Queue.

High HADR_WORK_QUEUE wait.

This wait indicates AlwaysOn Availability Groups background worker thread
waiting for new work to be assigned. This is an expected wait when there are
ready workers waiting for new work, which is the normal state.

High HADR_LOGCAPTURE_WAIT wait.

Check perfmon counters average log bytes flushed / sec, log bytes received
/sec. If log bytes received /sec is much higher, then this may indicate that
the log scan could be a bottleneck.

Section d - Patching/updates

Availability Group(s) with replicas on standalone instances.

Patching steps:
1. Patch the secondary replica (B)
2. Bring the secondary replica online (will be new version)
3. Log the original synchronization configurations for each replica. Change
the secondary replica and primary replica to Synchronous Commit mode,
waiting for the secondary replica (B) to be synchronized.
a. This will ensure there is no data loss during failover.
b. You can check dashboard or dmv: dm_hadr_database_replica_states for
the status.
4. Issue a failover through SSMS or T-SQL to failover the AG to the
secondary replica. So now the new primary is B, the new secondary is A.
5. Patch the original primary replica (A)
6. Bring the original primary replica (A) online (will be new version )
7. Wait for A to become Synchronized
8. Failover the AG back to A
9. Change each replicas synchronization mode to the original configurations
you logged in step3.
Caveat list:
1. Before patching, can still keep automatic/manual failover setting with no
change. Just a reminder: if during patching time, primary is down,
automatic failover may fail even if the secondary hasnt completed the
patching.
2. If primary and secondary replicas are in multi-subnet, clients may
experience a little bit longer time of disconnection or timeout during
failover.
3. Please do remember to switch back to original synchronization mode.

Availability Group(s) with replica(s) on FCIs (Failover Cluster Instances).

Example environment:
o Primary replica (A): FCI1 - Node1 is active, Node2 is passive.
o Secondary replica (B): FCI2 - Node3 is active, Node4 is passive.
Choose between one of the following two options.
Longer
1.
2.
3.
4.
5.
6.
7.
8.

downtime, less
Patch Node4
Move FCI2 from
Patch Node3
Move FCI2 from
Patch Node2
Move FCI1 from
Patch Node1
Move FCI1 from

steps (FCI rolling patching). Patching steps (basic):

Node3 to Node4
Node4 to Node3
Node1 to Node2
Node2 to Node1

Optimized downtime, more steps (Leverage AG failover patching). Patching

steps (optimized):
1. Patch Node4

2.
3.
4.
5.

Move FCI2 from Node3 to Node4

Patch Node3
Move FCI2 from Node4 to Node3
Log the original synchronization configurations for
Change the secondary replica and primary replica to
Commit mode, waiting for the secondary replica (B)
synchronized.
6. Manually failover AG from FCI1 to FCI2 (now the new
FCI2, the new secondary is on FCI1)
7. Patch Node2
8. Move FCI1 from Node1 to Node2
9. Patch Node1
10.
Move FCI1 from Node2 to Node1
11.
Manually failover AG from FCI2 back to FCI1
12.
Change each replicas synchronization mode to
configurations you logged in step #5.

each replica.
Synchronous
to be
primary is on

the original

Caveat list:
1. ** Please use basic patching steps if primary and secondary are on
different data centers/subnets **
2. FCI rolling patch guarantees zero data lost
3. Use synchronous commit secondary on a high latency network would
impact OLTP performance
4. AG failover cross subnets might cause up to 20 seconds delay on client
first connections.

Section e - Hotfixes

AlwaysOn hotfixes in SQL 2012 - PCUs / CUs (includes latest release cycle of service pack and cumulative
updates).

These hotfixes may be directly related to AlwaysOn, and/or may be related to the working of a SQL instance using
AlwaysOn.
Fix / KB article

Update

Build
(11.00.
n)
2100

Release
date

SQL Server 2012 RTM

RTM

FIX: Error code 20598 when a failover operation occurs

during synchronization on a SQL Server 2012 AlwaysOn
failover cluster instance
FIX: Secondary databases in a secondary replica may be
in an "unknown" state if you join the secondary replica
into availability groups two times in SQL Server 2012
FIX: Error 41009 when you try to create multiple
availability groups in a SQL Server 2012 AlwaysOn
failover clustering environment
FIX: Availability group failover takes a long time if a
database in the availability group contains a FileTable in
SQL Server 2012
FIX: New Availability Group Wizard-generated scripts
skip the steps for joining a secondary database to an
availability group in SQL Server 2012
You experience slow synchronization between primary
and secondary replicas in SQL Server 2012
FIX: Access violation in the
sqlservr!ReplicaToPrimaryPageCopier::ReadIoCompletio
nRoutine function in SQL Server 2008 R2 or in SQL
Server 2012
SQL Server 2012 experiences out-of-memory errors

RTM
CU1

2316

2012
Feb

RTM
CU2

2325

2012
Apr

SP1

3000

2012
Nov

Remarks/KB

RTM
CU1
KB 2711145

SP1

KB 2723814

SP1
CU1

3321

2012
Oct

3339

2012
Dec

Description of new features in SQL Server 2012 and SQL

Server 2008 R2 SP2

SP1
CU2
SP1
CU2

Improved Metadata Discovery process performance in

SQL Server Native Client for SQL Server 2012

SP1
CU3

3349

FIX: Poor performance in SQL Server 2012 when you

SP1

2013
Feb

KB 2792921. Expands
the supported features
of SQL Server Sysprep.
Not directly related to
AlwaysOn.
KB 2772525.
Performance
improvement in
sp_describe_first_result
_set.
KB 2803529. Memory

run a SQL Server trace

CU3

FIX: High "log write waits" counter value on a SQL

Server 2012 node
SQL Server 2012 performance issues in NUMA
environments
FIX: Out-of-memory errors related to a memory clerk in
SQL Server 2012
An update is available for SQL Server 2012 Memory
Management

SP1
CU3
SP1
CU3
SP1
CU4
SP1
CU4

FIX: Scheduler deadlock on AlwaysOn Availability Group

primary replica in SQL Server 2012
FIX: Error 14420 when you enable Log Shipping on
databases that are in an AlwaysOn availability group in
SQL Server 2012
FIX: A memory leak occurs when you enable AlwaysOn
Availability Groups or SQL Server failover cluster in
Microsoft SQL Server 2012
FIX: "RESTORE DATABASE is terminating abnormally"
error when you restore the secondary database in SQL
Server 2012

SP1
CU5
SP1
CU6

consumption by SQL
trace
(sp_TraceGetdata).

KB 2819662. NUMA
memory enhancement.
3368

3373
3381

2013
Mar

2013
Aug
2013
Sep

KB 2845380 - "We
recommend that you
install this hotfix as
soon as possible".
KB 2869734.
KB 2872854.

KB 2877100.

KB 2884126.

Related SQL Server, Operating System (OS) and additional fixes.

Fix / KB article

Eligibility

Release date

Remarks / KB

A hotfix is available to let you

configure a cluster node that does not
have quorum votes in Windows
Server 2008 and in Windows Server
2008 R2
SQL Server 2012 service crashes when
a replica SQL Server 2012 instance
goes offline on a Windows Server
2008 R2-based failover cluster
An update introduces support for the
AlwaysOn features from SQL Server
2012 to the .NET Framework 3.5 SP1
A hotfix that improves the
performance of the "AlwaysOn
Availability Group" feature in SQL
Server 2012 is available for Windows
Server 2008 R2
A transient communication failure

OS 2008/R2 RTM/SP1+

KB 2494036, OS fix

OS 2008 R2 SP1+

OS fix

.NET 3.5 SP1

2012 Jan

2654347

OS 2008 R2 SP1+

2012 Mar

KB 2687741, OS fix

OS 2008 R2 RTM/SP1+

KB 2550886

causes a Windows Server 2008 R2

failover cluster to stop working
Cluster node cannot rejoin the cluster
after the node is restarted or removed
from the cluster in Windows Server
2008 R2
Cluster service still uses the default
time-out value after you configure the
regroup time-out setting in Windows
Server 2008 R2
A Windows Server 2008 R2 failover
cluster loses quorum when an
asymmetric communication failure
occurs
Time-out error and you cannot
connect to a SQL Server 2012
AlwaysOn availability group listener in
a multi-subnet environment (also has
link to Sharepoint 2010 fix to add
support for MultiSubnetFailover)
Connection times out when you use
AlwaysOn availability group listener
with MultiSubnetFailover parameter
Cluster service leaks memory when
the service handles state change
notifications in Windows Server 2008
R2 or Windows Server 2008
Hotfix to add support for asymmetric
storages to the Failover Cluster
Management MMC snap-in for a
failover cluster that is running
Windows Server 2008 or Windows
Server 2008 R2
Windows Installer starts repeatedly
after you install SQL Server 2012 SP1
Can't access VNN FILESTREAM share
when you use the FILESTREAM and
FileTable features on a Windows
Server 2012-based failover cluster
SQL Server 2012 service shuts down
unexpectedly upon availability group
replica role transition on a Windows
Server 2008 R2-based failover cluster

Pre-requisite fixes.

OS 2008 R2 RTM/SP1+

KB 2549472

OS 2008 R2 RTM/SP1+

KB 2549448

OS 2008 R2 SP1+

KB 2552040

SQL 2012

KB 2792139, workaround

SQL 2012

KB 2855417, workaround

OS 2008 SP2, 2008 R2

RTM/SP1

KB 2550894

OS 2008/2008R2 RTM

KB 976097

SQL2012 SP1

KB 2793634, CU2

OS 2012

KB 2835620

OS 2008 R2 SP1

KB 2777201

Pre-required fixes are indicated in http://technet.microsoft.com/en-us/library/ff878487.aspx (Prerequisites,

Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server)). All fixes should be
installed on all cluster nodes (or at least on nodes that will have Availability Group failover).

Section f - Reference documents for working of AlwaysOn

Applications and AlwaysOn

Prerequisites, Restrictions, and Recommendations for AlwaysOn Client Connectivity (SQL Server)
http://technet.microsoft.com/en-us/library/hh510238.aspx
Configure a Server to Listen on a Specific TCP Port (SQL Server Configuration Manager)
http://msdn.microsoft.com/en-us/library/ms177440.aspx
The Database Mirroring (and AlwaysOn Availability Group) Endpoint
http://msdn.microsoft.com/en-us/library/ms179511.aspx
sqljdbc DriverManager.getConnection Hangs after mirror failover (infinite timeout scenario)
http://social.msdn.microsoft.com/Forums/sqlserver/en-US/5137ab7b-ced5-4365-8020-5cb821b5a3bc/sqljdbcdrivermanagergetconnection-hangs-after-mirror-failover
DB Failover causes application hang using JDBC
http://social.msdn.microsoft.com/Forums/sqlserver/en-US/03be11ee-653e-48f3-aa78-a312a8b5335f/db-failovercauses-application-hang-using-jdbc
SqlClient Support for High Availability, Disaster Recovery
http://msdn.microsoft.com/en-us/library/hh205662.aspx
SQL Server Native Client Support
http://msdn.microsoft.com/en-us/library/gg471494(v=sql.110).aspx
An update introduces support for the AlwaysOn features from SQL Server 2012 to the .NET Framework 3.5 SP1
(needs to be installed on each Reporting Services report server)
http://support.microsoft.com/kb/2654347
Network port (and firewall) requirements for Windows
http://support.microsoft.com/kb/832017/en-us
Configure a Windows Firewall for Database Engine Access
http://msdn.microsoft.com/en-us/library/ms175043.aspx
Configure and manage SQL Server availability groups for SharePoint Server (Support for SQL 2012 AlwaysOn with
SharePoint Foundation 2010 ) (Readable secondaries not supported)
http://technet.microsoft.com/en-us/library/hh913923(v=office.14).aspx

Availability Group listener in SQL 2012

Limitations, using SQL Server Management Studio for listener, CNO permissions/maximum, prestaging
http://msdn.microsoft.com/en-us/library/hh213080.aspx
Troubleshooting listener creation
http://support.microsoft.com/kb/2829783

Availability Groups in SQL 2012

Script for SQL Agent job alert.

Script basically throws exceptions in case of errors so does not check for "Not Synchronizing" state etc.
Scripts monitors single Availability Group and should be run on primary replica.
http://blogs.msdn.com/b/sqlalwayson/archive/2012/02/15/monitoring-alwayson-health-with-powershell-part3.aspx
http://blogs.msdn.com/b/sqlalwayson/archive/2012/02/15/the-always-on-health-model-part-4.aspx
http://gallery.technet.microsoft.com/scriptcenter/AlwaysOn-Availability-23bc88b5 (monitorag.ps1)
Includes steps after creation (sections 'Creating and Configuring a New Availability Group', 'Managing Availability
Groups, Replicas, and Databases', 'Monitoring Availability Groups').
http://technet.microsoft.com/en-us/library/gg509118.aspx
GUI steps for custom dashboard conditions
http://blogs.msdn.com/b/sqlalwayson/archive/2012/02/13/extending-the-alwayson-health-model.aspx
http://technet.microsoft.com/en-us/library/hh510235.aspx (includes predefined policies, policies in Server facet).
Monitoring of Availability Groups (SQL Server)
http://msdn.microsoft.com/en-us/library/ff877954
Overview, planning
http://download.microsoft.com/download/D/2/0/D20E1C5F-72EA-4505-9F26FEF9550EFD44/Microsoft%20SQL%20Server%20AlwaysOn%20Solutions%20Guide%20for%20High%20Availability
%20and%20Disaster%20Recovery.docx
Who is using AlwaysOn
http://blogs.msdn.com/b/sqlalwayson/archive/2012/01/10/who-is-using-alwayson.aspx
AlwaysOn FAQ, capabilities for SQL Server 2012
http://msdn.microsoft.com/en-us/sqlserver/gg508768
Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server)
Includes hardware, NIC (network adapter) recommendations, HostRecordTTL, same/different network link, client
connectivity
, maximum limits, FCI/database requirements, file path, TDE protected databases
, thread usage.
http://technet.microsoft.com/en-us/library/edbab896-42bb-4d17-8d75-e92ca11f7abb
http://blogs.msdn.com/b/saponsqlserver/archive/2012/02/07/sql-server-2012-alwayson-what-is-it.aspx
http://blogs.msdn.com/b/saponsqlserver/archive/2012/02/19/sql-server-2012-alwayson-part-2-quorumdetection.aspx
http://blogs.msdn.com/b/saponsqlserver/archive/2012/02/20/sql-server-2012-alwayson-part-3-sapconfiguration-with-two-secondary-replicas.aspx
http://blogs.msdn.com/b/saponsqlserver/archive/2012/02/29/sql-server-2012-alwayson-part-4-sapconfiguration-in-geo-cluster-configuration.aspx
http://blogs.msdn.com/b/saponsqlserver/archive/2012/03/29/sql-server-2012-alwayson-part-5-preparing-tobuild-an-alwayson-availability-group.aspx
http://blogs.msdn.com/b/saponsqlserver/archive/2012/03/29/sql-server-2012-alwayson-part-6-building-analwayson-availability-group.aspx
behind the scenes and into the details of what does happen when creating an Availability Group
http://blogs.msdn.com/b/saponsqlserver/archive/2012/04/24/sql-server-2012-alwayson-part-7-details-behindan-alwayson-availability-group.aspx
http://blogs.msdn.com/b/saponsqlserver/archive/2012/07/10/sql-server-2012-alwayson-part-8-failovermechanism-with-sap-netweaver.aspx
http://blogs.msdn.com/b/saponsqlserver/archive/2013/01/24/alwayson-part10-alwayson-and-logshipping.aspx

(part 9)
http://blogs.msdn.com/b/saponsqlserver/archive/2013/01/25/alwayson-part-10-switching-scheduled-tasksautomatically.aspx
http://blogs.msdn.com/b/saponsqlserver/archive/2013/04/21/sql-server-2012-alwayson-part-11-performanceaspects-and-performance-monitoring-i.aspx
http://blogs.msdn.com/b/saponsqlserver/archive/2013/04/24/sql-server-2012-alwayson-part-12-performanceaspects-and-performance-monitoring-ii.aspx
AlwaysOn: Minimizing blocking of REDO thread when running reporting workload on Secondary Replica
http://blogs.msdn.com/b/sqlserverstorageengine/archive/2011/12/22/alwayson-minimizing-blocking-of-redothread-when-running-reporting-workload-on-secondary-replica.aspx
AlwaysOn: Impact of mapping reporting workload on Readable Secondary to Snapshot Isolation
http://blogs.msdn.com/b/sqlserverstorageengine/archive/2011/12/22/alwayson-impact-of-mapping-reportingworkload-to-snapshot-isolation-on-readable-secondary.aspx
AlwaysOn: Readable Secondary and data latency
http://blogs.msdn.com/b/sqlserverstorageengine/archive/2011/12/22/alwayson.aspx
AlwaysOn: I just enabled Readable Secondary but my query is blocked?
http://blogs.msdn.com/b/sqlserverstorageengine/archive/2011/12/22/alwayson-i-just-enabled-readblesecondary-but-my-query-is-blocked.aspx
Active Secondaries: Backup on Secondary Replicas (supported and not supported options)
http://msdn.microsoft.com/en-us/library/hh245119
Configure Read-Only Routing
http://msdn.microsoft.com/en-us/library/hh710054.aspx
Select Initial Data Synchronization Page
http://technet.microsoft.com/en-us/library/hh231021
Enable and Disable AlwaysOn Availability Groups
http://msdn.microsoft.com/en-us/library/ff878259.aspx
ALTER AVAILABILITY GROUP
http://msdn.microsoft.com/en-us/library/ff878601.aspx
Features Supported by the Editions of SQL Server 2012
http://msdn.microsoft.com/en-us/library/cc645993.aspx#High_availability
DNS settings in a multi-site failover cluster
http://technet.microsoft.com/en-us/library/dd197562(WS.10).aspx
Includes multi-site/subnet FCI for AlwaysOn
http://download.microsoft.com/download/D/2/0/D20E1C5F-72EA-4505-9F26FEF9550EFD44/SQLServer2012_MultisiteFailoverCluster%20(2).docx
Requirements and Recommendations for a Multi-Site (multi-subnet) Failover Cluster
http://technet.microsoft.com/en-us/library/dd197575(WS.10).aspx
Failover Clustering and AlwaysOn Availability Groups (SQL Server)
http://msdn.microsoft.com/en-us/library/ff929171.aspx
Failover Clustering in Windows Server 2008 R2
http://download.microsoft.com/download/5/B/D/5BD5C253-4259-428B-A3E41F9C3D803074/WS08%20R2%20Failover%20Clustering%20White%20PaperTDM.docx
Requirements and Recommendations for a Multi-Site Failover Cluster (Windows 2008)
http://technet.microsoft.com/en-us/library/dd197575(v=ws.10).aspx
Configuring IP Addresses and Dependencies for Multi-Subnet Clusters
http://blogs.msdn.com/b/clustering/archive/2011/01/05/10112055.aspx

http://blogs.msdn.com/b/clustering/archive/2011/01/19/10117423.aspx
http://blogs.msdn.com/b/clustering/archive/2011/08/31/10204142.aspx
sqlcmd Utility (-M multisubnet_failover)
http://msdn.microsoft.com/en-us/library/ms162773.aspx
SQL Server Multi-Subnet Clustering (SQL Server)
http://msdn.microsoft.com/en-us/library/ff878716.aspx
Time-out error and you cannot connect to a SQL Server 2012 AlwaysOn availability group listener in a multisubnet environment (also mentions Sharepoint 2010 CU)
http://support.microsoft.com/kb/2792139
Worker pool usage
http://blogs.msdn.com/b/psssql/archive/2012/05/17/alwayson-hadron-learning-series-worker-pool-usage-forhadron-enabled-databases.aspx
HADR_SYNC_COMMIT
http://support.microsoft.com/kb/2723814
"When the session queries the sys.dm_exec_requests dynamic management view, the session performs DML
against the availability databases. In this state, the prevalent reported waittype is HADR_SYNC_COMMIT."
Use the AlwaysOn Dashboard
http://msdn.microsoft.com/en-us/library/hh213474.aspx
Availability group is offline
http://msdn.microsoft.com/en-us/library/hh230829.aspx
sys.dm_os_wait_stats (Transact-SQL) (HADR waits)
http://msdn.microsoft.com/en-us/library/ms179984.aspx
Windows Server Catalog (for checking certified configuration especially with respect to virtualization or virtual
machines)
http://www.windowsservercatalog.com/
Recommended Windows Hotfix for Database Availability Groups running Windows Server 2008 R2
http://blogs.technet.com/b/exchange/archive/2011/11/20/recommended-windows-hotfix-for-databaseavailability-groups-running-windows-server-2008-r2.aspx
Concurrent ADD NODE operation yields unexpected results in a SQL Server Failover Cluster Instance
http://blogs.msdn.com/b/sqlalwayson/archive/2012/03/30/concurrent-add-node-operation-yields-unexpectedresults-in-a-sql-server-failover-cluster-instance.aspx
DO NOT use Windows Failover Cluster Manager to perform Availability Group Failover
http://blogs.msdn.com/b/sqlalwayson/archive/2012/03/30/do-not-use-windows-failover-cluster-manager-toperform-availability-group-failover.aspx
Asymmetric storage, Quorum voting, changing quorum model, Read-Write and Read-only workloads, 'Recovering
from a Disaster'
http://download.microsoft.com/download/D/2/0/D20E1C5F-72EA-4505-9F26FEF9550EFD44/Building_a_HA_and_DR_Solution_using_AlwaysON_SQL_FCIs_and_AGs%20v1.docx
How DNS registration happens with network name resource.
http://blogs.msdn.com/b/clustering/archive/2009/07/17/9836756.aspx
Understanding Requirements for Failover Clusters
http://technet.microsoft.com/en-us/library/cc771404.aspx
Steps for prestaging the cluster name account
http://technet.microsoft.com/en-us/library/cc731002(v=ws.10).aspx#BKMK_steps_precreating
Failover Cluster Step-by-Step Guide: Configuring the Quorum in a Failover Cluster (Windows 2008/2008R2)

http://technet.microsoft.com/en-us/library/cc770620(v=ws.10).aspx
Windows Server Failover Clustering (WSFC) with SQL Server
http://msdn.microsoft.com/en-us/library/hh270278.aspx
Understanding Quorum Configurations in a Failover Cluster
http://technet.microsoft.com/en-us/library/cc731739.aspx
Configure Heartbeat and DNS Settings in a Multi-Site Failover Cluster
http://technet.microsoft.com/en-us/library/dd197562%28WS.10%29.aspx
Modifying the Settings for a Clustered Service or Application (The default value for the maximum number of
failures is n-1, where n is the number of nodes)
http://technet.microsoft.com/en-us/library/cc755151.aspx
Windows cluster
http://blogs.msdn.com/b/psssql/archive/2011/04/26/alwayson-hadron-leanring-series-maximum-failovers-withinspecified-period.aspx
Steps for troubleshooting problems caused by changes in cluster-related Active Directory accounts
http://technet.microsoft.com/en-us/library/cc731002(v=WS.10).aspx
http://blogs.msdn.com/b/psssql/archive/2013/09/30/error-during-installation-of-an-sql-server-failover-clusterinstance.aspx
Automatic Failover of an Availability Group
http://msdn.microsoft.com/en-us/library/hh710061.aspx
AlwaysON - HADRON Learning Series: Automated Failover Behaviors (Denali - Logging History Information, FCI
and Default Health Capture, sp_server_diagnostics)
http://blogs.msdn.com/b/psssql/archive/2012/03/22/alwayson-hadron-learning-series-automated-failoverbehaviors-denali-logging-history-information-fci-and-default-health-capture-sp-server-diagnostics.aspx
Availability Modes
http://msdn.microsoft.com/en-us/library/ff877931.aspx
Replication, Change Tracking, Change Data Capture, and AlwaysOn Availability Groups (SQL Server)
http://msdn.microsoft.com/en-us/library/hh403414.aspx
Troubleshooting automatic failover problems in SQL Server 2012 AlwaysOn environments
http://support.microsoft.com/kb/2833707

AlwaysOn in SQL 2014

What's New (Database Engine) (section 'AlwaysOn enhancements')

http://msdn.microsoft.com/en-us/library/bb510411(v=sql.120).aspx

Powershell and AlwaysOn

Powershell optimization "-NoRefresh"

SMO optimizations with SetDefaultInitFields.
http://blogs.msdn.com/b/sqlalwayson/archive/2012/02/13/monitoring-alwayson-health-with-powershell-part2.aspx
http://blogs.msdn.com/b/mwories/archive/2005/04/22/smoperf1.aspx
http://blogs.msdn.com/b/mwories/archive/2005/04/22/smoperf2.aspx

View health policy (similar to dashboard information) through Powershell.

http://blogs.msdn.com/b/sqlalwayson/archive/2012/02/13/monitoring-alwayson-health-with-powershell-part1.aspx
http://blogs.msdn.com/b/sqlalwayson/archive/2012/02/13/monitoring-alwayson-health-with-powershell-part2.aspx
AvailabilityGroup Class
http://msdn.microsoft.com/en-us/library/microsoft.sqlserver.management.smo.availabilitygroup

Windows Azure (IaaS) and AlwaysOn

GUI steps for AlwaysOn on Windows Azure (IaaS)

Powershell steps at MSDN links.
Blog has details about OS 2012 fix (KB 2854082) to enable Azure listeners, about listener mapped to public VIP
of cloud service, about load balancing options in Windows Azure.
http://blogs.msdn.com/b/sqlalwayson/archive/2013/01/23/test-lab-create-an-alwayson-availability-group-inwindows-azure-end-to-end.aspx
http://msdn.microsoft.com/en-us/library/jj870963.aspx
http://msdn.microsoft.com/en-us/library/dn376546.aspx
http://blogs.msdn.com/b/igorpag/archive/2013/09/02/sql-server-2012-alwayson-availability-group-andlistener-in-azure-vms-notes-details-and-recommendations.aspx
http://gallery.technet.microsoft.com/scriptcenter/Create-Availability-Group-fff94cd5 (not tested).
http://blogs.msdn.com/b/sqlalwayson/archive/2013/08/06/availability-group-listener-in-windows-azure-nowsupported-and-scripts-for-cloud-only-configuration.aspx (similar to the MSDN link steps).
Tutorial: AlwaysOn Availability Groups in Windows Azure
http://msdn.microsoft.com/en-us/library/jj870963.aspx
http://blogs.msdn.com/b/sqlalwayson/archive/2013/01/23/test-lab-create-an-alwayson-availability-group-inwindows-azure-end-to-end.aspx
http://social.technet.microsoft.com/wiki/contents/articles/14776.how-to-configure-windows-failover-cluster-inwindows-azure-for-alwayson-availability-groups.aspx
http://blogs.msdn.com/b/sqlalwayson/archive/2012/12/03/wsfc-cluster-configuration-on-windows-azurevirtual-machines.aspx
http://sqlcat.com/sqlcat/b/whitepapers/archive/2013/05/31/performance-guidance-for-sql-server-in-windowsazure-virtual-machines.aspx
Data Series: SQL Server in Windows Azure Virtual Machine vs. SQL Database
http://blogs.msdn.com/b/windowsazure/archive/2012/06/26/data-series-sql-server-in-windows-azure-virtualmachine-vs-sql-database.aspx
http://msdn.microsoft.com/library/windowsazure/dn133151.aspx
http://www.windowsazure.com/en-us/manage/windows/common-tasks/install-sql-server/

Section g - Similar technologies from other organizations

Please note these are third party links. These links are mentioned only as a reference, and do not indicate support
for the respective author's or organizations opinions.
http://joedantoni.wordpress.com/2012/03/29/alwayson-versus-active-data-guard-avalue-proposition/
http://dba.stackexchange.com/questions/20867/sql-server-equivalent-tofunctionality-of-oracle-rac
http://blogs.technet.com/b/sqlman/archive/2012/04/12/what-you-still-think-rac-isthe-answer-for-your-high-end-database-platform-requirements-even-with-sql-server2012-amp-sql-azure-now-released-time-to-think-again.aspx
http://sqlblog.com/blogs/linchi_shea/archive/2008/01/22/sql-server-always-on-andoracle-hard.aspx

Section h Data collection

Ideally troubleshooting information is available through below logs:

Executing TSQL etc. commands through SQL Management Studio.

SQL ERRORLOG files
Cluster logs, cluster validation report.
SQL default traces, system_health_* files (or Output of EXEC sp_server_diagnostics;),
AlwaysOn_health_*.xel (AlwaysOn) files in LOG folder
Event logs.

Section i Errors in SQL Server

Note: There are multiple messages that mention This is an informational message only. These are just
informational messages and generally no user action is required, hence most of these are not mentioned in the
following section. If there is a problem during occurrence of these messages, then look for additional messages in
logs.
error

severity

1480

Error message
The %S_MSG database "%.*ls" is changing roles from "%ls" to "%ls" because the mirroring
session or availability group failed over due to %S_MSG. This is an informational message
only. No user action is required.

Examples of informational messages: Error numbers 1480, 35202, 35262, 35265, 35266, 35267, 35299, 41048,
41049, 41051, 41053, 41054, 41055, 41058, 41059, 41061, 41075, 41076, 41089, 41092, 41093, 41094, 41097,
41098, 41099, 41110, 41111, 41137, 41139, 41141, 41145, 41147, 41149.

Error, Severity: Error message

Troubleshooting information
Error 976, Severity 14: The target database, '%.*ls', is participating in an availability group and is currently
not accessible for queries. Either data movement is suspended or the availability replica is not enabled for
read access. To allow read-only access to this and other.
If dm_hadr_availability_replica_states.role_desc=3(INVALID), then check service
broker. Because Availability group is using Service Broker to communicate between
SQL Server then need to check Service Broker first. Open SSMS>Server Objects >
Endpoints, there are two Service Broker folder>Expand both, and if the state is
Stopped or Disabled, then can change the state to Started.
Similar steps as in error 35250.

Error 983, Severity 14: Unable to access database '%.*ls' because its replica role is RESOLVING which does
not allow connections. Try the operation again later.

Check SQL ERRORLOGs, event logs, etc. for network, storage related messages etc.
Check for issues as in error 19406. Search for RESOLVING_NORMAL in this document.

Error 3402, Severity 10: The database '%ls' is marked %ls and is in a state that does not allow recovery to
be run.

If the second %ls is RESTORING, then one option is below.

1) Asked the customer to stop the SQL Server Service.
2) Had the customer rename the actual files on disk with a "_old" suffix so that the
sql server would not find them and have the db come up suspect.
3) Asked the customer to start the SQL Server Service.
Once the server came back up we were able to delete the database.
4) Had the customer create a new database with the same name and location as the

original database. Used the output from sysaltfiles to verify this.

5) Had the customer stop the SQL Server Service.
6) Had the customer rename these new database files by appending a "_new" on the end
of the file.
7) Had the customer take the old database files with the "_old" suffix and changed
them back to the original name.
8) Had the customer start the SQL Server Service.

Error 3633, Severity 16: The operating system returned the error '%ls' while attempting '%ls' on '%ls' at
'%hs'(%d).

Verify error in %ls at DOS prompt with net helpmsg. Then proceed depending on this
error.

Error 3752, Severity 16: The database '%.*ls' is currently joined to an availability group. Before you can
drop the database, you need to remove it from the availability group.

Before you can drop the database, you need to remove it from the availability group.
If AG exists, try ALTER AVAILABILITY GROUP [AGname] REMOVE DATABASE [DbName]. If AG
does not exist, try removing with ALTER DATABASE [DbName] SET HADR OFF.
If replica_id and group_database_id in sys.databases show as null, then this may be
due to earlier meta-data when database was part of an AG so stop SQL Server, rename
the database files and start SQL Server. Then you should be able to remove the
database.

Error 19405, Severity 16: Failed to create, join or add replica to availability group '%.*ls', because node
'%.*ls' is a possible owner for both replica '%.*ls' and '%.*ls'. If one replica is failover cluster instance,
remove the overlapped node from its possible owners and try ag

May occur if AG involves two or more SQL failover instances. Check possible owners
of SQL resource on both/more so as to verify no node name overlap. There should not
be node-overlap.
TSQL: SELECT * FROM sys.dm_os_cluster_nodes;
Powershell: get-clusterownernode -resource "XXXBSQL"
Powershell: set -clusterownernode -resource "XXXASQL" -owners NODE1,NODE2

Error 19406, Severity 10: The state of the local availability replica in availability group '%.*ls' has changed
from '%ls' to '%ls'. The replica state changed because of either a startup, a failover, a communication issue,
or a cluster error. For more information, see the availabi

If the changed to state is PRIMARY_PENDING, then check

sys.dm_hadr_database_replica_states. If database_state_desc = RECOVERY_PENDING
(synchronization_health_desc will be NOT_HEALTHY), then try ALTER DATABASE db SET
HADR RESUME;. Else if this is the only replica (no secondary replica), then
(consider first taking a database snapshot as a backup if required) try ALTER
DATABASE db SET HADR OFF; to remove AlwaysOn so as to then manually recover
database with SQL service or database restart (ALTER DATABASE ONLINE).
TSQL: ALTER DATABASE DbName SET HADR RESUME;
TSQL: ALTER DATABASE DbName SET ONLINE;
TSQL: RESTORE DATABASE DbName WITH RECOVERY;
If the changed to state is RESOLVING_NORMAL, check for additional messages. Also
check for issues as in http://support.microsoft.com/kb/2833707 (Troubleshooting

automatic failover problems in SQL Server 2012 AlwaysOn environments).

If change to state is PRIMARY_NORMAL / SECONDARY_NORMAL, then this may this
indicates successful failover. Check for additional messages if failover was not
expected.

Error 19456, Severity 16: None of the IP addresses configured for the availability group listener can be
hosted by the server '%.*ls'. Either configure a public cluster network on which one of the specified IP
addresses can be hosted, or add another listener IP address which can b

Add a new IP address (with different subnet) manually to the existing Listener.

Error 19471, Severity 16: The WSFC cluster could not bring the Network Name resource with DNS name
'%ls' online. The DNS name may have been taken or have a conflict with existing name services, or the
WSFC cluster service may not be running or may be inaccessible. Use a different

Verify cluster name has "Create Computer objects" permission in AD. Steps in error
1193.
Check for associated errors like error 19476.
Below steps may be an option.
1. Temporarily move the CNO account into the Computers container [optional].
2. Log into one of the cluster nodes with a domain account that had the Reset
Password right in the domain
3. Take the cluster Network Name offline. If this does not work to solve the
problem, simulate failures for the cluster Network Name resource until it is in a
permanent failed state
4. Once the resource was in an offline state, right-click on the resource, choose
More Actions and then click Repair
5. The previous action caused the password for the CNO to be reset in the domain
6. Restart the cluster service on all nodes.

Error 19476, Severity 16: The attempt to create the network name and IP address for the listener failed. The
WSFC service may not be running or may be inaccessible in its current state, or the values provided for the
network name and IP address may be incorrect. Check the state of the WFSC cluster and validate the
network name and IP address with the network administrator.

Check for associated errors like error 41009.

Error 35206, Severity 10: A connection timeout has occurred on a previously established connection to
availability replica '%ls' with id [%ls]. Either a networking or a firewall issue exists or the availability replica
has transitioned to the resolving role.

If there is a login failed for domain\node$ account around same time, then it may be
because NT AUTHORITY\SYSTEM account lacked the required permissions to failover the
availability group. To failover the group to the other node, NT AUTHORITY\SYSTEM
must have permission to connect to SQL (CONNECT SQL), failover the availability
group (ALTER ANY AVAILABILITY GROUP) and execute sp_server_diagnostics (VIEW SERVER
STATE). The NT AUTHORITY\SYSTEM account is used to automatically execute
sp_server_diagnostics. Another option is to add the account to sysadmin role in SQL
Server.
CREATE LOGIN [NT AUTHORITY\SYSTEM] FROM WINDOWS WITH DEFAULT_DATABASE=[master],
DEFAULT_LANGUAGE=[us_english];

ALTER SERVER ROLE [sysadmin] ADD MEMBER [NT AUTHORITY\system];

Error 35240, Severity 16: Database '%.*ls' cannot be joined to or unjoined from availability group '%.*ls'.
This operation is not supported on the primary replica of the availability group.

TSQL to unjoin is ALTER DATABASE db SET HADR OFF. Note that failing over thus
making it a secondary, and then retrying ALTER DATABASE HADR is not an option,
because the alter database will then give error 921 that database has not been
recovered yet.
If this replica can be made a secondary, then fail over thus making it a secondary,
and then drop replica when in this secondary role. TSQL is ALTER AVAILABILITY GROUP
AgName REMOVE REPLICA ON ReplicaName;.
If no secondary replica for failover and getting this error on primary, then DROP
AVAILABILITY GROUP AgName. This will automatically drop listener. If do not want
listener dropped, then create a temporary AG adding same name, then try the DROP
AVAILABILITY GROUP.

Error 35250, Severity 16: The connection to the primary replica is not active. The command cannot be
processed.

Check SQL ERRORLOG to verify SQL is listening on the ports. If SQL is listening on
port, check if SQL is listening on all IPs (e.g. not Server is listening on [
192.192.192.192 <ipv4> 5022]) (should ideally be Server is listening on [ 'any'
<ipv4> 1433]). If not listening on any/all IPs for port, then two options. One
option is to recreate hadr_endpoint to listen on ALL ip. Second option is to enter
the replication adapter IP in DNS for the configured endpoint).
Check ping, telnet, nslookup to replica(s). If firewall exception is required, then
create new inbound rule for TCP port 5022.

Error 35264, Severity 10: AlwaysOn Availability Groups data movement for database '%.*ls' has been
suspended for the following reason: "%S_MSG" (Source ID %d; Source string: '%.*ls'). To resume data
movement on the database, you will need to resume the database manually. For infor

Link redirects to http://msdn.microsoft.com/en-us/library/ff877956.aspx and the

mention steps should be followed to resume data movement.
SSMS: AlwaysOn High Availability node>Availability Groups node>expand availability
group>Availability Databases node>right-click database>Resume Data Movement>OK.
TSQL on secondary: ALTER DATABASE database_name SET HADR RESUME;

Error 41009, Severity 16: The Windows Server Failover Clustering (WSFC) resource control API returned
error code %d. The WSFC service may not be running or may not be accessible in its current state, or the
specified arguments are invalid. For information about this error code,

Net helpmsg on the %d value, and proceed accordingly. May also be accompanied by
Microsoft-Windows-FailoverClustering messages in event log. Check cluster log.
If the error is in cluster in Windows Azure, then it may be because Windows Azure
currently does not support more than one IP address per VM, which is required by the

availability group listener to resolve to the availability group owner node. This is
documented in http://msdn.microsoft.com/en-us/library/jj870962.aspx (High
Availability and Disaster Recovery for SQL Server in Windows Azure Virtual
Machines).

Error 41015, Severity 16: Failed to obtain the Windows Server Failover Clustering (WSFC) node handle
(Error code %d). The WSFC service may not be running or may not be accessible in its current state, or the
specified cluster node name is invalid. For information about this erro

Check if Windows cluster service is running.

Net helpmsg on the %d value, and proceed accordingly.

Error 41042, Severity 16: The availability group '%.*ls' already exists. This error could be caused by a
previous failed CREATE AVAILABILITY GROUP or DROP AVAILABILITY GROUP operation. If the
availability group name you specified is correct, try dropping the availability group a

If AG still exists, DROP AVAILABILITY GROUP.

Open regedit, backup the entry from the old Availability Group that starts with
HADR_Ag_???? key, then delete this key [to be tested].
Try with new name for Availability Group.

Error 41050, Severity 10: AlwaysOn Availability Groups: Waiting for local Windows Server Failover
Clustering service to start. This is an informational message only. No user action is required.

Notice this error says cluster service not node.

Check services.msc if cluster service is started on that node. If not started, then
start the service.
This is an informational message only. If there is a problem, then look for
additional messages.

Error 41052, Severity 10: AlwaysOn Availability Groups: Waiting for local Windows Server Failover
Clustering node to start. This is an informational message only. No user action is required.

Notice this error says cluster node not service.

If node is paused in Failover Cluster Manager, resume it.
This is an informational message only. If there is a problem, then look for
additional messages.

Error 41066, Severity 16: Cannot bring the Windows Server Failover Clustering (WSFC) resource (ID '%.*ls')
online (Error code %d). The WSFC service may not be running or may not be accessible in its current state,
or the WSFC resource may not be in a state that could accept the r

In DOS prompt, run Net helpmsg on the %d value, and proceed accordingly.

Error 41074, Severity 10: AlwaysOn: The local replica of availability group '%.*ls' is preparing to transition to
the primary role in response to a request from the Windows Server Failover Clustering (WSFC)
cluster. This is an informational message only. No user action is required

This is an informational message only. If there is a problem, then look for

additional messages.

Error 41091, Severity 10: AlwaysOn: The local replica of availability group '%.*ls' is going offline because
either the lease expired or lease renewal failed. This is an informational message only. No user action is
required.

Lease expired event from the cluster. Lease cannot be renewed maybe because the SQL
Server was busy. Will need to checks default traces etc. General suggestions include
updating statistics, index rebuild, set max memory if not set.
Possible causes include loss of lease, possible network issues and
sp_server_diagnostic query timeout.
This is an informational message only. If there is a problem, then look for
additional messages.

Error 41095, Severity 10: AlwaysOn: Explicitly transitioning the state of the Windows Server Failover
Clustering (WSFC) resource that corresponds to availability group '%.*ls' to Failed. The resource state is not
consistent with the availability group state in the instance of SQL

Verify pre-requisite (for AlwaysOn) Windows fixes are installed.

Error 41131, Severity 10: Failed to bring availability group '%.*ls' online. The operation timed out. Verify
that the local Windows Server Failover Clustering (WSFC) node is online. Then verify that the availability
group resource exists in the WSFC cluster. If the problem persis

Check logs for additional errors.

Error 41142, Severity 16: The availability replica for availability group '%.*ls' on this instance of SQL Server
cannot become the primary replica. One or more databases are not synchronized or have not joined the
availability group, or the WSFC cluster was started in Force Quorum

Select availability group databases and select option to resume data movement. This
will work if cluster was not started in forced_quorum mode.
Check if *_FailoverCluster_health_XeLogs confirm that the cluster was started in
forced_quorum mode (forced_quorum True in
availability_replica_automatic_failover_validation event).
o When a cluster is started in forced_quorum mode (net.exe start clussvc
/forcequorum, OR Failover Cluster Manager>left pane>right-click Windows
cluster>choose Force Cluster Start>Confirm by clicking Yes.) on primary
replica, the availability group is required to be started by issuing command
ALTER AVAILABILITY GROUP <AGNAME> FORCE_FAILOVER_ALLOW_DATA_LOSS;. Because
this is the original primary, despite issuing force allow data loss, NO data
loss occurs
o When a cluster is started in forced_quorum mode on secondary replica, first
check sys.dm_hadr_database_replica_cluster_states.is_failover_ready on
secondary. If this value is 1, and be assured no data loss will occur. If this

value is not 1 (for example because its asynchronous commit), then can check
sys.dm_hadr_database_replica_states.last_commit_time, so as to know estimate
start time of data loss. The
sys.dm_hadr_database_replica_states.last_hardened_lsn value may be used to
consider other secondaries for failover. Then can initiate failover with ALTER
AVAILABILITY GROUP <AGNAME> FORCE_FAILOVER_ALLOW_DATA_LOSS; on secondary.
TSQL: SELECT database_name, is_failover_ready
FROM
sys.dm_hadr_database_replica_cluster_states
WHERE replica_id = (SELECT replica_id
FROM
sys.availability_replicas
WHERE replica_server_name = '<Replica server name>')

Otherwise, once all local secondary databases are joined /synchronized, a planned
manual failover can be performed to this secondary replica (without data loss).

Error 41152, Severity 16: Failed to create availability group '%.*ls'. The operation encountered SQL Server
error %d and has been rolled back. Check the SQL Server error log for more details. When the cause of
the error has been resolved, retry CREATE AVAILABILITY GROUP command

Check this document for error number mentioned in %d and proceed accordingly.
Check sys.messages for error number mentioned in %d and proceed accordingly

Error 41158, Severity 16: Failed to join local availability replica to availability group '%.*ls'. The operation
encountered SQL Server error %d and has been rolled back. Check the SQL Server error log for more
details. When the cause of the error has been resolved, retry the A

Check for SQL/Windows messages around same time.

Error 41160, Severity 16: Failed to designate the local availability replica of availability group '%.*ls' as the
primary replica. The operation encountered SQL Server error %d and has been terminated. Check the
preceding error and the SQL Server error log for more details about

Checks logs for additional messages.

Error 41172, Severity 16: An error occurred while dropping availability group '%.*ls' from Windows Server
Failover Clustering (WSFC) cluster and from the local metadata. The operation encountered SQL OS error
%d, and has been terminated. Verify that the specified availability gr

In DOS prompt, run Net helpmsg on the %d value, and proceed accordingly.

Error 41195, Severity 16: Availability group '%.*ls' failed to process the WSFC lease-renewal command. The
local availability replica lease is no longer valid to process the lease renewal command. Availability replica
lease expired. This is an informational message only. No user a

This is an informational message only. If there is a problem, then look for

additional messages.

Error 41406, Severity 16: The availability group is not ready for automatic failover. The primary replica and
a secondary replica are configured for automatic failover, however, the secondary replica is not ready for
an automatic failover. Possibly the secondary replica is unavail

Check for additional messages on secondary replica(s).

Error 41414, Severity 16: In this availability group, at least one secondary replica is not connected to the
primary replica. The connected state is DISCONNECTED.

Check connectivity as in error 35250.

Section j - Errors in user interface (SSCM, wizard, SSMS)

Errors in SQL Server Configuration Manager (SSCM)
Error message
Troubleshooting information
An error occurred while loading the AlwaysOn High Availability properties [return code: 0x80070005].
May get this error if don't have the appropriate permission. Try right click
SQL Server Configuration Manager and select Run as Administrator.

Unable to save the AlwaysOn High Availability settings [return code: 0x80041033].

Proceed depending on error code. This is generally a WMI code e.g. 0x80041033.

The AlwaysOn Availability Groups feature requires the x86(non-WOW) or x64 Enterprise Edition of
SQL Server 2012 (or later version) running on Windows Server 2008 (or later version) with WSFC hotfix
KB 2494036 installed. This SQL Server edition and/or Windows Server System does not meet one or
more of the requirements

Verify OS is Windows 2008 or later version.

If OS is Windows 2008 or Windows 2008 R2, install Windows hotfix KB 2494036, if
not already installed.

Unable to retrieve cluster name [return code: (0x800706d9)].

Verify cluster service is started.

One option may be to EVICT and ADD again that node.
Steps: Right click node in failover cluster manager > More Actions... > Evict.

Errors in New Availability Group wizard in SSMS

Error message
Troubleshooting information
The current WSFC cluster quorum vote configuration is not recommended for this availability group.
For more information, see the following topic in SQL Server Books Online:
http://go.microsoft.com/fwlink/?LinkId=224159.
When validating WSFC quorum vote configuration, the AlwaysOn Availability Group
Wizard shows a warning if any of the following conditions are true:
o The cluster node that hosts the primary replica does not have a vote.
If this vote is not required for that environment, then this message
may be ignored. Powershell command to check vote is Get-ClusterNode |
f1 Name, Nodeweight.
o A secondary replica is configured for automatic failover and its
cluster node does not have a vote. If this vote is not required for
that environment, then this message may be ignored.
o KB2494036 is not installed on all cluster nodes that host availability
replicas. This patch is required to add or remove votes for cluster
nodes in multi-site deployments. However, in single-site deployments,
it is usually not required and you may safely ignore the warning.
o While setting up the availability group in the wizard, you configured
a replica to asynchronous availability mode. This warning is only
reported in SQL 2012 RTM, not in SP1 of SQL 2012 RTM.
o While setting up the availability group in the wizard, you configured
a replica for manual failover mode. This warning is only reported in
SQL 2012 RTM, not in SP1 of SQL 2012 RTM.
Check the link. Redirects to http://msdn.microsoft.com/en-us/library/hh270280.aspx

Checking for the database files on the secondary replica resulted in an error.
(Microsoft.SqlServer.Management.HadrTasks)
-----------------------------Program Location:
at Microsoft.SqlServer.Management.Hadr.TestDatabaseFileExisting.DoWork()
at Microsoft.SqlServer.Management.TaskForms.SimpleWorkItem.Run()
===================================
The following database files already exist on the server instance that hosts secondary replica HAVM3\TEST1:
\\path\file.mdf
\\path\file_log.ldf
(Microsoft.SqlServer.Management.HadrTasks)
-----------------------------Program Location:
at Microsoft.SqlServer.Management.Hadr.TestDatabaseFileExisting.DoWork()

Check if database-file paths are identical on source and on destination servers.

If different, manual (rather than using AG wizard) restore may be required with
WITH MOVE syntax.

An error was encountered while modifying the quorum settings.

Your cluster quorum settings have not been changed.
There was an error configuring the file share witness '\\XXX\ABC'.
Unable to save property changes for 'File Share Witness'.
File share associated with file share witness resource cannot be hosted by this cluster or any of its
nodes

Run cluster validation report and ensure there are no errors.

The Endpoints tab lists at least one endpoint that uses only Windows Authentication. However, the
server instance might be running under a nondomain account. To use the listed endpoint, change the
corresponding SQL Server service account to a domain account. To continue using the nondomain
account, alter the endpoint to use a certificate. Do you want to use the listed endpoints?

Check SQL Server Configuration Manager to see if SQL startup account is domain
user.

-----------------------------TITLE: Microsoft SQL Server Management Studio

-----------------------------The primary server 'XXX' cannot write to '\\XXX\e$\mssql\data'.
(Microsoft.SqlServer.Management.HadrTasks)
-----------------------------ADDITIONAL INFORMATION:
Backup failed for Server 'XXX'. (Microsoft.SqlServer.SmoExtended)
For help, click:
http://go.microsoft.com/fwlink?ProdName=Microsoft+SQL+Server&ProdVer=11.0.3000.0+((SQL11_PC
U_Main).1210191325+)&EvtSrc=Microsoft.SqlServer.Management.Smo.ExceptionTemplates.FailedOperationExceptionTe
xt&EvtID=Backup+Server&LinkId=20476
-----------------------------An exception occurred while executing a Transact-SQL statement or batch.
(Microsoft.SqlServer.ConnectionInfo)
-----------------------------Cannot open backup device '\\XXX\e$\mssql\data\HADR Test.bak'. Operating system error 5(Access is
denied.).
BACKUP DATABASE is terminating abnormally. (Microsoft SQL Server, Error: 3201)
For help, click:
http://go.microsoft.com/fwlink?ProdName=Microsoft%20SQL%20Server&ProdVer=11.00.3000&EvtSrc
=MSSQLServer&EvtID=3201&LinkId=20476
-----------------------------BUTTONS:
OK
------------------------------

Check SQL startup account to see if its domain account.

Errors in SQL Management Studio (SSMS)

Error message
Troubleshooting information
SSMS shows Availability Group state as Resolving.
Check that cluster service is started on that node. Also check for issues as in
error 19406.

Back up database task backs up the database availability groups contained in the set, you receive a
warning message like the one below.
This backup type is not supported in the secondary this task will fail if executed on the secondary.

GoTo SQL Server Management Studio, in Object Explorer, expand the availability of
high-availability availability AlwaysOn group group properties. Note the
Backup preferences section.
Consider using the maintenance plan Wizard to create a backup job since it checks
the sys.fn_hadr_backup_is_preferred_replica function call and automatically
includes the following script logic. This logic is if the default backup the
current replica replica (the return value to 1) grants the COPY_ONLY option in
your backup and not backup default backup replicas. The maintenance plan Wizard
when you create a database backup using the scripts are automatically added.
IF (NOT sys.fn_hadr_backup_is_preferred_replica(@DBNAME))
BEGIN
Select 'This is not the preferred replica, exiting with success';
RETURN 0 This is a normal, expected condition, so the script returns success
END
BACKUP DATABASE @DBNAME TO DISK=<disk>
WITH COPY_ONLY;

AlwaysOn. 'ApplcationIntent=ReadOnly doesn't work for registered server.

This is a known issue. Consider specifying ApplicationIntent=ReadOnly everytime

you make connection to SQL server 2012.

The local node is not part of quorum and is therefore unable to process this operation. This may be due
to one of the following reasons:
o The local node is not able to communicate with the WSFC cluster.
o No quorum set across the WSFC cluster.
For more information on recovering from quorum loss, refer to SQL Server Books Online.
(Microsoft.SqlServer.Management.HadrTasks)

Start cluster service. If its not started due to quorum, last option may be start
without a quorum. After cluster is running, check cluster nodes/votes are
appropriate.

Section k - Additional Windows related errors

Errors in Windows cluster log
Error message
Troubleshooting information
[RES] Network Name <SPEPD_Tier2>: Unable to update password for computer account DCR-SPEPD
on DC \\ DC-04.xxxx.org, status 5.
[RHS] Online for resource SPEPD_Tier2 failed.
Status 5 is Access Denied
In Windows AD, check for CNO (cluster network object) permissions.
For example, below to grant the "create computer objects" to the cluster virtual
host (virtual cluster name) for the OU in Active Directory on "Computers" and
"DomainServer" OU that was created. [To be tested, steps in event 1193]
Instructions:
1) Log into the Domain Controller machine.
2) Select Start - Administrative Tools - Active Directory users and computer.
--> Active Directory users and computer screen displays.
3) Display-select advanced features.
4) Select the appropriate domain - Computers.
5) Verify that the cluster virtual hosts.
6) Select the appropriate domain - Computers - right click - Properties.
--> Computers Properties screen appears.
7) Security tab-click the Advanced button.
--> Advanced Security for Computers "screen appears.
8) If you have access permissions tab - cluster virtual host registration
registration no "Edit" button if you choose the "add" button.
+ Case "add": conducted a check in the "computer", click the "object type" button,
search for cluster virtual hosts.
--> [Permission entries on Computers] screen is displayed.
9) Target: Select this object and all child objects check the permission "create
computer objects" and click the "OK" button.
--> Returns to the advanced security for Computers "screen.
10) Click "OK" button after clicking "the apply" button.
--> Returns to the Computers Properties screen.
11) And click the "OK" button and exit the Active Directory users and computer
screen.
INFO [RES] Network Name: [NNLIB] NetUserAdd object aglisten11 on DC: \\DC.contoso.com, result:
8557
ERR [RES] Network Name: [NNLIB] Failed to create Computer Object aglisten11 in the Active Directory,
error 8557

net helpmsg 8557 says "Your computer could not be joined to the domain. You have
exceeded the maximum number of computer accounts you are allowed to create in this
domain. Contact your system administrator to have this limit reset or increased."

Cluster log has "[RES] SQL Server Availability Group: [hadrag] Run 'EXEC sp_server_diagnostics 10'
returns following information" "ERR [RES] SQL Server Availability Group: [hadrag] ODBC Error: [42000]
[Microsoft][SQL Server Native Client 11.0][SQL Server]The user does not have permission to perform
this action. (297)"

NT AUTHORITY\SYSTEM account lacked the required permissions to failover the

availability group. To failover the group to the other node, NT AUTHORITY\SYSTEM
must have permission to connect to SQL (CONNECT SQL), failover the availability
group (ALTER ANY AVAILABILITY GROUP) and execute sp_server_diagnostics (VIEW
SERVER STATE).
Please view error 35206 for additional details.

Lease renewal failed with timeout error

Check this document for lease related errors/troubleshooting. This may be a

performance issue.

ERR [RES] SQL Server Availability Group <WSDBMEMAVG>: [hadrag] Availability Group is not healthy
with given HealthCheckTimeout and FailureConditionLevel
ERR [RES] SQL Server Availability Group <WSDBMEMAVG>: [hadrag] Resource Alive result 0.
ERR [RES] SQL Server Availability Group: [hadrag] Failure detected, diagnostics heartbeat is lost

This may be a performance issue.

Errors in Windows event log

Event ID, Source: Error message

Troubleshooting information
Event ID 1193, Microsoft-Windows-FailoverClustering: Cluster network name resource '%1' failed to
create its associated computer object in domain '%2' for the following reason: %3.
The associated error code is: %5
Please work with your domain administrator to ensure that:
- The cluster identity '%4' can create computer objects. By default all computer objects are created in
the 'Computers' container; consult the domain administrator if this location has been changed.
- The quota for computer objects has not been reached.
- If there is an existing computer object, verify the Cluster Identity '%4' has 'Full Control' permission to
that computer object using the Active Directory Users and Computers tool.
If %5 says Access is denied, then verify Create Computer objects permission
for cluster network name resource (CNO).
Steps:
1. On a domain controller, click Start, click Administrative Tools, and then click
Active Directory Users and Computers. If the User Account Control dialog box
appears, confirm that the action it displays is what you want, and then click
Continue.
2. Expand the default Computers container or the folder in which the cluster name

account (the computer account for the cluster) is located. Computers is located in
Active Directory Users and Computers/domain node/Computers.
3. Examine the icon for the cluster name account. It must not have a downwardpointing arrow on it, that is, the account must not be disabled. If it appears to
be disabled, right-click it and look for the command Enable Account. If you see
the command, click it.
4. On the View menu, make sure that Advanced Features is selected.
When Advanced Features is selected, you can see the Security tab in the properties
of accounts (objects) in Active Directory Users and Computers.
5. Right-click the default Computers container or the folder in which the cluster
name account is located.
6. Click Properties.
7. On the Security tab, click Advanced.
8. In the list of accounts with permissions, click the cluster name account, and
then click Edit.
Note If the cluster name account is not listed, click Add and add it to the list.
9. For the cluster name account, ensure that Allow is selected for the """Create
Computer objects permission. """
10. Click OK until you have returned to the Active Directory Users and Computers
snap-in.

Event ID 1194, Microsoft-Windows-FailoverClustering: Cluster network name resource '%1' failed to

create its associated computer object in domain '%2' for the following reason: %3.
The text for the associated error code is: %4
Please work with your domain administrator to ensure that:
- The cluster identity '%5' can create computer objects. By default all computer objects are created in
the 'Computers' container; consult the domain administrator if this location has been changed.
- The quota for computer objects has not been reached.
- If there is an existing computer object, verify the Cluster Identity '%5' has 'Full Control' permission to
that computer object using the Active Directory Users and Computers tool.

Like in event 1193, verify Create Computer objects permission for cluster
network name resource (CNO).
Try creating Client Access Point (CAP) with same name and IP to try to reproduce
the issue outside SQL.
Try pre-staging the name in AD (also add node name $ to the ACL with full
control). Now re-try creating CAP. More details in
http://technet.microsoft.com/enus/library/cc731002(WS.10).aspx#BKMK_steps_precreating .
Run Cluster validation.

Windows messages through net helpmsg

Error (Error message obtained through net helpmsg at command prompt)
Troubleshooting information

Error 665 (The requested operation could not be completed due to a file system limitation)

If database has FILESTREAM/FileTable, then check if 8dot3 name creation is enabled

for applicable drive(s). If it is enabled, disable it.
DOS command1: fsutil 8dot3name query C:
DOS command2: fsutil 8dot3name set C: 1

Error 5057 (The cluster IP address is already in use.)

Checked if IP is already in use. May require a new/unused IP depending on what is

being done.
Check permission for cluster network object CNO.

Error 5942 (The resource failed to come online due to the failure of one or more provider resources.)

Check for additional messages in logs.

Depending on the resource type, this generally requires involvement of Windows
team.
For multi-subnet cluster, check the HostRecordTTL property for network name. If
its set to 1200 (20 minutes), suggested value is 300 (5 minutes).
Try to remove SQL from this issue. For example, if this is a network name or an IP
address resource, then try creating a Client Access Point with same name and IP if
possible, and see it comes online. If it does not come online, then this requires
investigation by Windows team.

Error 8557 (Your computer could not be joined to the domain. You have exceeded the maximum
number of computer accounts you are allowed to create in this domain. Contact your system
administrator to have this limit reset or increased.)

For example, when a listener is created, a Computer Object is automatically

created in Active Directory. Even after Listener is dropped, that Computer Object
may remain in Active Directory. The other thing to know is that by default,
authenticated (domain) users are limited to create 10 Computer Objects in the
domain. Once they hit the limit, attempting to create the 11th will fail with this
error. User needs permission to create more than 10 Computer Objects or, more
likely, user needs to engage the Windows network team to clean up these Computer
Objects.

Errors in Windows Management Instrumentation (WMI)

Error: Error message
Troubleshooting information
Error 0x80070005
May get this error if don't have the appropriate permission. Try right click SQL
Server Configuration Manager and select Run as Administrator.

Verify wbemtest queries are working against root/default, cimv2 and SQL
namespaces.

Error 0x80041033

WBEM_E_SHUTTING_DOWN 2147749939 (0x80041033) --> User has requested an operation

while WMI is in the process of shutting down.
At the same time of error, check if application event log has message "Windows
Management Instrumentation has stopped WMIPRVSE.EXE because a quota reached a
warning value. (Microsoft-Windows-WMI, event id 5612). If this message is
present, then this issue may be because WMIPrvse process is not able to create the
required number of handles, it may be shutting down before completing the entire
proce ss. Increase the maximum number of handles per host in WMI. Steps to
increase max number of handles per host:
o Go to Start--> Run and type wbemtest.exe.
o Click Connect.
o In the namespace text box type "root" (without quotes).
o Click Connect.
o Click Enum Instances
o In the Class Info dialog box enter Superclass Name as
"__ProviderHostQuotaConfiguration" (without quotes) and press OK. Note: the
Superclass name includes a double underscore at the front.
o In the Query Result window, double-click "__ProviderHostQuotaConfiguration=@"
o In the Object Editor window, double-click HandlesPerHost.
o In the Value dialog, type in 8192
o Click Save Property.
o Click Save Object.
o Close Wbemtest.
o Restart the Windows Management Instrumentation service.
Note: This change means: the WMIPrvse which was eligible to create 4096 handles at
the max will be able to make 8196 handles at the max only if required. Even if
something makes the WMIprvse host to leak handles it will not cross 8192 handles.
Handles consumed are proportional to amount of resources consumed; especially Pool
resources. But I dont see that being a problem unless, we have a large number of
WMIPrvse leaking handles all at the same time and the system does not have much
RAM/Processor. Hence there should not be any impact as such, since this is a 64
bit server and we have ample resources on the server. However, I would not
recommend changing the default values unless there is a similar demand or any
other reason to increase the default value.

MSSQL-AzureSQL DBA PDF
50% (2)
MSSQL-AzureSQL DBA PDF
12 pages
SSIS
No ratings yet
SSIS
326 pages
SQL Server Monitoring and Performance
No ratings yet
SQL Server Monitoring and Performance
148 pages
Microsoft SQL Server DBA Knowledge Refresh Links-1
100% (2)
Microsoft SQL Server DBA Knowledge Refresh Links-1
5 pages
SQL Server Management Studio Keyboard Shortcuts
100% (12)
SQL Server Management Studio Keyboard Shortcuts
7 pages
Flow Chart Changed Documentation
No ratings yet
Flow Chart Changed Documentation
1 page
Microsoft SQL 2016 Step by Step Two Node Cluster
No ratings yet
Microsoft SQL 2016 Step by Step Two Node Cluster
53 pages
SQL Server Architecture Explained
100% (1)
SQL Server Architecture Explained
22 pages
MSSQL - SQL Server DBA Survival Guide PDF
100% (1)
MSSQL - SQL Server DBA Survival Guide PDF
936 pages
IBM - SQL DBA Interview Questions - SQL DBA Interview Questions
100% (1)
IBM - SQL DBA Interview Questions - SQL DBA Interview Questions
2 pages
SQL Server Questions
100% (1)
SQL Server Questions
223 pages
Disaster Recovery Options For Microsoft SQL Server
No ratings yet
Disaster Recovery Options For Microsoft SQL Server
7 pages
Reed Solomon Codes
No ratings yet
Reed Solomon Codes
6 pages
SQL Server DBA Training
100% (1)
SQL Server DBA Training
6 pages
Planning SQL Server
No ratings yet
Planning SQL Server
75 pages
Resume Pratik Anjania
No ratings yet
Resume Pratik Anjania
4 pages
TopicwiseSQLDBA Interview Questions
No ratings yet
TopicwiseSQLDBA Interview Questions
65 pages
Assigning Server and Database Roles
No ratings yet
Assigning Server and Database Roles
25 pages
DBA Checklist
No ratings yet
DBA Checklist
6 pages
SQL Server Always On Compute Engine
No ratings yet
SQL Server Always On Compute Engine
58 pages
Microsoft SQL Server AlwaysOn Solutions Guide For High Availability and Disaster Recovery
100% (1)
Microsoft SQL Server AlwaysOn Solutions Guide For High Availability and Disaster Recovery
33 pages
SQL Server Backup and Recovery
No ratings yet
SQL Server Backup and Recovery
251 pages
SQL SERVER DBA Class Room Training
No ratings yet
SQL SERVER DBA Class Room Training
28 pages
Ms SQL Server 2016 Training DBA 2018 Contents
100% (2)
Ms SQL Server 2016 Training DBA 2018 Contents
16 pages
SQL Server Security
No ratings yet
SQL Server Security
27 pages
Sqldba Faq Scenarios
No ratings yet
Sqldba Faq Scenarios
27 pages
El Hinduismo
No ratings yet
El Hinduismo
3,953 pages
Kubernetes Basic To Advance End To End
100% (5)
Kubernetes Basic To Advance End To End
295 pages
SITRAIN Training For: Automation and Industrial Solutions
100% (1)
SITRAIN Training For: Automation and Industrial Solutions
41 pages
Structured Query Language (SQL)
100% (1)
Structured Query Language (SQL)
73 pages
SQL Server DBA Phone Interview Questions
No ratings yet
SQL Server DBA Phone Interview Questions
38 pages
SQL Server Architecture
No ratings yet
SQL Server Architecture
13 pages
Acer Extensa 5620-6419 Service Manual2
No ratings yet
Acer Extensa 5620-6419 Service Manual2
157 pages
Microsoft SQL Server Architecture
No ratings yet
Microsoft SQL Server Architecture
18 pages
SQL SERVER - Performance Monitoring Tool - SQLGrease
No ratings yet
SQL SERVER - Performance Monitoring Tool - SQLGrease
2 pages
Lab Manual Windows PowerShell Scripting
100% (1)
Lab Manual Windows PowerShell Scripting
21 pages
Database Mirroring Interview Questions
No ratings yet
Database Mirroring Interview Questions
5 pages
SQL Server Support DBA
No ratings yet
SQL Server Support DBA
24 pages
Azure DevOps Explained Get Started With Azure DevOps and Develop Your DevOps Practices
100% (3)
Azure DevOps Explained Get Started With Azure DevOps and Develop Your DevOps Practices
497 pages
SQL Server Clustering For Dummies
No ratings yet
SQL Server Clustering For Dummies
17 pages
MRP Clean Up Before Starting Planning Manager
No ratings yet
MRP Clean Up Before Starting Planning Manager
6 pages
AlwaysOn Availability Groups Creation
No ratings yet
AlwaysOn Availability Groups Creation
11 pages
KDSSG Center For Excellence: SQL Server 2005 DBA Installation Checklist
No ratings yet
KDSSG Center For Excellence: SQL Server 2005 DBA Installation Checklist
5 pages
All Netapp Print Ha
No ratings yet
All Netapp Print Ha
60 pages
AWS Certified Solutions Architect Professional Slides v6
100% (8)
AWS Certified Solutions Architect Professional Slides v6
823 pages
SQL 2008 Dba
100% (1)
SQL 2008 Dba
314 pages
Garbage Collection
No ratings yet
Garbage Collection
15 pages
CF-28 Service Repair Manual
No ratings yet
CF-28 Service Repair Manual
204 pages
Five Things That Fix Bad SQL Server Performance
No ratings yet
Five Things That Fix Bad SQL Server Performance
3 pages
Jon Bonso - AWS Certified Solutions Architect Associate SAA-C03-Tutorials Dojo (2022)
67% (9)
Jon Bonso - AWS Certified Solutions Architect Associate SAA-C03-Tutorials Dojo (2022)
288 pages
Free 3D Carton Box Template Packaging Design Software
No ratings yet
Free 3D Carton Box Template Packaging Design Software
33 pages
Ibm Error Report
No ratings yet
Ibm Error Report
856 pages
KeyShot Manual
No ratings yet
KeyShot Manual
47 pages
SDLC
No ratings yet
SDLC
19 pages
LCD & Keypad Interfacing
No ratings yet
LCD & Keypad Interfacing
28 pages
A Quick Guide To Base64
No ratings yet
A Quick Guide To Base64
9 pages
Apply Function To Each Element of Array - MATLAB Arrayfun - MathWorks India
No ratings yet
Apply Function To Each Element of Array - MATLAB Arrayfun - MathWorks India
4 pages
ADF Code Corner: 109. How-To Further Filter Detail Queries Based On A Condition in The Parent View Using ADF BC
No ratings yet
ADF Code Corner: 109. How-To Further Filter Detail Queries Based On A Condition in The Parent View Using ADF BC
7 pages
Dell Latitude PDF
No ratings yet
Dell Latitude PDF
8 pages
TGW-700 Quick Start en v15
No ratings yet
TGW-700 Quick Start en v15
8 pages
Running COMSOL in Parallel On Clusters - 1001 - Knowledge Base
No ratings yet
Running COMSOL in Parallel On Clusters - 1001 - Knowledge Base
4 pages
Administering A SQL Database Infrastructure
No ratings yet
Administering A SQL Database Infrastructure
3 pages
023 POL902.00 CB1Q3934en 01 ModBus
No ratings yet
023 POL902.00 CB1Q3934en 01 ModBus
12 pages
SQL Server DBA Interview Questions and Answers: Answer
No ratings yet
SQL Server DBA Interview Questions and Answers: Answer
13 pages
Log Shipping Best Practices
No ratings yet
Log Shipping Best Practices
22 pages
SQL Server DBA Training
No ratings yet
SQL Server DBA Training
6 pages
Smart Forms
100% (1)
Smart Forms
47 pages
AWS Cloud Practitioner Full Course
86% (14)
AWS Cloud Practitioner Full Course
246 pages
AWS Short Notes
100% (1)
AWS Short Notes
161 pages
02 Redbull Driver in The Loop Motionplatform Paper
No ratings yet
02 Redbull Driver in The Loop Motionplatform Paper
6 pages
SQL Server Clustering
No ratings yet
SQL Server Clustering
4 pages
Object S - TABU - DIS
No ratings yet
Object S - TABU - DIS
5 pages
NV200 Manual Set - Section 5
No ratings yet
NV200 Manual Set - Section 5
16 pages
Aws Solution Architect Associate Guide
100% (2)
Aws Solution Architect Associate Guide
43 pages
Ansible For Kubernetes PDF
100% (6)
Ansible For Kubernetes PDF
172 pages
MetalGearRising - Rev+21Tr-LNG - v1.2 - I.N.F.O
No ratings yet
MetalGearRising - Rev+21Tr-LNG - v1.2 - I.N.F.O
3 pages
AWS Certified Solutions Architect Associate (Jon Bonso and Adrian Formaran)
80% (5)
AWS Certified Solutions Architect Associate (Jon Bonso and Adrian Formaran)
236 pages
Terraform Dump
100% (1)
Terraform Dump
106 pages
Amazon AWS Certified Solutions Architect Associate PDF
100% (2)
Amazon AWS Certified Solutions Architect Associate PDF
369 pages
Terraform Full Course
75% (4)
Terraform Full Course
34 pages
Terraform Certified
100% (3)
Terraform Certified
121 pages
Kubernetes Practicals Ebook
67% (3)
Kubernetes Practicals Ebook
187 pages
Terraform Practice Guide
100% (14)
Terraform Practice Guide
109 pages
Tutorials Dojo Study Guide and Cheat Sheets AWS Certified Cloud Practitioner 2021 10 01 xrhf1w
100% (13)
Tutorials Dojo Study Guide and Cheat Sheets AWS Certified Cloud Practitioner 2021 10 01 xrhf1w
196 pages
Hands-On Kubernetes On Azure
100% (4)
Hands-On Kubernetes On Azure
330 pages
AWS Certified Solution Architect Associate Study Guide V1.0 Abdul Jaseem VP Release 30 Aug 2020
100% (6)
AWS Certified Solution Architect Associate Study Guide V1.0 Abdul Jaseem VP Release 30 Aug 2020
235 pages
Learn Kubernetes 5 Minutes at A Time
No ratings yet
Learn Kubernetes 5 Minutes at A Time
187 pages
Container Networking Docker Kubernetes
100% (8)
Container Networking Docker Kubernetes
72 pages
100 Days of Kubernetes
100% (4)
100 Days of Kubernetes
121 pages
AWS Course - All Slides
80% (10)
AWS Course - All Slides
879 pages
Kubernetes Tutorial
100% (11)
Kubernetes Tutorial
83 pages
AWS CSAA Practice-Questions DCT V08-Ambu0d
100% (11)
AWS CSAA Practice-Questions DCT V08-Ambu0d
411 pages
Terraform Interview Questions Guide
100% (3)
Terraform Interview Questions Guide
11 pages
Cka PDF
80% (5)
Cka PDF
58 pages
55 Junior SQL Server Dba Interview Questions
No ratings yet
55 Junior SQL Server Dba Interview Questions
6 pages
Kubernetes Made Easy
100% (4)
Kubernetes Made Easy
136 pages
Azure Devops Complete Ci CD Pipeline PDF
100% (8)
Azure Devops Complete Ci CD Pipeline PDF
109 pages
Aws Lambda Tutorial
86% (7)
Aws Lambda Tutorial
393 pages
AWS Services Overview
67% (3)
AWS Services Overview
5 pages
Mastering PL/SQL Through Illustrations: From Learning Fundamentals to Developing Efficient PL/SQL Blocks (English Edition)
From Everand
Mastering PL/SQL Through Illustrations: From Learning Fundamentals to Developing Efficient PL/SQL Blocks (English Edition)
B Chandra
No ratings yet
Getting Started with SQL Server 2014 Administration
From Everand
Getting Started with SQL Server 2014 Administration
Gethyn Ellis
No ratings yet
Oracle Data Guard 11gR2 Administration Beginner's Guide
From Everand
Oracle Data Guard 11gR2 Administration Beginner's Guide
Emre Baransel
No ratings yet
Oracle Database Mastery: Comprehensive Techniques for Advanced Application
From Everand
Oracle Database Mastery: Comprehensive Techniques for Advanced Application
Adam Jones
No ratings yet
Oracle Database 11g - Underground Advice for Database Administrators: Beyond the basics
From Everand
Oracle Database 11g - Underground Advice for Database Administrators: Beyond the basics
April C. Sims
No ratings yet
Instant PostgreSQL Backup and Restore How-to
From Everand
Instant PostgreSQL Backup and Restore How-to
Shaun Thomas
No ratings yet
My Part-Time Study Notes on Mssql Server
From Everand
My Part-Time Study Notes on Mssql Server
Morris Sebenzile Mntoninzi
No ratings yet
Oracle 19c AutoUpgrade Best Practices: A Step-by-step Expert-led Database Upgrade Guide to Oracle 19c Using AutoUpgrade Utility
From Everand
Oracle 19c AutoUpgrade Best Practices: A Step-by-step Expert-led Database Upgrade Guide to Oracle 19c Using AutoUpgrade Utility
Sambaiah Sammeta
No ratings yet
DB2 9.7 for Linux, UNIX, and Windows Database Administration: Certification Study Notes
From Everand
DB2 9.7 for Linux, UNIX, and Windows Database Administration: Certification Study Notes
Roger E. Sanders
5/5 (1)
Starting Database Administration: Oracle DBA
From Everand
Starting Database Administration: Oracle DBA
anuragbaruah84
3/5 (2)
Installing SQL Server 2012 Step by Step
From Everand
Installing SQL Server 2012 Step by Step
Stephen Thomas
No ratings yet
Microsoft SQL Server 2012 Administration: Real-World Skills for MCSA Certification and Beyond (Exams 70-461, 70-462, and 70-463)
From Everand
Microsoft SQL Server 2012 Administration: Real-World Skills for MCSA Certification and Beyond (Exams 70-461, 70-462, and 70-463)
Tom Carpenter
No ratings yet
Oracle SQL and PL/SQL
From Everand
Oracle SQL and PL/SQL
Niraj Gupta
4.5/5 (8)
SQL Server: Tips and Tricks - 2
From Everand
SQL Server: Tips and Tricks - 2
Priyanka Agarwal
4.5/5 (3)
SQL Server: Tips and Tricks - 1
From Everand
SQL Server: Tips and Tricks - 1
Priyanka Agarwal
5/5 (1)
Oracle Database Administration Interview Questions You'll Most Likely Be Asked: Job Interview Questions Series
From Everand
Oracle Database Administration Interview Questions You'll Most Likely Be Asked: Job Interview Questions Series
Vibrant Publishers
5/5 (1)
Oracle Ultimate DBA Interview Questions
From Everand
Oracle Ultimate DBA Interview Questions
equitypress
4.5/5 (3)
SQL Server Management Studio Second Edition
From Everand
SQL Server Management Studio Second Edition
Gerardus Blokdyk
1/5 (1)
Oracle Data Guard A Clear and Concise Reference
From Everand
Oracle Data Guard A Clear and Concise Reference
Gerardus Blokdyk
No ratings yet
Oracle Exadata Complete Self-Assessment Guide
From Everand
Oracle Exadata Complete Self-Assessment Guide
Gerardus Blokdyk
No ratings yet

Troubleshooting SQL Server AlwaysOn

Uploaded by

Troubleshooting SQL Server AlwaysOn

Uploaded by

Troubleshooting SQL Server AlwaysOn

Copyright 2014 by Microsoft Corporation

Microsoft and the trademarks listed at http://www.microsoft.com/about/legal/en/us/IntellectualProperty/Trademarks/EN-US.aspx

Section a - Troubleshooting Applications

Application connectivity introduction.

(SQL Server Native

http://technet.microsoft.com/en-us/library/hh213417.aspx (Availability Group

Application connection string.

This is expected. Application should have connection retry logic.

Application hang after AlwaysOn group failover.

TSQL: CREATE LOGIN [LLL] WITH PASSWORD='dddd', DEFAULT_DATABASE=[master],

Section b - Troubleshooting Network

Section c - Troubleshooting Performance

Frequently Asked Questions (FAQs).

of default FCI logs (\LOG\*_SQLDIAG_*.xel).

-- Below to change number of ERRORLOG files.

a limitation of Windows 2008 R2 cluster. This issue does not occur in

Primary replica database becoming unresponsive

To specifically address the lease timeout, increase the availability group

Questions on secondary replicas. The function sys.fn_hadr_backup_is_preferred_replica returns 0 no matter

Linked server configuration with AlwaysOn listener.

FAIL_PAGE_ALLOCATION 1 in SQL ERRORLOG when using AlwaysOn Functionality.

Check the max and min server memory correctly.

Slow commit performance problem for replica in synchronous commit mode.

High HADR_WORK_QUEUE wait.

High HADR_LOGCAPTURE_WAIT wait.

Availability Group(s) with replicas on standalone instances.

Availability Group(s) with replica(s) on FCIs (Failover Cluster Instances).

steps (FCI rolling patching). Patching steps (basic):

Optimized downtime, more steps (Leverage AG failover patching). Patching

Move FCI2 from Node3 to Node4

SQL Server 2012 RTM

FIX: Error code 20598 when a failover operation occurs

Description of new features in SQL Server 2012 and SQL

Improved Metadata Discovery process performance in

FIX: Poor performance in SQL Server 2012 when you

run a SQL Server trace

FIX: High "log write waits" counter value on a SQL

FIX: Scheduler deadlock on AlwaysOn Availability Group

Related SQL Server, Operating System (OS) and additional fixes.

A hotfix is available to let you

.NET 3.5 SP1

causes a Windows Server 2008 R2

OS 2008 SP2, 2008 R2

Pre-required fixes are indicated in http://technet.microsoft.com/en-us/library/ff878487.aspx (Prerequisites,

Section f - Reference documents for working of AlwaysOn

Applications and AlwaysOn

Availability Group listener in SQL 2012

Availability Groups in SQL 2012

Script for SQL Agent job alert.

AlwaysOn in SQL 2014

What's New (Database Engine) (section 'AlwaysOn enhancements')

Powershell and AlwaysOn

Powershell optimization "-NoRefresh"

View health policy (similar to dashboard information) through Powershell.

Windows Azure (IaaS) and AlwaysOn

GUI steps for AlwaysOn on Windows Azure (IaaS)

Section g - Similar technologies from other organizations

Section h Data collection

Executing TSQL etc. commands through SQL Management Studio.

Section i Errors in SQL Server

Error, Severity: Error message

If the second %ls is RESTORING, then one option is below.

original database. Used the output from sysaltfiles to verify this.

If the changed to state is PRIMARY_PENDING, then check

automatic failover problems in SQL Server 2012 AlwaysOn environments).

Check for associated errors like error 41009.

ALTER SERVER ROLE [sysadmin] ADD MEMBER [NT AUTHORITY\system];

Link redirects to http://msdn.microsoft.com/en-us/library/ff877956.aspx and the

Check if Windows cluster service is running.

If AG still exists, DROP AVAILABILITY GROUP.

Notice this error says cluster service not node.

Notice this error says cluster node not service.

This is an informational message only. If there is a problem, then look for

Verify pre-requisite (for AlwaysOn) Windows fixes are installed.

Check logs for additional errors.

of default FCI logs (\LOG\_SQLDIAG_.xel).