Performance Diagnosis Using DMV's & DMF's in SQL 2005 & 2008
Prepared for Administrators, Support Engineers, Customers, Field Engineers and Consultants. Monday, 21 January 2008. Version .1
The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication and is subject to change at any time without notice to you. This document and its contents are provided AS IS without warranty of any kind, and should not be interpreted as an offer or commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT. The descriptions of other companies' products in this document, if any, are provided only as a convenience to you. Any such references should not be considered an endorsement or support by Microsoft. Microsoft cannot guarantee their accuracy, and the products may change over time. Also, the descriptions are intended as brief highlights to aid understanding, rather than as thorough coverage. For authoritative descriptions of these products, please consult their respective manufacturers. This deliverable is provided AS IS without warranty of any kind and MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, OR OTHERWISE. All trademarks are the property of their respective companies. © 2005 Microsoft Corporation. All rights reserved. Microsoft and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
Performance diagnosis using DMVs and DMFs in SQL Server 2005 & 2008, Administrators, Support Engineers, Customers,Field Engineer and Consultants, Version .1 Prepared by Nickson Dicson "61847472.doc" last modified on 21 Jan. 08, Rev 41
Performance diagnosis using DMVs and DMFs in SQL Server 2005 & 2008, Prepared by Nickson Dicson last modified on 21st Dec 2008
Table of Contents
Introduction
    Purpose
    Audience
Knowing and understanding DMVs and DMFs
Retrieving DMVs definition
Categorization of DMVs
Causes of SQL Server slowness
Querying DMVs and DMFs for performance diagnosis
DMVs and DMFs introduced in SQL Server 2008
INTRODUCTION
Purpose
This document provides information on diagnosing and identifying system bottlenecks using DMVs and DMFs, which can be used to analyse and troubleshoot SQL Server 2005 performance. It provides brief information and conceptual knowledge on the causes of system bottlenecks. SQL Server 2008 DMVs have also been added; however, complete and detailed queries for the additional SQL Server 2008 DMVs will be compiled in later versions.
Audience
This document is intended for SQL Server Support Engineers, Database Administrators, Technology Solution Specialists, Consultants and Field Engineers involved in maintaining and troubleshooting performance in SQL Server 2005 and SQL Server 2008.
"61847472.doc" Created by Nickson Dicson
KNOWING AND UNDERSTANDING DMVS AND DMFS
One of the most common and most serious problems in any SQL Server deployment, and the one that most often puts professionals in the hot seat, is the big P: performance. Troubleshooting performance is necessary, but to troubleshoot you first need to diagnose the problem and know its cause.

SQL Server 2005 introduced many new features. Among them were dynamic management views (DMVs) and dynamic management functions (DMFs), which changed the entire approach to collecting system information and troubleshooting SQL Server performance. Dynamic management views and functions return server state information that can be used to monitor the health of a server instance, diagnose problems, and tune performance. There are two types of dynamic management views and functions:

1) Server-scoped dynamic management views and functions. These require VIEW SERVER STATE permission on the server.
2) Database-scoped dynamic management views and functions. These require VIEW DATABASE STATE permission on the database.

In earlier versions of SQL Server, system tables (virtual tables) such as sysprocesses and syslockinfo were used to view system state and locking information respectively. These tables were called "virtual" because the rows are never physically stored on disk but are materialized from underlying server data structures when a user queries the table. A couple of other tables were also used, but none of them exposed SQL Server or system information in much detail.

With the introduction of DMVs and DMFs, the approach to troubleshooting and collecting system information has changed. DMVs and DMFs expose more information, and expose it more accurately. Dynamic management views do not require any parameters, whereas dynamic management functions generally accept parameters that qualify what data to return; both return their data as a rowset.
The SQL Server virtual tables discussed above were based on a class named UTRowset. SQL Server 2005 instead uses the streaming table-valued function (STVF) framework for developing DMVs, which gives more flexibility in gathering information from system table structures. In a nutshell, in SQL Server 2005 some DMVs are written using the STVF framework while others are implemented using the older UTRowset framework.
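As a minimal sketch of the points above, assuming a SQL Server 2005 or later instance, the following batch first checks whether the current login holds the two permissions, then contrasts a DMV (queried with no parameters) with a DMF (which takes a parameter). The choice of columns and the filter on the current session (@@SPID) are illustrative assumptions, not part of the original text.

```sql
-- Does the current login hold the permissions the DMVs/DMFs require?
-- (1 = yes, 0 = no)
SELECT HAS_PERMS_BY_NAME(NULL, NULL, 'VIEW SERVER STATE')               AS has_view_server_state,
       HAS_PERMS_BY_NAME(DB_NAME(), 'DATABASE', 'VIEW DATABASE STATE')  AS has_view_database_state;

-- A DMV takes no parameters: it is queried like any other view.
SELECT session_id, status, command, sql_handle
FROM sys.dm_exec_requests
WHERE session_id = @@SPID;          -- illustrative filter: the current session only

-- A DMF takes parameters: here sql_handle is passed in via CROSS APPLY,
-- and the function still returns its data as a rowset.
SELECT r.session_id, t.[text] AS running_sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id = @@SPID;
```

Without VIEW SERVER STATE, the last two queries would be expected to fail or return a restricted result, which is why the permission check comes first.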
RETRIEVING DMVS DEFINITION
Just as we can retrieve the definition of a user stored procedure, we can also see the definition of a DMV and the global data structure it queries. We can use the sp_helptext procedure to get the body of a DMV, for example:

    sp_helptext 'DMV'

You can also use the OBJECT_DEFINITION function for the same purpose:

    SELECT OBJECT_DEFINITION(OBJECT_ID('sys.dm_os_memory_objects')) AS [Object Definition];
Views whose definition contains the format "OPENROWSET (TABLE name)" are STVF-based views, while those that contain "OPENROWSET(name)" are UTRowset-based views. Below is an example of an "OPENROWSET (TABLE name)" definition, that is, an STVF-based view:

    CREATE VIEW sys.dm_clr_tasks AS
    SELECT * FROM OpenRowset(TABLE DM_CLR_TASKS)
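The rule above can be applied to every DMV and DMF at once. The following is a rough sketch only: it assumes OBJECT_DEFINITION returns text for the system objects on your build, and it classifies purely by the text pattern described above, so the framework_guess column (a name made up for this example) is a guess, not an authoritative answer.

```sql
-- Classify each DMV/DMF by the OPENROWSET pattern found in its definition.
-- Spaces are stripped and the text upper-cased so that
-- "OpenRowset(TABLE X)" and "OPENROWSET (TABLE X)" are treated alike.
SELECT o.name,
       o.type_desc,
       CASE
           WHEN d.normalized LIKE '%OPENROWSET(TABLE%' THEN 'STVF-based'
           WHEN d.normalized LIKE '%OPENROWSET(%'      THEN 'UTRowset-based'
           ELSE 'not determined'
       END AS framework_guess          -- hypothetical column name
FROM sys.system_objects AS o
CROSS APPLY (SELECT REPLACE(UPPER(OBJECT_DEFINITION(o.object_id)), ' ', '') AS normalized) AS d
WHERE o.name LIKE 'dm[_]%'             -- DMVs and DMFs share the dm_ prefix
ORDER BY o.name;
```

A UTRowset whose name happens to begin with TABLE would be misclassified by this pattern match, so spot-check any surprising rows with sp_helptext.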
CATEGORIZATION OF DMVS
SQL Server 2005 introduced more than 80 DMVs, which are categorized by the SQL Server components they relate to, such as the system and the SQL Server engine. For instance, the SQLOS-related DMVs concentrate on collecting system / engine information, and are an alternative to collecting dumps or process-level information for in-depth troubleshooting. Below is a brief description of some of the DMVs associated with SQL Server and the system.

1) SQLOS / Engine.
Sr no | DMV / DMF | Description
1 | sys.dm_os_tasks | A task represents a single batch of a request. It maintains the context that is used to run the batch.
2 | sys.dm_os_workers | This table contains a row for every worker in the system. A worker can be mapped to a thread or can be a fiber. A task, representing a batch, is assigned a worker (one-to-one mapping). This association is kept until the task ends. After this, the worker is available to be assigned to a new task.
3 | sys.dm_os_threads | A worker can either be a fiber or mapped to a thread that is used to run a request on a scheduler. In the case where lightweight pooling is turned off (default), there is a one-to-one mapping between worker and thread. This mapping is maintained until the worker is deallocated either due to memory pressure or if it has been idle for a long time. When lightweight pooling is turned on, there is a many-to-one mapping between workers and threads (essentially, many fibers, i.e. workers, can run on an individual thread). Workers and threads are pooled so as to support a lot of connections (like a thread pool).
4 | sys.dm_os_wait_stats | This table contains information on waits encountered by executing threads. It replaces DBCC SQLPerf (WaitStats). There are three general types of waits: Resource Waits, Queue Waits and External Waits.
5 | sys.dm_os_waiting_tasks | This DMV describes the wait queue of tasks, which are waiting on some resource. It is a simultaneous representation of all wait-queues in the system.
6 | sys.dm_os_sysinfo | It shows a miscellaneous set of useful information about the machine and the resources available to and consumed by SQL Server.
7 | sys.dm_os_stacks | This table is used internally to keep track of debug data like outstanding allocations.
8 | sys.dm_os_schedulers | This table contains one row per scheduler in the system. This is mapped to an individual processor in the system. If an affinity was specified, then the scheduler will always run on the particular processor. This table is primarily used to monitor the health of a scheduler or to identify run-away tasks.
9 | sys.dm_os_ring_buffers | This table lists all ring buffers in SQL Server. A ring buffer is a memory buffer that keeps a history of a number of last records. The ring buffer implementation utilizes a circular memory buffer as the main storage for records.
10 | sys.dm_os_performance_counters | This view returns a row per performance counter maintained by the server.
11 | sys.dm_os_memory_clerks | Memory clerks implement access to the memory node interface to allocate memory. A clerk also tracks the memory allocated through it, for diagnostics. Every component allocating a substantial amount of memory needs to create its own memory clerk and allocate all its memory using the clerk interfaces. During startup, components create their corresponding clerks.
12 | sys.dm_os_loaded_modules | This dynamic management view has a row for each module loaded into the server address space.
13 | sys.dm_os_latch_stats | A latch is a lightweight synchronization object used by various components within SQL Server. A latch wait occurs when a latch request cannot be granted immediately because the latch is held by some other thread in a conflicting mode. Note that this DMV only tracks latch waits; it does not track latch requests that were granted immediately or failed without waiting. Specifically, num_of_requests only tracks the number of waits. This DMV can be used to narrow down or pinpoint the source of latch contention by examining the relative wait numbers and wait times for the different latch classes.
14 | sys.dm_os_hosts | SQL Server can have external components run inside its process. An example of an external component is an OLEDB provider used to run distributed queries. In order to keep track of the resources it uses, SQL Server exposes the concept of a host that allows this component to allocate memory, run tasks, etc. inside of SQL Server. This enables SQL Server to keep track of resource consumption by these components and react accordingly. This DMV exposes all the currently registered hosts on a SQL Server and the amount of resources they have used.
15 | sys.dm_os_cluster_nodes | When clustering is enabled, the SQL Server instance can run on any of the nodes of the cluster that are designated as part of the SQL Server virtual server. Each of the rows in this view represents the name of a node in the virtual server.
16 | sys.dm_os_buffer_descriptors | The DMV exposes all buffer pool buffer descriptors that are in use by a database on the server. Therefore, free or stolen pages will not be exposed. Also, pages that had errors when read will not be exposed. Pages in use by the resource database will be exposed.
2) Execution Related.

Sr no | DMV / DMF | Description
1 | sys.dm_exec_connections | This DMV provides information about the connections established to this SQL Server on various databases by different local / remote users, and the details of each connection.
2 | | Contains one row per authenticated session on the SQL Server.
3 | | Provides information about each request executing within SQL Server.
4 | | This Dynamic Management Function (DMF) provides the text of the SQL statement given the sql_handle for that statement. SQL handles are direct hash maps of the SQL texts and are rarely the same for two different SQL queries.
5 | sys.dm_exec_query_plan | Returns the cached plan for a given plan_handle.
6 | sys.dm_exec_query_stats | Provides aggregate performance statistics for cached query plans. The view contains one row per query plan and the lifetime of the row is tied to the plan itself. When the plan is removed from the cache for whatever reason, the row is eliminated from dm_exec_query_stats.
7 | | This view provides detailed statistics about the operation of the SQL Server query optimizer. It also provides the optimization cost incurred on a query plan.
8 | | This DMV provides information about the query execution plans that are cached by SQL Server for faster query execution.
9 | | This dynamic management function provides information about cursors opened on various databases and their details.
3) IO Related.
Sr no | DMV / DMF | Description
1 | sys.dm_io_pending_io_requests | This table contains a row for each pending I/O in the system. An I/O that has been pending for a while (io_pending == TRUE) could indicate a problem with either the OS or the I/O hardware.
2 | sys.dm_io_virtual_file_stats | Returns I/O statistics for database files, including log files.
3 | sys.dm_io_cluster_shared_drives | When clustering is enabled, the nodes of a cluster require a shared disk array for storing data that may be accessed by a second node during failover. Each of the rows in this view represents a single disk of that shared disk array.
4 | sys.dm_io_backup_tapes | This dynamic management view identifies the list of backup devices and the status of requests to mount them for backing up.
4) Transaction Related.
Sr no | DMV / DMF | Description
1 | sys.dm_tran_locks | This view contains information about currently active lock manager resources. This includes regular locks and lock-related notification requests. Each row represents a currently active request to the lock manager that has either been granted or is waiting to be granted, i.e. the request is blocked by an already granted request.
2 | sys.dm_tran_database_transactions | Each row in this table represents a per-transaction, per-database object, known as XDES. Each XDES holds transaction information relevant to the given database.
3 | sys.dm_tran_session_transactions | This gives transaction information for every session in SQL Server.
4 | sys.dm_tran_version_store | It returns a virtual table that displays all version records in the version store. This is an expensive DMV to run as it has to query the entire version store, which can potentially be very large. Each versioned record is stored as binary data along with some tracking / status information.
5 | sys.dm_tran_current_transaction | It returns a single row that displays state information of the transaction in the current session.
6 | sys.dm_tran_current_snapshot | It displays the transaction sequence numbers of all transactions active at the time the current snapshot transaction starts. If the current transaction is not running under snapshot isolation, it returns no rows. This is similar to dm_tran_transactions_snapshot except that it only shows the active transactions for the current snapshot.
7 | sys.dm_tran_active_transactions | Each row in this table represents an active transaction in this instance.
5) Index Related.
Sr no | DMV / DMF | Description
1 | sys.dm_db_index_usage_stats | This view allows you to display usage counts for individual indexes, and the time of the last operation performed on individual indexes, for various kinds of operations. An index use is an individual seek, scan, lookup, or update on that index by one query execution. Information is reported both for operations caused by user-submitted queries, and operations caused by internally generated queries, such as scans for gathering statistics.
2 | sys.dm_db_partition_stats | Returns page and row count information for every partition in the current database.
3 | sys.dm_db_index_physical_stats | Displays size and fragmentation information for the data and indexes of the specified table. For an index, one row of statistics can be returned for each level of the b-tree in each partition. For heaps, one row of statistics is returned for the allocation unit of each partition. For LOB data, one row of statistics is returned for the allocation unit of each partition. For row-overflow data, one row of statistics is returned for each partition if any row-overflow data has been created.
4 | sys.dm_db_index_operational_stats | Returns various performance statistics for indexes and tables within a given database.
6) Service Broker.
Sr no | DMV / DMF | Description
1 | sys.dm_broker_activated_tasks | This Dynamic Management View (DMV) contains a row for each stored procedure activated by Service Broker.
2 | sys.dm_broker_connections | This dynamic management view contains one row for each Service Broker network connection.
3 | sys.dm_broker_forwarded_messages | This dynamic management view contains a row for each Service Broker message that the SQL Server instance is in the process of forwarding.
4 | sys.dm_broker_queue_monitors | This dynamic management view contains a row for each queue monitor in the instance. A queue monitor manages activation for a queue.
7) Replication Related.
Sr no | DMV / DMF | Description
1 | sys.dm_repl_articles | This Dynamic Management Function (DMF) provides information about the different articles (tables / stored procedures) published / subscribed as a part of SQL Server replication, at SQL command level.
2 | sys.dm_repl_schemas | This Dynamic Management Function (DMF) provides information about the different articles (tables / stored procedures) published / subscribed as a part of SQL Server replication, at object level.
8) Full-Text Search Related.

Sr no | DMV / DMF | Description
1 | sys.dm_fts_active_catalogs | This dynamic management view returns information on the full-text catalogs that have some population activity in progress on the server.
2 | sys.dm_fts_crawls | This dynamic management view returns information about the full-text index populations currently in progress.
3 | sys.dm_fts_crawl_ranges | This dynamic management view returns information about the specific ranges (sub-tasks) related to a full-text index population currently in progress.
4 | sys.dm_fts_memory_buffers | This dynamic management view returns information about memory buffers that belong to a specific memory pool and are used as a part of a full-text crawl or a full-text crawl range.
5 | sys.dm_fts_memory_pools | This dynamic management view returns information about the memory pools used during a full-text crawl or a full-text crawl range.
9) CLR
Sr no | DMV / DMF | Description
1 | sys.dm_clr_tasks | Every SQL OS task starts a CLR Host task whenever it requires execution of managed code. This DMV shows information on all CLR tasks currently running.
2 | sys.dm_clr_appdomains | An application domain is a construct in the CLR that is the unit of isolation for an application. This dynamic management view has a row for each application domain in the server.
3 | sys.dm_clr_properties | This DMV exposes all the properties related to CLR integration. This DMV requires CLR to be enabled and initialized. CLR is initialized by execution of any managed routines, types or triggers.
4 | sys.dm_clr_loaded_assemblies | This dynamic management view has a row for each managed (CLR) user assembly loaded into the server address space.
CAUSES OF SQL SERVER SLOWNESS
The primary aim of SQL Server as an RDBMS is to serve requests from client processes. SQL Server makes intelligent use of its own internal components and of external system components to serve those requests as efficiently as possible. However, many factors can impact overall performance. Over time the database may grow, the number of users may increase, hardware performance may degrade, and the server may no longer be able to handle the increased load. SQL Server can also perform slowly because of contention in tempdb, which is used by many queries, because of slow-running queries, and so on. Below are some of the main causes of slow performance.

1) System resource.
2) Slow query / plan.
3) Tempdb contention.
4) Blocking / Deadlocks.
1) System Resource

Every process executed through SQL Server needs system resources in order to complete: CPU, I/O, memory, network and so on. These resources form an integral part of every system. SQL Server may use them optimally, or they may be underused because of bad planning or application design. You might find that the problem is a resource that is running near capacity and that SQL Server cannot support the workload in its current configuration. To address this issue, you may need to add more processing power or memory, or increase the bandwidth of your I/O or network channel. But before moving to a higher-end machine or adding resources, it is important to diagnose whether the resource really is under contention or performing badly, or whether there is a configuration problem. Below are some of the causes of high CPU conditions.

a) Statistics are not updated (or not updated with FULLSCAN after a bulk load). This can cause queries to choose the wrong execution plan. For example, a nested loop join can be replaced by a hash join, causing a CPU spike.
b) Table variables holding a large number of rows are used in queries and joined with regular tables. Table variables do not maintain statistics, which can lead to a bad execution plan.
c) Stored procedures that use parameters. Depending on the parameters, the number of rows retrieved can be very different; for example, a query that retrieves 4 months or 2 months of data based on a parameter. These queries need different execution plans; if they reuse an execution plan that is already in memory but was built for different parameter values, or are recompiled to avoid this, they can cause CPU to spike.
d) When a large query is rolled back, all the work it has already done must be rolled back too, which can drive CPU high.
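For cause a) above, a quick first check is when statistics were last updated. The query below is a minimal sketch for the current database; it only reports dates, so whether a given date counts as "stale" is a judgment that depends on how much the data has changed, which this query does not measure.

```sql
-- When were the statistics behind each index on user tables last updated?
-- Oldest first; NULL means no statistics date is recorded for that index.
SELECT OBJECT_NAME(i.object_id)              AS table_name,
       i.name                                AS index_name,
       STATS_DATE(i.object_id, i.index_id)   AS stats_last_updated
FROM sys.indexes AS i
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.index_id > 0                         -- skip heaps, which have no index statistics
ORDER BY stats_last_updated;
```

Tables that were bulk loaded after the reported date are the natural candidates for UPDATE STATISTICS ... WITH FULLSCAN.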
2) Slow query / Plan.

A query executed against SQL Server goes through several phases (parsing, normalization, compilation and optimization) before it reaches actual execution. During this process a query plan is generated, and SQL Server tries to generate the most optimal plan for the fastest response time. Fastest response time doesn't necessarily mean minimizing the amount of I/O that is used, nor does it necessarily mean using the least amount of CPU; it is a balance of the various resources.

The query plan generated can be resource intensive depending on the type of joins chosen. The Hash operator and Sort operator scan through their respective input data. With read-ahead being used during such a scan, the pages are almost always available in the buffer cache before the operator needs them. Thus, waits for physical I/O are minimized or eliminated. When these types of operations are no longer constrained by physical I/O, they tend to manifest themselves as high CPU consumption. Nested loops can be disk I/O bound, because the join has to scan through the inner table looking for matches for each row of the outer table. Nested loop joins perform many index lookups and can quickly become I/O bound if the lookups traverse so many different parts of the table that the pages can't fit into the buffer cache.

The quality of a query plan depends heavily on cardinality estimates. If statistics are not up to date, the cardinality estimates can be inaccurate, the query optimizer works from wrong input, and the result is a non-optimal plan and slow query execution. As a part of diagnosis, we can check the showplan output (for example, from SET SHOWPLAN_ALL) for the EstimateRows and EstimateExecutions attributes.
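One way to put the estimated figures next to the actual ones is SET STATISTICS PROFILE, which executes the statement and returns the Rows and Executes columns alongside EstimateRows and EstimateExecutions. The sketch below is illustrative only: the sample query runs against catalog views purely so that the example is self-contained, and should be replaced with the slow query under investigation.

```sql
-- Return actual (Rows, Executes) next to estimated (EstimateRows,
-- EstimateExecutions) values for each plan operator.
-- Note: unlike SET SHOWPLAN_ALL, this really executes the statement.
SET STATISTICS PROFILE ON;
GO

-- Placeholder query; substitute the statement being diagnosed.
SELECT o.name, COUNT(*) AS column_count
FROM sys.objects AS o
JOIN sys.columns AS c ON c.object_id = o.object_id
GROUP BY o.name;
GO

SET STATISTICS PROFILE OFF;
GO
```

A large gap between Rows and EstimateRows on an operator is a hint that the cardinality estimate, and possibly the statistics behind it, deserve a closer look.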
3) Tempdb contention.

Tempdb plays an important role in SQL Server. It holds temporary tables, user objects and internal SQL Server tables, and sort operations are performed in it as well. Applications can run large DDL and DML operations against the tempdb database. Tempdb normally faces contention because of heavy SQL Server and application DML / DDL activity and because of disk space issues. Typically it is user activity that causes tempdb space issues, resulting in SQL Server slowness and application task failures. The table below gives detailed information on space utilization in the tempdb database.

User objects
These are explicitly created by user sessions and are tracked in the system catalog. They include the following:
a) Table and index.
b) Global temporary table (##t1) and index.
c) Local temporary table (#t1) and index. Session scoped, or scoped to the stored procedure in which it was created.

Internal objects
These are statement-scoped objects that are created and destroyed by SQL Server to process queries. They are not tracked in the system catalog. They include the following:
a) Work file (hash join).
b) Sort run.
c) Work table (cursor, spool and temporary large object data type (LOB) storage).
As an optimization, when a work table is dropped, one IAM page and an extent are saved to be used with a new work table. There are two exceptions: the temporary LOB storage is batch scoped and the cursor work table is session scoped.

Version store
This is used for storing row versions. MARS, online index operations, triggers and snapshot-based isolation levels are based on row versioning. This is new in SQL Server 2005.

Free space
This represents the disk space that is available in tempdb.
It is important to monitor tempdb activity constantly and to maintain sufficient space on the disk that holds the tempdb database files. For better performance and I/O, it is recommended to place tempdb on a dedicated disk drive. Keeping tempdb on a disk that also holds other database files could result in user database failures as well.
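The four categories of tempdb space described above map directly onto columns of sys.dm_db_file_space_usage. The following is a minimal monitoring sketch; the conversion assumes the standard 8 KB page size, and the output column names are chosen for this example.

```sql
-- Break tempdb space down into user objects, internal objects,
-- version store and free space (pages are 8 KB each).
USE tempdb;

SELECT SUM(user_object_reserved_page_count)     * 8 AS user_objects_kb,
       SUM(internal_object_reserved_page_count) * 8 AS internal_objects_kb,
       SUM(version_store_reserved_page_count)   * 8 AS version_store_kb,
       SUM(unallocated_extent_page_count)       * 8 AS free_space_kb
FROM sys.dm_db_file_space_usage;
```

Sampling this at intervals shows which category is growing, which in turn suggests whether user activity, query processing (sorts, hash joins, spools) or row versioning is responsible.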
4) Blocking / Deadlocks.

Blocking is a common cause of slow SQL Server response. It is normally the result of improper application or database design, ad-hoc queries running against SQL Server, or long-running queries. Blocking happens when one connection from an application holds a lock and a second connection requires a conflicting lock type. This forces the second connection to wait, blocked on the first.
Most blocking problems happen because a single process holds locks for an extended period of time, causing a chain of blocked processes, all waiting on other processes for locks. Common blocking scenarios include:

a) Submitting queries with long execution times. A long-running query can block other queries. For example, a DELETE or UPDATE operation that affects many rows can acquire many locks that, whether or not they escalate to a table lock, block other queries. For this reason, you generally do not want to intermix long-running decision support queries and online transaction processing (OLTP) queries on the same database. The solution is to look for ways to optimize the query, by changing indexes, breaking a large, complex query into simpler queries, or running the query during off hours or on a separate computer. One reason queries can be long-running, and hence cause blocking, is inappropriate use of cursors. Cursors can be a convenient method for navigating through a result set, but using them may be slower than set-oriented queries.

b) Cancelling queries that were not committed or rolled back. This can happen if the application cancels a query, for example by using the Open Database Connectivity (ODBC) SQLCancel function, without also issuing the required number of ROLLBACK and COMMIT statements. Cancelling the query does not automatically roll back or commit the transaction. All locks acquired within the transaction are retained after the query is cancelled. Applications must properly manage transaction nesting levels by committing or rolling back cancelled transactions.

c) Applications that are not processing all results to completion. After sending a query to the server, all applications must immediately fetch all result rows to completion. If an application does not fetch all result rows, locks may be left on the tables, blocking other users. If you are using an application that transparently submits Transact-SQL statements to the server, the application must fetch all result rows. If it does not (and if it cannot be configured to do so), you may be unable to resolve the blocking problem. To avoid the problem, you can restrict these applications to a reporting or decision-support database.

SQL Server serves the application based on the requests received. The client application has almost total control over (and responsibility for) the locks acquired on the server. Although the SQL Server lock manager automatically uses locks to protect transactions, this is directly instigated by the query type sent from the client application and the way the results are processed. Therefore, resolution of most blocking problems involves inspecting the client application. A blocking problem frequently requires inspection of both the exact SQL statements submitted by the application and the exact behaviour of the application regarding connection management, processing of all result rows, and so on.
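To see who is blocking whom at a given moment, the waiting-task and request DMVs described earlier can be combined. This is a minimal sketch rather than a complete blocking analysis: it shows only sessions that are blocked right now, and the statement text returned is that of the blocked session, not of the blocker.

```sql
-- List sessions that are currently blocked, with the session blocking them,
-- the wait type, how long they have waited, and what they are running.
SELECT wt.session_id            AS blocked_session_id,
       wt.blocking_session_id,
       wt.wait_type,
       wt.wait_duration_ms,
       wt.resource_description,
       st.[text]                AS blocked_sql_text
FROM sys.dm_os_waiting_tasks AS wt
JOIN sys.dm_exec_requests    AS r  ON r.session_id = wt.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
WHERE wt.blocking_session_id IS NOT NULL
  AND wt.blocking_session_id <> wt.session_id    -- ignore self-waits
ORDER BY wt.wait_duration_ms DESC;
```

A blocking_session_id that appears in many rows but is never itself blocked is the head of the blocking chain, and is the connection whose application behaviour should be inspected first.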
QUERYING DMVS AND DMFS
The previous topic discussed the causes of SQL Server slowness. SQL Server 2005 and 2008 provide many DMVs and DMFs for diagnosing performance problems and identifying resource utilization and contention, which makes them an important tool for troubleshooting. DMVs help in collecting system state and in-depth information, reducing to a certain extent the need to collect dumps. For clarity, the queries below are grouped by the causes mentioned above, each collecting information on a specific component.

1) System resource

a) CPU

This section has queries for collecting diagnostic information on CPU bottlenecks.

1) One of the causes of high CPU is the cost of compilation / optimization. The DMV sys.dm_exec_query_optimizer_info gives a good idea of the time spent optimizing. Taking two snapshots of this DMV gives a good feel for the time spent optimizing during a given period.

    Select * from sys.dm_exec_query_optimizer_info

Or

    Select * from sys.dm_exec_query_optimizer_info where counter = 'elapsed time'

In the above query, look at the elapsed time, which is the time spent on optimizations. Because the optimization process is very CPU bound, this gives a good measure of how much compile and recompile time is contributing to the high CPU problem.
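The two-snapshot approach mentioned above can be sketched as follows. The temporary table name #optimizer_before and the 30-second interval are arbitrary choices for this example. The sketch also assumes that, for the 'elapsed time' counter, occurrence is the number of optimizations and value is the average elapsed time per optimization in seconds, so that occurrence * value approximates the cumulative total.

```sql
-- Snapshot 1: remember the counter as it stands now.
SELECT counter, occurrence, value
INTO #optimizer_before                     -- hypothetical temp table name
FROM sys.dm_exec_query_optimizer_info
WHERE counter = 'elapsed time';

-- Let the workload run for the sampling interval (arbitrary: 30 seconds).
WAITFOR DELAY '00:00:30';

-- Snapshot 2: subtract the earlier values to get the activity in the interval.
SELECT a.counter,
       a.occurrence - b.occurrence                        AS optimizations_in_interval,
       a.occurrence * a.value - b.occurrence * b.value    AS approx_elapsed_seconds_in_interval
FROM sys.dm_exec_query_optimizer_info AS a
JOIN #optimizer_before AS b ON b.counter = a.counter;

DROP TABLE #optimizer_before;
```

If the elapsed seconds spent optimizing are a large fraction of the sampling interval, compilation and recompilation are likely contributors to the CPU load.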
2) The query below retrieves cached plans that perform Sort and Hash Match operations. Sorts and hash joins consume a lot of CPU. Note: the query itself will appear in the output.

    select *
    from sys.dm_exec_cached_plans
    cross apply sys.dm_exec_query_plan(plan_handle)
    where cast(query_plan as nvarchar(max)) like '%Sort%'
       or cast(query_plan as nvarchar(max)) like '%Hash Match%'
3) You can also monitor the SQL Server schedulers using the sys.dm_os_schedulers view to see whether the number of runnable tasks is typically nonzero. A nonzero value indicates that tasks have to wait for their time slice to run, so high values for this counter may indicate a CPU bottleneck.

    select scheduler_id, current_tasks_count, runnable_tasks_count
    from sys.dm_os_schedulers
    where scheduler_id < 255
4) The following query gives a high-level view of which currently cached batches or procedures are using the most CPU. The query aggregates the CPU consumed by all statements with the same plan_handle (meaning that they are part of the same batch or procedure). If a given plan_handle has more than one statement, you may have to drill in further to find the specific query that is the largest contributor to the overall CPU usage.

    select top 50
        sum(qs.total_worker_time) as total_cpu_time,
        sum(qs.execution_count) as total_execution_count,
        count(*) as number_of_statements,
        qs.plan_handle
    from sys.dm_exec_query_stats qs
    group by qs.plan_handle
    order by sum(qs.total_worker_time) desc

5) You can also use the query below for more detailed information on the queries taking high CPU and the associated database and object.

    select highest_cpu_queries.plan_handle,
        highest_cpu_queries.total_worker_time,
        q.dbid, q.objectid, q.number, q.encrypted, q.[text]
    from (select top 50 qs.plan_handle, qs.total_worker_time
          from sys.dm_exec_query_stats qs
          order by qs.total_worker_time desc) as highest_cpu_queries
    cross apply sys.dm_exec_sql_text(plan_handle) as q
    order by highest_cpu_queries.total_worker_time desc
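When a plan_handle covers several statements, one possible way to drill down to the individual statement is to cut the statement text out of the batch using the offsets stored in sys.dm_exec_query_stats. The sketch below assumes a top 20 cut-off, which is an arbitrary choice for illustration.

```sql
-- Statement-level CPU: statement_start_offset and statement_end_offset are
-- byte offsets into the Unicode batch text, hence the division by 2.
-- A statement_end_offset of -1 means "to the end of the batch".
select top 20
    qs.total_worker_time,
    qs.execution_count,
    qs.plan_handle,
    substring(st.text,
              (qs.statement_start_offset / 2) + 1,
              ((case qs.statement_end_offset
                    when -1 then datalength(st.text)
                    else qs.statement_end_offset
                end - qs.statement_start_offset) / 2) + 1) as statement_text
from sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) as st
order by qs.total_worker_time desc;
```

Each output row is a single statement, so the largest CPU contributor inside a batch or procedure can be read directly from the top of the list.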
6) The query below gives the top 15 stored procedures that were recompiled (plan_generation_num value > 1).

    select top 15
        sql_text.text, sql_handle, plan_generation_num,
        execution_count, dbid, objectid
    from sys.dm_exec_query_stats a
    cross apply sys.dm_exec_sql_text(sql_handle) as sql_text
    where plan_generation_num > 1
    order by plan_generation_num desc
Note: The output data of the DMV sys.dm_exec_query_stats is tied to the compiled plan, so its lifetime depends on how long the plan remains in cache. The view only contains plans for cached DML statements (SELECT, INSERT, UPDATE, DELETE). Non-DML operations (e.g., CREATE INDEX, BACKUP, TRUNCATE, etc.) are never found in this view, nor are direct calls to CLR procedures/triggers. Thus sys.dm_exec_query_stats may not account for all the resources used, and you may need detailed tracing to determine the resources consumed by a given operation.
b) Memory

This section has queries for collecting diagnostic information for a memory bottleneck.

1) Memory consumption by internal SQL Server components can be assessed using the sys.dm_os_memory_clerks DMV. A quick way to check the amount of memory consumed through the multi-page allocators is:

    select sum(multi_pages_kb) from sys.dm_os_memory_clerks

Or

    select type, sum(multi_pages_kb)
    from sys.dm_os_memory_clerks
    where multi_pages_kb != 0
    group by type

A quick way to check the amount of memory consumed through the single-page allocators is:

    select sum(single_pages_kb) from sys.dm_os_memory_clerks

Or

    select type, sum(single_pages_kb)
    from sys.dm_os_memory_clerks
    where single_pages_kb != 0
    group by type
2) AWE allocated memory in SQL Server.

    select sum(awe_allocated_kb) / 1024 as [AWE allocated, Mb]
    from sys.dm_os_memory_clerks

3) The query below gives the system memory information, SQL Server memory and CPU details.

    select physical_memory_in_bytes / 1024 / 1024 as physical_memory_mb,
        virtual_memory_in_bytes / 1024 / 1024 as virtual_memory_mb,
        bpool_committed * 8 / 1024 as bpool_committed_mb,
        bpool_commit_target * 8 / 1024 as bpool_target_mb,
        bpool_visible * 8 / 1024 as bpool_visible_mb,
        cpu_count, hyperthread_ratio, scheduler_count
    from sys.dm_os_sys_info
4) Query retrieving information on memory consumption by extended stored procedures.

    select type,
        sum(single_pages_kb + multi_pages_kb + virtual_memory_committed_kb
            + awe_allocated_kb + shared_memory_committed_kb) KB
    from sys.dm_os_memory_clerks
    where type like '%SQLXP%'
    group by type
5) When SQL Server detects memory pressure on the box, it tries to decommit its own memory and to shrink its caches. Cache memory pressure can come from internal or external factors. It causes cached data to be flushed out faster, and this can be identified from the speed of the clock algorithm. Information about clock hand movements is exposed through the sys.dm_os_memory_cache_clock_hands DMV. Each cache store has a separate row for its internal and its external clock hand. If you see rounds_count and removed_all_rounds_count increasing, the server is under internal or external memory pressure, depending on which hand is moving.

    select *
    from sys.dm_os_memory_cache_clock_hands
    where rounds_count > 0
      and removed_all_rounds_count > 0
6) The above query can be extended for more detailed information by including the sys.dm_os_memory_cache_counters DMV.

    select distinct cc.cache_address, cc.name, cc.type,
        cc.single_pages_kb + cc.multi_pages_kb as total_kb,
        cc.single_pages_in_use_kb + cc.multi_pages_in_use_kb as total_in_use_kb,
        cc.entries_count, cc.entries_in_use_count,
        ch.removed_all_rounds_count, ch.removed_last_round_count
    from sys.dm_os_memory_cache_counters cc
    join sys.dm_os_memory_cache_clock_hands ch
        on (cc.cache_address = ch.cache_address)
    where ch.rounds_count > 0
      and ch.removed_all_rounds_count > 0
    order by total_kb desc
7) Using ring buffers to diagnose memory pressure. A significant amount of diagnostic information can be obtained from the ring buffers DMV, sys.dm_os_ring_buffers. Each ring buffer keeps a record of the most recent notifications of a certain kind. Some of the ring buffer types are described below.

a) You can use information from resource monitor notifications to identify memory state changes. Internally, SQL Server has a framework that monitors different memory pressures. When the memory state changes, the resource monitor task generates a notification. This notification is used internally by the components to adjust their memory usage according to the memory state, and it is exposed to the user through the sys.dm_os_ring_buffers DMV.

    select record, *
    from sys.dm_os_ring_buffers
    where ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR'

b) This ring buffer contains records indicating server out-of-memory conditions. The record tells which operation failed (commit, reserve or page allocation) and the amount of memory requested.

    select record, *
    from sys.dm_os_ring_buffers
    where ring_buffer_type = 'RING_BUFFER_OOM'

c) This ring buffer contains records indicating severe buffer pool failures, including buffer pool out-of-memory conditions. The record tells which failure occurred (FAIL_OOM, FAIL_MAP, FAIL_RESERVE_ADJUST, FAIL_LAZYWRITER_NO_BUFFERS) and the buffer pool status at the time.

    select record
    from sys.dm_os_ring_buffers
    where ring_buffer_type = 'RING_BUFFER_BUFFER_POOL'
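The record column of sys.dm_os_ring_buffers holds XML stored as text, so it can be cast to the xml type and shredded instead of being read by eye. The sketch below does this for the resource monitor ring buffer described in a). The element and attribute paths used here (Record/@id, Record/@time, Record/ResourceMonitor/Notification) are assumptions based on the commonly observed record layout; the DMV is largely undocumented, so check the paths against an actual record on your build before relying on them.

```sql
-- Shred the resource monitor records into columns.
-- The XML paths below are assumed from the typical record layout and may
-- differ between builds; inspect the raw record column to confirm them.
select
    x.rec.value('(/Record/@id)[1]', 'int') as record_id,
    x.rec.value('(/Record/@time)[1]', 'bigint') as record_time_ms,
    x.rec.value('(/Record/ResourceMonitor/Notification)[1]', 'varchar(64)') as notification
from (
    select cast(record as xml) as rec
    from sys.dm_os_ring_buffers
    where ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR'
) as x
order by record_time_ms desc;
```

The notification column then shows the sequence of memory state changes, most recent first.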
c) I/O (Disk Subsystem)

For diagnosing an IO bottleneck, Perfmon counters are very commonly used; however, there are also DMVs that give better information on wait statistics and latch waits. These latch waits account for the physical I/O waits when a page is accessed for reading or writing and it is not available in the buffer pool. When the page is not found in the buffer pool, an asynchronous I/O is posted and then the status of the IO is checked. If the IO has already completed, the worker proceeds normally; otherwise it waits on PAGEIOLATCH_EX or PAGEIOLATCH_SH, depending upon the type of request.
1) The following DMV queries can be used to find IO latch wait stats:

    select * from sys.dm_os_wait_stats

    select wait_type, waiting_tasks_count, wait_time_ms
    from sys.dm_os_wait_stats
    where wait_type like 'PAGEIOLATCH%'
    order by wait_type
Sample output as below.

    wait_type        waiting_tasks_count  wait_time_ms
    ---------------  -------------------  ------------
    PAGEIOLATCH_DT                     0             0
    PAGEIOLATCH_EX                    19           750
    PAGEIOLATCH_KP                     0             0
    PAGEIOLATCH_NL                     0             0
    PAGEIOLATCH_SH                   230         10281
    PAGEIOLATCH_UP                    24          1046
You can identify an IO problem if your waiting_tasks_count and wait_time_ms deviate significantly from what you normally see. For this, it is important to get a baseline of performance counters and key DMV query outputs when your SQL Server is running smoothly. These wait types give a good indication of whether your IO subsystem is experiencing a bottleneck, but they do not give any visibility into which physical disk(s) are experiencing the problem.
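One possible way to keep such a baseline is to store the PAGEIOLATCH counters in a table while the server is healthy and to compare the live values against the most recent capture later. The table name dbo.io_latch_wait_baseline and its layout are hypothetical choices for this sketch.

```sql
-- One-time setup: hypothetical baseline table (name and schema are assumptions).
create table dbo.io_latch_wait_baseline (
    capture_time        datetime     not null,
    wait_type           nvarchar(60) not null,
    waiting_tasks_count bigint       not null,
    wait_time_ms        bigint       not null
);
go

-- Capture step: run while the server is behaving normally.
-- A variable is used so that every row of one capture has the same timestamp.
declare @now datetime;
set @now = getdate();

insert into dbo.io_latch_wait_baseline
    (capture_time, wait_type, waiting_tasks_count, wait_time_ms)
select @now, wait_type, waiting_tasks_count, wait_time_ms
from sys.dm_os_wait_stats
where wait_type like 'PAGEIOLATCH%';
go

-- Comparison step: growth of each counter since the latest capture.
select w.wait_type,
       w.waiting_tasks_count - b.waiting_tasks_count as waiting_tasks_delta,
       w.wait_time_ms - b.wait_time_ms as wait_time_ms_delta
from sys.dm_os_wait_stats as w
join dbo.io_latch_wait_baseline as b
    on b.wait_type = w.wait_type
where b.capture_time = (select max(capture_time) from dbo.io_latch_wait_baseline)
order by w.wait_type;
```

Because the counters in sys.dm_os_wait_stats are cumulative since the last restart or reset, the deltas are only meaningful if the server has not been restarted and the statistics have not been cleared between the capture and the comparison.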
2) You can use the following DMV query to find currently pending IO requests. You can execute this query periodically to check the health of the IO subsystem and to isolate the physical disk(s) involved in the IO bottlenecks.

    select database_id, file_id, io_stall, io_pending_ms_ticks
    from sys.dm_io_virtual_file_stats(NULL, NULL) t1,
         sys.dm_io_pending_io_requests as t2
    where t1.file_handle = t2.io_handle

    Database_id  File_Id  io_stall  io_pending_ms_ticks
    -----------  -------  --------  -------------------
              9        2     10322                   51
              9        2     10322                   51
              7        1    104511                   48
The sample output above shows three pending IOs at this moment. You can use the database_id and file_id to find the physical disk the files are mapped to. The io_pending_ms_ticks column represents the total time each individual IO has been waiting in the pending queue.
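Mapping a database_id and file_id to a physical file can be done through the sys.master_files catalog view. The sketch below uses the values 9 and 2 from the first row of the sample output purely as an illustration; substitute the values returned on your own server.

```sql
-- Resolve a (database_id, file_id) pair to its physical file path.
-- 9 and 2 are example values taken from the sample output, not fixed IDs.
select db_name(mf.database_id) as database_name,
       mf.file_id,
       mf.name as logical_file_name,
       mf.physical_name
from sys.master_files as mf
where mf.database_id = 9
  and mf.file_id = 2;
```

The drive letter or mount point at the start of physical_name identifies the volume to investigate further with Perfmon disk counters.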
3) The following DMV can be used to find which batches/queries are generating the most IOs:

    select * from sys.dm_exec_query_stats

The query below retrieves the average logical reads and writes per execution. Logical reads are page accesses served through the buffer cache.

    select top 5
        (total_logical_reads / execution_count) as avg_logical_reads,
        (total_logical_writes / execution_count) as avg_logical_writes,
        execution_count
    from sys.dm_exec_query_stats
    order by (total_logical_reads + total_logical_writes) / execution_count desc
4) The query below retrieves physical reads, which is the actual IO done against the disk subsystem for records read from disk rather than from the memory cache.

    select top 5
        (total_physical_reads / execution_count) as avg_physical_reads,
        execution_count
    from sys.dm_exec_query_stats
    order by (total_physical_reads / execution_count) desc
5) We can also include the plan_handle column in the queries above and use it to retrieve the query plan of the statements with high read/write values for further query tuning. Below is the query for the same.

    select dbid, objectid, query_plan
    from sys.dm_exec_query_plan(0x05000400A3B73E12B8C10705000000000000000000000000)

Output as below:

    dbid  objectid  query_plan
    ----  --------  ----------
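Instead of copying a plan_handle value by hand, the plan can be fetched in the same query with CROSS APPLY. The sketch below is one way to do this for the statements with the highest average physical reads; the top 5 cut-off is an arbitrary choice for illustration.

```sql
-- Top statements by average physical reads, together with their XML plans.
select top 5
    qs.plan_handle,
    qs.total_physical_reads / qs.execution_count as avg_physical_reads,
    qs.execution_count,
    qp.dbid,
    qp.objectid,
    qp.query_plan
from sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_query_plan(qs.plan_handle) as qp
order by qs.total_physical_reads / qs.execution_count desc;
```

In SQL Server Management Studio the query_plan column can be opened as XML and saved with a .sqlplan extension to view the graphical plan.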
2) Blocking.
Blocking can be broadly categorized into waits for logical locks, such as waiting to acquire an X lock on a resource, and waits that result from lower-level synchronization primitives such as latches. Blocking can result in slow query execution and system pressure. Finding the wait type of a blocked process helps in analysing the cause and troubleshooting the blocking. SQL Server 2005 provides more detailed and consistent wait information, reporting approximately 125 wait types compared with SQL Server 2000's 76 wait types.

1) The DMVs that provide this information range from sys.dm_os_wait_stats, for overall and cumulative waits in SQL Server, to the session-specific sys.dm_os_waiting_tasks, which breaks down waits by session. The latter provides details on the wait queue of tasks that are waiting on some resource.

    select * from sys.dm_os_wait_stats

    select * from sys.dm_os_waiting_tasks
2) We can identify the blocked session and the blocking session ID from the above query, and provide the session ID in the query below to get information about the blocked process.

    select session_id, wait_type, wait_duration_ms,
        blocking_session_id, resource_description
    from sys.dm_os_waiting_tasks
    where session_id = 53

The above query outputs the session ID (53) being blocked, the blocking session ID (blocking_session_id) and the lock wait time in milliseconds.
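To see what the blocked sessions were actually running, without hard-coding a session ID, sys.dm_os_waiting_tasks can be joined to sys.dm_exec_requests and sys.dm_exec_sql_text. This is a minimal sketch of that idea; the column aliases are chosen for this example only.

```sql
-- All currently blocked tasks, with the batch text of the blocked request.
-- A parallel query may appear more than once (one row per waiting task).
select wt.session_id as blocked_session_id,
       wt.blocking_session_id,
       wt.wait_type,
       wt.wait_duration_ms,
       wt.resource_description,
       st.text as blocked_batch_text
from sys.dm_os_waiting_tasks as wt
join sys.dm_exec_requests as r
    on r.session_id = wt.session_id
cross apply sys.dm_exec_sql_text(r.sql_handle) as st
where wt.blocking_session_id is not null
order by wt.wait_duration_ms desc;
```

The blocking session may be idle with an open transaction, in which case it has no row in sys.dm_exec_requests; that is why the sketch reports the text of the blocked request rather than the blocker.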
3) Once we have the above output, we can diagnose the blocking further, getting more information about granted locks and lock waits from the sys.dm_tran_locks DMV. This DMV in conjunction with sys.partitions gives us information such as the lock mode and the request status.

    select request_session_id, resource_type, resource_database_id,
        (case resource_type
            when 'OBJECT' then object_name(resource_associated_entity_id)
            when 'DATABASE' then ' '
            else (select object_name(object_id)
                  from sys.partitions
                  where hobt_id = resource_associated_entity_id)
         end) as object_name,
        resource_description, request_mode, request_status
    from sys.dm_tran_locks
The above query gives output similar to sysprocesses in SQL Server 2000; however, more detailed information can be retrieved using these DMVs.
4) This query lists the top 10 waits in SQL Server. These waits are cumulative, but you can reset them using DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR).

    select top 10 *
    from sys.dm_os_wait_stats
    order by wait_time_ms desc
3) Monitoring Worker Threads and Schedulers

1) The dynamic management view (DMV) sys.dm_os_threads shows the kernel and user mode time of each thread. When combined with sys.dm_os_workers and sys.dm_os_schedulers, it is possible to see details that pertain to system and scheduler utilization on an active server.

    SELECT kernel_time + usermode_time as TotalTime, *
    FROM sys.dm_os_threads
    ORDER BY kernel_time + usermode_time desc

2) The following query can be run against a live SQL Server 2005 installation to see the milliseconds that have elapsed since the scheduler last checked the timer list.

    SELECT yield_count, last_timer_activity,
        (SELECT ms_ticks from sys.dm_os_sys_info) - last_timer_activity AS MSSinceYield, *
    FROM sys.dm_os_schedulers
    WHERE is_online = 1 and is_idle <> 1 and scheduler_id < 255

The above query is normally used when troubleshooting a hung scheduler or 17883 / 17884 errors.
DMVS AND DMFS INTRODUCED IN SQL SERVER 2008

    Srno  DMV / DMF
    ----  -----------------------------------------------
       1  sys.dm_cdc_errors
       2  sys.dm_cdc_log_scan_sessions
       3  sys.dm_cryptographic_provider_properties
       4  sys.dm_database_encryption_keys
       5  sys.dm_db_mirroring_auto_page_repair
       6  sys.dm_db_mirroring_past_actions
       7  sys.dm_exec_query_memory_grants
       8  sys.dm_exec_query_resource_semaphores
       9  sys.dm_exec_query_transformation_stats
      10  sys.dm_filestream_oob_handles
      11  sys.dm_filestream_oob_requests
      12  sys.dm_os_memory_brokers
      13  sys.dm_os_memory_nodes
      14  sys.dm_os_nodes
      15  sys.dm_os_process_memory
      16  sys.dm_os_spinlock_stats
      17  sys.dm_os_sys_memory
      18  sys.dm_resource_governor_configuration
      19  sys.dm_resource_governor_resource_pools
      20  sys.dm_resource_governor_workload_groups
      21  sys.dm_tran_commit_table
      22  sys.dm_xe_map_values
      23  sys.dm_xe_object_columns
      24  sys.dm_xe_objects
      25  sys.dm_xe_packages
      26  sys.dm_xe_session_event_actions
      27  sys.dm_xe_session_events
      28  sys.dm_xe_session_object_columns
      29  sys.dm_xe_session_targets
      30  sys.dm_xe_sessions
Summary:

From the above information and queries, DMVs clearly prove to be an important tool for diagnosing and troubleshooting system and performance problems. These DMV queries can be scheduled as a SQL Server job that runs at fixed intervals, or executed during a performance problem, with the output captured in text format. This document puts more emphasis on SQL Server 2005; however, I may include more on SQL Server 2008 in later versions.