SQL Server Enterprise Architect Summit: June 09-12, 2009 Shanghai, China


SQL Server

Enterprise Architect Summit

June 09-12, 2009


Shanghai, China
Query Optimization and Query Tuning in SQL Server 2008

Howard Yin
Principal Program Manager
SQL Server Customer Advisory Team
Microsoft Corp.
Agenda
• SQLCAT Introduction
• Query Optimizer Architecture
• Query Plan Analysis
• Influencing Plan Selection
• Special Considerations
– Parameterized Queries
– Temporary Tables and Table Variables
– Parallel Queries
• References

| SQL Server Customer Advisory Team Page 3


Agenda
• SQLCAT Introduction
• Query Optimizer Architecture
• Statistics Used by Optimizer
• Query Plan Analysis
• Influencing Plan Selection
• Parameterized Queries
• References



SQL Server Relational Server
High-Level Architecture

[Diagram: components of the SQL Relational Server]
• Language Processing (Parse/Bind/View Expansion)
• Query Optimization (Plan Generation, Index Selection, Join Ordering, Statistics, Costing)
• Statement/Batch Execution
• Query Execution (Query Operators, Memory Grants, Parallelism)
• Plan Cache Management
• Metadata, Type system, Expression Services
• Utilities (DBCC, Backup / Restore, BCP, …)
• Storage Engine (Access Methods, Database Page Cache, Locking, Transactions, …)
• SQL-OS (Schedulers, Buffer Pool, Memory Management, Synchronization Primitives, …)

Microsoft Confidential
| SQL Server Customer Advisory Team 10
SQL Server Relational Server
This Talk – Query Optimization

[Diagram: the same high-level architecture, with the Query Optimization component (Plan Generation, Index Selection, Join Ordering, Statistics, Costing) highlighted]

Query Optimization
• Query optimization in SQL Server is cost based
• The input to query optimization is an operator tree produced by
– Parsing the SQL statement (syntax)
– Semantic analysis and checking (aka algebrizing: this includes type derivation, type checking, binding of variables, table and column names, etc.)
• The output is a query plan


Query Optimization
[Diagram: Select * From … → parse/bind → Query optimizer (generate alternatives, estimate cost) → Output: a plan tree of operators such as Hash join, Join, Concat, Union]

Input to the query optimizer
• Metadata
– Indices
– Statistics
– Constraints
– User
• Physical Properties
– Memory
– Number of cores


Relational Expressions
• SQL is the standard front-end language
• Query processing is based on relational operator trees
• Query block

SELECT SUM(T.a)
FROM T, R
WHERE T.b = R.b
AND R.c = 5
GROUP BY T.c

Two equivalent operator trees for this query block (root first):
• GroupBy T.c, sum(T.a) → Filter (T.b=R.b and R.c = 5) → Cross product of T and R
• GroupBy T.c, sum(T.a) → Join (T.b=R.b) of T with Filter (R.c = 5) over R


Tree Transformations
• Simplification: Filter (A.x = 5) over Join (A, B) becomes Join (Filter (A.x = 5) over A, B)
• Exploration: GroupBy A.x, sum(A.y) over Join (A, B) becomes Join (GroupBy A.x, sum(A.y) over A, B)
• Implementation: Join (A, B) becomes Hash-Join (A, B)


Sets Of Transformations
• Join reordering
• Outerjoins
• Subqueries
• Aggregation
• Union
• Index plans
• Large IN lists
• Empty table simplification
• Bitmap filters
• Star and Snowflake queries
• Join elimination
• Indexed views
• Update plans
• Constraint checking
• Parallel plans
• Remote queries
• Partitioned tables
• …


Costing
• Based on
– Estimated intermediate and final result size
– Detailed model of execution algorithm (I/O, number of CPUs, memory)
• Estimation of result size
– Histograms on distribution of values for base table columns (table statistics)
– Derived on each operation, e.g. filter, join


Multi-Stage Optimization
• Stages
– No-choice queries (trivial plan)
– Transaction processing queries (Phase 0)
– Complex query I (Phase 1)
– Complex query II (Phase 2)
• Parameters of a stage
– Entry condition
– Transformations allowed
– Termination condition



SQL Server Cost-based Query Optimization Process
[Diagram: T0 (input) → simplify → T1 → cost-based optimization over a pool of alternatives in stages search(0), search(1), search(2) → T2 (output)]
• Simplify: use simplification transformations
• Cost-based optimization: use exploration and implementation transformations, cost alternatives


Optimization Phases
• Goal: don’t overspend on optimization
– E.g. don’t spend an additional 1 second of optimization time to shave off 0.5 second of run time
• Phase 1 is not entered if Phase 0 finds a plan with cost < 0.2
• Phase 2 is not entered if Phase 1 finds a plan with cost < 1.0
• Each additional phase has richer transformations and considers additional plan types
– In Phase 0 only NL and Hash joins
– In Phase 1 initially only non-parallel plans are considered; if the best plan found so far has cost >= 5, Phase 1 is repeated to find a parallel plan, and the “cheaper” of the parallel and non-parallel versions advances to Phase 2
– Phase 2 performs, for example, Outer Join reordering, Indexed View matching and Star Query optimization
• Optimization may end in a “time-out”, in which case the best plan found so far (if any!) is sent for execution


SQL Server - Optimizer Extensibility
Goal: Support high paced innovation (typical for MS products)
• Easy way to add new operators
– Logical operators, or execution algorithms
– Cardinality and cost estimation functions
• Adding new tree transformations
– Handle new operators
– Add new reordering alternatives
• Apply cost-based optimization uniformly on all operations, beyond joins/SQL query block

Tools to monitor optimization
• SQL Server 2008 has tools to see “inside” the optimization
• Showplan contains, for example (will be covered in depth next)
– Operator cost and subtree cost
– Cardinalities of all operators (estimated number of rows returned by the operator)
– Optimizer warnings (e.g. missing join predicate)
– Values of variables used for optimization
• SQL Trace
• DMV sys.dm_exec_query_optimizer_info
• Perfmon counters in SQL Server: SQL Statistics




What affects optimizer choices?

• Optimizer Statistics
• Indexes and constraints
• Memory
• Number of logical processors


Optimizer Statistics
• Used for costing
– Histogram
• Captures distribution of values for a column
• Max number of steps = 200
• Used for estimating cardinality of search predicates
• Single-column only
– Density
• Average number of occurrences of each unique value in a column, stored as 1 / (number of distinct values)
• Used for estimating cardinality of search predicates, joins, aggregations
– String index (statistics) – new in SQL Server 2005
• Viewing statistics
– DBCC SHOW_STATISTICS
• CREATE / UPDATE STATISTICS
– Can specify FULLSCAN or sample size or sample %
– Default sample size increases logarithmically with the number of rows
– sp_createstats, sp_updatestats
• By default SQL Server creates and maintains statistics automatically; the Best Practice is to use the default

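The statistics commands named above can be sketched as follows. This is only an illustration: the table dbo.Orders, its column CustomerID and the statistic name st_Orders_CustomerID are hypothetical and not part of the original deck.

```sql
-- Hypothetical objects: dbo.Orders, column CustomerID, statistic st_Orders_CustomerID.
-- Create a single-column statistic by reading every row.
CREATE STATISTICS st_Orders_CustomerID
ON dbo.Orders (CustomerID)
WITH FULLSCAN;

-- Refresh the same statistic from a 25 percent sample instead of a full scan.
UPDATE STATISTICS dbo.Orders st_Orders_CustomerID
WITH SAMPLE 25 PERCENT;

-- Show the header, density vector and histogram of the statistic.
DBCC SHOW_STATISTICS ('dbo.Orders', st_Orders_CustomerID);
```

The histogram section of the DBCC output is where the step count (at most 200) can be checked.
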
Auto Statistics
• Database options
– AUTO_CREATE_STATISTICS
• Stats on columns created when a query needs them
– AUTO_UPDATE_STATISTICS – when? After a threshold number of changes, relative to the rows in the table when the plan was compiled
• Permanent tables: 500 changes (table of up to 500 rows); 500 + 20% * rows (larger tables)
• Temp tables: 6 changes (table of fewer than 6 rows); 500; 500 + 20% * rows
• See http://www.microsoft.com/technet/prodtechnol/sql/2005/qrystats.mspx for more details
– Beware: AUTO UPDATE uses default sampling even if the statistic was created with FULLSCAN. Consequence: CREATE INDEX creates statistics with FULLSCAN, but a subsequent auto update for the same index will use sampling
– No auto-create or auto-update stats for table variables (and you cannot create statistics either)
• sp_autostats
– Display, change auto-stats setting (ON/OFF)
– Can disable auto-stats per index, table

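A minimal sketch of the options above, assuming a hypothetical database SalesDb, table dbo.Orders and index IX_Orders_CustomerID (none of these names come from the deck):

```sql
-- Database-level defaults (both are ON in a newly created database).
ALTER DATABASE SalesDb SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE SalesDb SET AUTO_UPDATE_STATISTICS ON;

-- Display the current auto-update setting for every statistic on the table.
EXEC sp_autostats 'dbo.Orders';

-- Turn automatic updates off for all statistics on the table ...
EXEC sp_autostats 'dbo.Orders', 'OFF';

-- ... and back on for one index only (hypothetical index name).
EXEC sp_autostats 'dbo.Orders', 'ON', 'IX_Orders_CustomerID';
```
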
Statistics Issues … and Best Practices
• Statistics are used to derive the Cardinality Estimate (CE) on each query operation
– Filter, Join, GroupBy, …
– Warning column in showplan if statistics are missing
• Limitations
– CE gets progressively less reliable across multiple joins and group-bys
– Hidden correlations are tough to handle
• Data-level correlation: two columns that are correlated
• Predicate-level correlation: multiple predicates that are not independent
– Data skew impacts estimates, especially with sampled statistics
• Hard to correctly guess density from a sample
• Recompilation if the number of changes to a column exceeds thresholds
• (Re)compilations requiring a statistic stall while auto update stats is executing
– Consider the AUTO_UPDATE_STATISTICS_ASYNC database option
• Best practices
– Reasons to turn off AUTO_UPDATE on a statistic
• If it is causing unnecessary recompilation: lots of updates, but they don’t change the nature/distribution of the data
• For increased predictability during the work day, if the update can be scheduled during off time
– Sometimes FULLSCAN stats are necessary: schedule periodic update stats

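The two remedies mentioned above might look like this in practice. The database name SalesDb and table dbo.Orders are hypothetical; the second statement is the kind of command one would place in a scheduled off-hours job.

```sql
-- Let queries compile with the existing statistics while the refresh runs
-- in the background (only has an effect when AUTO_UPDATE_STATISTICS is ON).
ALTER DATABASE SalesDb SET AUTO_UPDATE_STATISTICS_ASYNC ON;

-- Scheduled maintenance: rebuild all statistics on the table from a full scan.
-- NORECOMPUTE switches off automatic updates for these statistics, so the
-- FULLSCAN result is not later replaced by a sampled auto update.
UPDATE STATISTICS dbo.Orders WITH FULLSCAN, NORECOMPUTE;
```

With NORECOMPUTE the scheduled job becomes the only source of updates, which trades freshness for predictability.
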
Upgrade Considerations
• Upgrade from SS 2000
– All SS 2000 stats are automatically considered “stale” => autostats will take care of the “upgrade”
– Best Practice is to update statistics manually as a part of the upgrade step
– If stats are maintained manually, they should all be manually updated during upgrade
• sp_updatestats @resample = 'resample'
– Details of changes are described in http://www.microsoft.com/technet/prodtechnol/sql/2005/qrystats.mspx
• Upgrade from SS 2005
– No need to update statistics



Query Plan aka Showplan
• 3 forms: textual, XML, graphical
• By contents: operators only, with cost and cardinality estimates, and with runtime information

Contents | Text | XML | Graphical
Operators | SET SHOWPLAN_TEXT ON | N/A | N/A
Operators and estimated costs and card. | SET SHOWPLAN_ALL ON | SET SHOWPLAN_XML ON | Display Estimated Execution Plan in Management Studio
Run time info | SET STATISTICS PROFILE ON | SET STATISTICS XML ON | Include Actual Execution Plan in Management Studio

• SQL Trace has 8 different Showplan events in the Performance event category

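As a short illustration of two of the SET options in the table, the batches below request an estimated XML plan and then run-time information in text form. The query against dbo.Products is only a placeholder (the table name is borrowed from the hint example later in the deck), and GO is the batch separator used by Management Studio and sqlcmd.

```sql
-- Estimated plan as XML: the statement is compiled but not executed.
SET SHOWPLAN_XML ON;
GO
SELECT ProductName FROM dbo.Products WHERE UnitPrice > 100;
GO
SET SHOWPLAN_XML OFF;
GO

-- Run-time information as text: the statement executes, and the extra result
-- set carries actual (Rows, Executes) and estimated (EstimateRows) counts.
SET STATISTICS PROFILE ON;
GO
SELECT ProductName FROM dbo.Products WHERE UnitPrice > 100;
GO
SET STATISTICS PROFILE OFF;
GO
```

The SET SHOWPLAN_* statements have to be alone in their batch, which is why each one is followed by GO.
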
Demo – Showplan in SSMS
• Estimated Query Plan
– Properties of the Index Seek
– Estimated # of rows, # of executions
• Actual Query Plan
– Watch for “Actual” number of rows, estimated number of rows, number of executions
• SHOWPLAN_TEXT, ALL, XML
• Saving and sharing the query plans


Demo – Trivial Plan in SSMS
• Trivial plan
– By investigating the syntax and metadata
optimizer knows “there are no choices”
– Optimization Level = Trivial
• Non-trivial plan
– Optimization Level = FULL
– if early termination happens
• Reason For Early Termination has a value
– Good Enough Plan Found OR
– Timeout OR
– Memory Limit Exceeded
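
Outside the SSMS properties window, the same two properties can be read from cached XML plans. The sketch below assumes they surface as the StatementOptmLevel and StatementOptmEarlyAbortReason attributes of the StmtSimple element in the showplan schema; it reports only the first statement of each cached plan.

```sql
-- Read optimization level and early-termination reason from cached plans.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (20)
    st.text AS batch_text,
    qp.query_plan.value('(//StmtSimple/@StatementOptmLevel)[1]',
                        'varchar(30)') AS optimization_level,
    qp.query_plan.value('(//StmtSimple/@StatementOptmEarlyAbortReason)[1]',
                        'varchar(40)') AS early_abort_reason
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE qp.query_plan IS NOT NULL;
```

A NULL early_abort_reason means the attribute was absent, i.e. optimization was not terminated early.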



Using DMV
• sys.dm_exec_query_optimizer_info captures aggregate counters of various compile time events, for example
– How many compilations happened so far?
– How many finished in which phase of optimization (trivial plan, search 0, search 1, search 2)?
– How many Inserts, Deletes, Updates?
– Etc.
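
A minimal query against this DMV, restricted to the counters that answer the questions above. The values are cumulative since instance start, so in practice one would take two snapshots and compare them.

```sql
-- occurrence = number of times the event happened;
-- value      = average of the measured quantity per occurrence, where applicable.
SELECT counter, occurrence, value
FROM sys.dm_exec_query_optimizer_info
WHERE counter IN (N'optimizations', N'trivial plan',
                  N'search 0', N'search 1', N'search 2',
                  N'timeout',
                  N'insert stmt', N'delete stmt', N'update stmt');
```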


Demo - Optimizer Warnings
• SHOWPLAN shows warnings
• Missing Column Statistics
• Missing Join Predicate
• SQL Trace events
– Errors and Warnings Event Category
• Missing Column Statistics
• Missing Join Predicate



Demo – Database Tuning Advisor (DTA)
• Input: batch file, SQL trace file or table
• The GUI surfaces only a subset of the capabilities of the DTA engine
– The command line version uses an XML document conforming to DTASchema.xsd
– SQL Server schemas are published at http://schemas.microsoft.com/sqlserver/
• Recommended use for smaller sets of queries (as was the demo)


Missing Indexes
• In some cases the Optimizer will record information about a “missing index” in Showplan and in several DMVs
– sys.dm_db_missing_index_groups
– sys.dm_db_missing_index_group_stats
– sys.dm_db_missing_index_details
– sys.dm_db_missing_index_columns
• The information is opportunistic and not exhaustive
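
One common way to combine three of these DMVs is sketched below. The ordering expression is just a rough heuristic for ranking suggestions, not something the deck prescribes.

```sql
-- Join the missing-index DMVs and rank suggestions by a rough benefit estimate.
SELECT TOP (20)
    d.statement            AS table_name,
    d.equality_columns,
    d.inequality_columns,
    d.included_columns,
    s.user_seeks,
    s.user_scans,
    s.avg_total_user_cost,
    s.avg_user_impact
FROM sys.dm_db_missing_index_groups AS g
JOIN sys.dm_db_missing_index_group_stats AS s
    ON s.group_handle = g.index_group_handle
JOIN sys.dm_db_missing_index_details AS d
    ON d.index_handle = g.index_handle
ORDER BY s.avg_total_user_cost * s.avg_user_impact
         * (s.user_seeks + s.user_scans) DESC;
```

Because the information is opportunistic, the output is a list of candidates to evaluate rather than indexes to create blindly.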


Query Plan Debugging – best practices
• Collect “Actual Execution Plan”
• Compare “Estimated” and “Actual” row counts
– Large discrepancies indicate potential problem spots
• Watch for Execution Warnings – BOL describes remedies for each case
– Sort Warning
• sort operations do not fit into memory
– Exchange Spill
• buffers in a parallel query plan have been temporarily written to the tempdb database
– Hash Warning
• hash recursion or cessation of hashing (hash bailout) has occurred during a hashing operation
– Bitmap Warning
• Non-selective bitmap filter is used



Hints
• Types of Query hints
– Locking (PAGLOCK, ROWLOCK, NOLOCK, HOLDLOCK, …)
– Isolation Level (REPEATABLEREAD, SERIALIZABLE, …)
– Other (KEEPIDENTITY, KEEPDEFAULTS, NOWAIT, NOEXPAND, …)
– Optimizer
• Table access
• Join
• Query

Optimizer Hint Classification

SELECT ProductName, Products.ProductID
FROM
dbo.[Order Details]
INNER JOIN
dbo.Products
ON [Order Details].ProductID = Products.ProductID
WHERE Products.UnitPrice > 100

Where each kind of hint attaches in this query:
• TABLE access HINT: on a table reference in the FROM clause
• JOIN HINT: on the INNER JOIN
• QUERY HINT: at the end of the statement


Optimizer Hint Examples
• Table access hint: INDEX (n) | (name)
– Seek versus scan cannot be forced until SQL Server 2008
• Join order, method
– FORCE ORDER query hint
• But cannot specify order across sub-queries, or push aggregations below joins – always QUERY level
– LOOP | MERGE | HASH JOIN – both at the query level and on an individual join
• Aggregation
– HASH GROUP => Hash Aggregation
– ORDER GROUP => Stream Aggregation
• For more details see BOL
– TABLE HINT section for Table access hints
– Join Hints section for Join hints
– OPTION clause section for Query hints

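To make the three hint positions concrete, here is a variation of the Northwind-style query from the classification slide with one hint of each kind. It is a sketch under assumptions: the Quantity column and the existence of a clustered index (index id 1) on dbo.[Order Details] are not stated in the deck.

```sql
SELECT p.ProductName, SUM(od.Quantity) AS TotalQuantity
FROM dbo.[Order Details] AS od WITH (INDEX (1))  -- table access hint: use index id 1
INNER HASH JOIN dbo.Products AS p                -- join hint on this individual join
    ON od.ProductID = p.ProductID
WHERE p.UnitPrice > 100
GROUP BY p.ProductName
OPTION (HASH GROUP);                             -- query hint: hash aggregation
```

A join hint written inside the FROM clause also fixes the join order for the query, so it is a stronger intervention than it may first appear.
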
Hints - New in 2005 and 2008
• RECOMPILE hint for individual queries
• The following hints provide more control to “fix” the plan
– OPTIMIZE FOR hint to force particular values of parameters to be used by the optimizer
– USE PLAN to enforce a particular query plan
– PARAMETERIZATION hint
• Database level
• Inside a Plan Guide
– Plan Guides
• FORCESEEK – new table access hint in SQL Server 2008

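A minimal FORCESEEK sketch, assuming dbo.[Order Details] has an index whose leading key is ProductID (an assumption; if no index can support a seek for the predicate, the statement fails instead of falling back to a scan):

```sql
-- SQL Server 2008 and later: force an index seek for this table reference.
SELECT od.OrderID, od.Quantity
FROM dbo.[Order Details] AS od WITH (FORCESEEK)
WHERE od.ProductID = 42;
```
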
Recompile Hint
• In SS 2000: CREATE PROCEDURE … WITH RECOMPILE
• In SS 2005 and 2008, an individual statement may have OPTION (RECOMPILE)
– Better alternative to the dynamic SQL that was suggested in Shiloh (SQL Server 2000) for forcing recompilation for each new set of variables
– Easier syntax
– No need to grant access to all referenced objects; EXECUTE permission on the stored procedure suffices

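A sketch of the statement-level form, using a hypothetical procedure dbo.GetProductsAbovePrice over the dbo.Products table from the earlier example:

```sql
-- Only the hinted statement is recompiled on each execution;
-- the rest of the procedure keeps its cached plan.
CREATE PROCEDURE dbo.GetProductsAbovePrice
    @MinPrice money
AS
BEGIN
    SELECT ProductID, ProductName
    FROM dbo.Products
    WHERE UnitPrice > @MinPrice
    OPTION (RECOMPILE);
END;
GO

EXEC dbo.GetProductsAbovePrice @MinPrice = 100;
```

The caller needs only EXECUTE permission on the procedure, which is the access advantage over dynamic SQL noted above.
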
Optimize For Hint
OPTION ( OPTIMIZE FOR ( @variable_name = literal_constant [ ,…n ] ) )

• Values are used only for the optimization, not for execution
• Counteracts parameter detection (“sniffing”); may achieve more predictable behavior
• Cannot be used without data knowledge

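For illustration, the batch below executes with one value but is optimized as if another had been supplied. The variable name and both values are arbitrary; choosing a representative value is exactly the data knowledge the slide refers to.

```sql
DECLARE @MinPrice money;
SET @MinPrice = 5;   -- value actually used at execution time

SELECT ProductID, ProductName
FROM dbo.Products
WHERE UnitPrice > @MinPrice
OPTION (OPTIMIZE FOR (@MinPrice = 100));  -- value used only for plan selection
```
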
Use Plan Hint
• Plan Stability Feature
• Idea: Use a captured XML Showplan in a USE PLAN query hint to guide the optimizer to choose the same plan

Plan Forcing
Usage: Query

SELECT * FROM t
OPTION (
USE PLAN N'<ShowPlanXML> … </ShowPlanXML>'
)

• OPTION (USE PLAN …) is the query hint
• Its argument is a string literal of the XML Showplan

What’s Forced by Plan Hint
• Plan topology and order of evaluation
• Execution algorithms
• Index solutions
• Objects referenced in the query

Plan Forcing is used to force cost based decisions.

Plan Hint Limitations
• INSERT/UPDATE/DELETE: not supported in 2005, supported in 2008
• No Distributed Query
• No Full-text queries
• Static, Fast_Forward cursors only

Plan Guides - Idea
• Out-of-band query hinting
• Scenario: query is known but cannot be modified directly in the source code, yet some hinting is desirable
• Replace the Original Query by the hinted version using the “Plan Guide” system table

Original Query | Hinted Version
SELECT … | SELECT … OPTION (…)
UPDATE … | UPDATE … OPTION (…)

Plan Guides Feature Overview
• Create: sp_create_plan_guide
• Drop/Enable/Disable: sp_control_plan_guide
• Matching:
– During module (SP, DML trigger, multi-statement TVF, scalar UDF) compilation
– During batch compilation
– Targets
• original user query
• Simple- or Force-parameterized query

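A minimal sketch of creating and controlling a plan guide of type SQL. The guide name is hypothetical, and the statement text must match what the application submits character for character, which is why it is written on a single line here.

```sql
-- Attach a MERGE JOIN query hint to a statement that cannot be edited.
EXEC sp_create_plan_guide
    @name            = N'Guide_Products_MergeJoin',
    @stmt            = N'SELECT ProductName, Products.ProductID FROM dbo.[Order Details] INNER JOIN dbo.Products ON [Order Details].ProductID = Products.ProductID WHERE Products.UnitPrice > 100',
    @type            = N'SQL',
    @module_or_batch = NULL,   -- NULL: the batch consists of @stmt alone
    @params          = NULL,
    @hints           = N'OPTION (MERGE JOIN)';

-- Disable, re-enable and finally drop the guide.
EXEC sp_control_plan_guide N'DISABLE', N'Guide_Products_MergeJoin';
EXEC sp_control_plan_guide N'ENABLE',  N'Guide_Products_MergeJoin';
EXEC sp_control_plan_guide N'DROP',    N'Guide_Products_MergeJoin';
```
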
Scenarios (all pertain to an app you can’t change)
• Force use of a MERGE JOIN operation for a specific query in a user-submitted batch
• Force optimizer to always use “worst case” parameter value
• Force compilation sharing for queries with a format that is the same except that constant literals are different
– Specify FORCED PARAMETERIZATION in plan guide
• Force use of a specific plan for a manually parameterized query

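The compilation-sharing scenario can be sketched with a TEMPLATE plan guide. The query text reuses the select * from T example that appears later in the deck; the guide name is hypothetical.

```sql
DECLARE @stmt   nvarchar(max);
DECLARE @params nvarchar(max);

-- Derive the parameterized form of the query and its parameter list.
EXEC sp_get_query_template
    N'SELECT * FROM dbo.T WHERE c = 1 OR c = 2',
    @stmt   OUTPUT,
    @params OUTPUT;

-- Force parameterization for every query that matches this template,
-- without changing the database-wide PARAMETERIZATION setting.
EXEC sp_create_plan_guide
    @name            = N'Guide_T_ForcedParam',
    @stmt            = @stmt,
    @type            = N'TEMPLATE',
    @module_or_batch = NULL,
    @params          = @params,
    @hints           = N'OPTION (PARAMETERIZATION FORCED)';
```
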
New around plan guides in 2008
• Plan Freezing - ability to create a plan guide directly from a plan in the plan cache
– The plan guide is automatically created by invoking sp_create_plan_guide_from_handle
• See documentation for the sp for boilerplate code
• sys.fn_validate_plan_guide function validates if the plan guide is still valid (after e.g. some metadata changes)
• SQL Profiler trace events "Plan_guide_successful" and "Plan_guide_unsuccessful" are located under the "Performance" category
• Perfmon counters "Guided Plan Executions/sec" and "Misguided Plan Executions/sec" provide a similar function to the trace events above

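Plan freezing and validation might be sketched as follows. The procedure is documented in SQL Server 2008 under the name sp_create_plan_guide_from_handle; the LIKE filter and the guide name are placeholders.

```sql
DECLARE @plan_handle varbinary(64);
DECLARE @offset int;

-- Locate the cached plan of the statement to freeze (filter is a placeholder).
SELECT TOP (1)
    @plan_handle = qs.plan_handle,
    @offset      = qs.statement_start_offset
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE N'SELECT ProductName%';

-- Create a plan guide that pins exactly that cached plan.
EXEC sp_create_plan_guide_from_handle
    @name                   = N'Guide_Frozen_Products',
    @plan_handle            = @plan_handle,
    @statement_start_offset = @offset;

-- List plan guides that are no longer valid, e.g. after a schema change.
-- The function returns no row for a guide that is still valid.
SELECT pg.plan_guide_id, pg.name, v.msgnum, v.message
FROM sys.plan_guides AS pg
CROSS APPLY sys.fn_validate_plan_guide(pg.plan_guide_id) AS v;
```
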


Parameterization
• In SQL Server 2000 “auto parameterization” is used only for very simple queries
– Referred to as SIMPLE parameterization in SS 2005
• Discoverable by investigating the plan; a “parameter” is suddenly there. For example

Query: select * from T where c=5

Plan: |--Table Scan(OBJECT:([tst1].[dbo].[T]), WHERE:([T].[C]=Convert([@1])))

• Advantage: no need to recompile for the same query with a different value instead of “5”

Parameterization (contd.)
• In SQL Server 2005 “parameterization” may be forced on a database level for ALL queries
– Set PARAMETERIZATION option to FORCED in ALTER DATABASE stmt
• Use only when you are sure compilation is the bottleneck
• Some queries may get worse plans if forcefully parameterized
• Works only against databases in 90 and higher compatibility level

Parameterization (Example)
Query: select * from T where c=1 or c=2

Plan: |--Table Scan(OBJECT:([tt].[dbo].[T]), WHERE:([tt].[dbo].[T].[c]=(1) OR [tt].[dbo].[T].[c]=(2)))

ALTER DATABASE <dbname> SET PARAMETERIZATION FORCED

Plan: |--Table Scan(OBJECT:([tt].[dbo].[T]), WHERE:([tt].[dbo].[T].[c]=[@0] OR [tt].[dbo].[T].[c]=[@1]))

ALTER DATABASE <dbname> SET PARAMETERIZATION SIMPLE

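Besides reading the plan, one way to check whether a query was parameterized is to look for its entry in the plan cache, where simple- and forced-parameterized statements are cached with object type Prepared. The sketch below lists the most reused entries; it is not specific to table T.

```sql
-- Parameterized statements appear with objtype 'Prepared'; their text starts
-- with the generated parameter list, e.g. (@0 int,@1 int)select ...
SELECT TOP (20) cp.objtype, cp.usecounts, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype = N'Prepared'
ORDER BY cp.usecounts DESC;
```

A usecounts value that grows as the query is resubmitted with different constants indicates that the compiled plan is being shared.
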


Optimizer References
• Tips, Tricks, and Advice from the SQL Server Query Processing Team
http://blogs.msdn.com/sqlqueryprocessing/
• Query Tuning Recommendations
http://technet.microsoft.com/en-us/library/ms188722.aspx
• Data Warehouse Query Performance
http://technet.microsoft.com/en-us/magazine/cc434693.aspx


Some showplan references
• Find Top N costly query plans in adhoc batches or modules...
http://blogs.msdn.com/sqltips/archive/2005/10/05/Top_N_costly_query_plans.aspx
• Troubleshooting Performance Problems in SQL Server 2005
http://www.microsoft.com/technet/prodtechnol/sql/2005/tsprfprb.mspx
• Processing XML Showplans Using SQLCLR in SQL Server 2005
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/xmlshowplans.asp

© 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The
information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market
conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
