SQLDEV320A WEEK10-1
•Parse SELECT statement into logical units: keywords, expressions, operators, and
identifiers.
•Build a query tree describing the logical steps needed to transform the source data into
the format required by the result set.
•The query optimizer analyzes different ways the source tables can be accessed. It then
selects the series of steps that returns the results fastest while using fewer resources. The
query tree is updated to record this exact series of steps. The final, optimized version of the
query tree is called the execution plan.
•The relational engine starts executing the execution plan. As the steps that require data from
the base tables are processed, the relational engine requests that the storage engine pass
up data from the rowsets requested from the relational engine.
•The relational engine processes the data returned from the storage engine into the format
defined for the result set and returns the result set to the client.
Query Compilation
Query compilation is the process of choosing a good enough execution plan in the
short amount of time the optimizer has to act.
• Parse a query into a tree representation
• Normalize and validate the query
• Evaluate possible query plans
• Pick a good enough plan, based on cost
Query Execution
Query execution is the process of executing the plan that is created during query
compilation and query optimization.
• Not necessarily performed directly after query compilation
• May trigger a query recompilation
Compilation versus recompilation
• Query recompiles may occur because of correctness-related reasons or plan
optimality-related reasons
Query Plans and Execution Contexts
Query Plan (shared): Parameter A = ?, Parameter B = ?, User = ?
• Execution Context 1: Parameter A = 12, Parameter B = 'xy', User = Jorge
• Execution Context 2: Parameter A = 100, Parameter B = 'ftr', User = Nabil
• Execution Context 3: Parameter A = 11, Parameter B = 'sd', User = Walter
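The split between one shared query plan and several per-user execution contexts can be sketched in Python. This is only an illustration: the class names QueryPlan and ExecutionContext are hypothetical and do not correspond to any SQL Server API. The parameter values are the ones shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryPlan:
    """Shared, read-only part: the plan with parameter placeholders (hypothetical class)."""
    text: str


@dataclass
class ExecutionContext:
    """Per-execution part: the values one user supplies for one run (hypothetical class)."""
    plan: QueryPlan
    parameter_a: int
    parameter_b: str
    user: str


# A single plan object is reused by all three executions.
plan = QueryPlan(text="Parameter A = ?, Parameter B = ?, User = ?")
contexts = [
    ExecutionContext(plan, 12, "xy", "Jorge"),
    ExecutionContext(plan, 100, "ftr", "Nabil"),
    ExecutionContext(plan, 11, "sd", "Walter"),
]

for number, ctx in enumerate(contexts, start=1):
    # The last column shows that each context points at the same plan object.
    print(number, ctx.user, ctx.parameter_a, ctx.parameter_b, ctx.plan is plan)
```

The point of the sketch is that the plan carries no user-specific values, so it can be cached once, while each context holds only the data for its own execution.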
Compilation and Execution Overview
[Flowchart. The steps and decision points recovered from the slide are listed below;
the arrows between them were lost in extraction.]
• Normalize tree
• Plan in cache? (YES / NO)
• Any outdated statistics? (YES: update statistics one by one, optionally asynchronously)
• Optimize SQL statements
• Generate the execution plan. Save the recompilation thresholds of all referenced
tables in the query with the execution plan.
• Compare recompilation thresholds with table cardinalities or modification counters
• Based on comparison, any outdated statistics? (YES / NO)
• Wait for memory grant scheduler to OK the request
• Open (activate) plan
• Run plan to completion
Sources of inefficiency:
• Bad cardinality estimation?
• Look at plan
• Parameter-sensitive plans?
• Dynamic un-parameterized SQL
Server?
• Bad physical database design?
• Missing indexes?
Second Stage – Optimization Overview
Recompilation Threshold (RT)
The RT is a mechanism used by SQL Server to determine if a table has changed enough
to force a recompile of a query plan, so that a more efficient plan can be chosen if
one is available for the current data distribution.
The threshold crossing test is performed to decide whether to recompile a query plan:
| colmodctr(current) - colmodctr(snapshot) | >= RT
If there are no statistics, or nothing is interesting, then table cardinality is used:
| cardinality(current) - cardinality(snapshot) | >= RT
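The threshold crossing test can be sketched in a few lines of Python. The function names are hypothetical. RT is passed in as an input because the slide does not state how it is calculated, and the value 500 in the usage lines is only an example.

```python
def crosses_threshold(current, snapshot, rt):
    """Return True when |current - snapshot| >= rt."""
    return abs(current - snapshot) >= rt


def should_recompile(rt, colmodctr=None, cardinality=None):
    """Apply the test to colmodctr when available, otherwise to table cardinality.

    colmodctr and cardinality are each a (current, snapshot) pair.
    """
    if colmodctr is not None:
        return crosses_threshold(colmodctr[0], colmodctr[1], rt)
    if cardinality is not None:
        return crosses_threshold(cardinality[0], cardinality[1], rt)
    raise ValueError("need a colmodctr pair or a cardinality pair")


# Example values are made up; rt=500 is an assumed threshold.
print(should_recompile(500, colmodctr=(1700, 1000)))    # 700 modifications since the snapshot
print(should_recompile(500, colmodctr=(1200, 1000)))    # only 200 modifications
print(should_recompile(500, cardinality=(400, 1000)))   # table shrank by 600 rows
```

The absolute value matters: a table that shrinks by RT rows crosses the threshold just as one that grows by RT rows does.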
Recompilation Threshold Calculation
Query Lifecycle:
Worktables are internal tables that are used to hold intermediate results.
• For example, if an ORDER BY clause references columns that are not
covered by any indexes, the relational engine may need to generate
a worktable to sort the result set into the order requested.
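Statistics & Activity monitoring:
• The query optimizer uses statistics to create query plans that improve query
performance. For most queries, the query optimizer already generates the necessary
statistics for a high quality query plan; in a few cases, you need to create
additional statistics or modify the query design for best results.
• Statistics for query optimization are objects that contain statistical information
about the distribution of values in one or more columns of a table or indexed view.
• The query optimizer uses these statistics to estimate the cardinality, or number of
rows, in the query result.
Statistics monitoring
• AUTO_CREATE_STATISTICS: When ON, the query optimizer creates statistics on
individual columns in the query predicate, as necessary, to improve cardinality
estimates for the query plan.
• AUTO_UPDATE_STATISTICS: When ON, the query optimizer determines when statistics
might be out-of-date and then updates them when they are used by a query.
• INCREMENTAL: When ON, the statistics created are per partition statistics.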
Setting Statistics Monitoring
ALTER DATABASE { database_name | CURRENT }
SET
{
AUTO_CREATE_STATISTICS { OFF | ON [ ( INCREMENTAL =
{ ON | OFF } ) ] }
| AUTO_UPDATE_STATISTICS { ON | OFF }
| AUTO_UPDATE_STATISTICS_ASYNC { ON | OFF }
}
Contents of a Query Plan
Operator Characteristics
• Requires memory grant?
• Order preserving?
Graphical Showplan Flow
[Figure: graphical showplan with numbered resultsets; the outer table and inner table
of a join are labeled.]
• Resultsets 1 and 2 are joined using a nested loops join, creating resultset 3
• Resultsets 3 and 4 are joined using a hash match join, creating resultset 5
• Resultsets 5 and 6 are joined using a nested loops join, creating a
resultset for the SELECT clause
Traversing a Query Plan
• Query plans are trees. Therefore, any join branch can be as substantial as an
entire, separate query
• Examine major sub-branches first by looking top-down at the outermost joins
Nested Loops Join
Pseudo-code (the final NULL branch applies only to a left outer join):
for each row R1 in the outer table
    begin
        for each row R2 in the inner table
            if R1 joins with R2
                return (R1, R2)
        if R1 did not join
            return (R1, NULL)
    end
Nested Loops Join
SELECT *
FROM Production.WorkOrder
INNER JOIN Production.WorkOrderRouting
ON Production.WorkOrder.WorkOrderID =
Production.WorkOrderRouting.WorkOrderID
WHERE Production.WorkOrderRouting.ModifiedDate =
CAST('2005-08-01' AS DATETIME);
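A nested loops join can be sketched as a small Python generator. This is a teaching sketch, not how SQL Server implements the operator. The function name, the left_outer flag and the sample rows are hypothetical, with the rows loosely modeled on the WorkOrder tables in the query above.

```python
def nested_loops_join(outer, inner, joins, left_outer=False):
    """Compare every outer row with every inner row and yield the matching pairs.

    `inner` must be re-iterable (for example a list), because it is rescanned
    once per outer row.
    """
    for r1 in outer:
        matched = False
        for r2 in inner:
            if joins(r1, r2):
                matched = True
                yield (r1, r2)
        # Only a left outer join keeps outer rows that found no match.
        if left_outer and not matched:
            yield (r1, None)


def same_work_order(r1, r2):
    """Join predicate: equality on WorkOrderID."""
    return r1["WorkOrderID"] == r2["WorkOrderID"]


# Hypothetical sample rows.
work_orders = [{"WorkOrderID": 1}, {"WorkOrderID": 2}, {"WorkOrderID": 3}]
routings = [
    {"WorkOrderID": 1, "OperationSequence": 10},
    {"WorkOrderID": 1, "OperationSequence": 20},
    {"WorkOrderID": 3, "OperationSequence": 10},
]

inner_rows = list(nested_loops_join(work_orders, routings, same_work_order))
outer_rows = list(
    nested_loops_join(work_orders, routings, same_work_order, left_outer=True)
)
print(len(inner_rows), len(outer_rows))
```

The cost is proportional to (outer rows) x (inner rows) in this naive form, which is why the operator works best with a small outer input and an indexed inner input.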
Merge Join
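Pseudo-code (both inputs sorted on the join key; one-to-many case):
get first row R1 from input 1
get first row R2 from input 2
while not at the end of either input
    begin
        if R1 joins with R2
            begin
                return (R1, R2)
                get next row R2 from input 2
            end
        else if R1 < R2
            get next row R1 from input 1
        else
            get next row R2 from input 2
    end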
Merge Join
SELECT *
FROM Production.WorkOrder
INNER JOIN Production.WorkOrderRouting
ON Production.WorkOrder.WorkOrderID =
Production.WorkOrderRouting.WorkOrderID
WHERE Production.WorkOrderRouting.ModifiedDate >
CAST('2005-08-01' AS DATETIME);
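A merge join reads two inputs that are already sorted on the join key and advances through both in step. The Python sketch below is illustrative only. The function name and sample rows are hypothetical, and it handles duplicate keys on both sides by pairing runs of equal keys, which is one of several possible designs.

```python
def merge_join(left, right, key):
    """Inner-join two lists that are already sorted by key(row)."""
    i = j = 0
    while i < len(left) and j < len(right):
        k1, k2 = key(left[i]), key(right[j])
        if k1 < k2:
            i += 1          # left row is behind: advance left
        elif k1 > k2:
            j += 1          # right row is behind: advance right
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and key(right[j_end]) == k1:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and key(left[i]) == k1:
                for r2 in right[j:j_end]:
                    yield (left[i], r2)
                i += 1
            j = j_end


# Hypothetical rows as (WorkOrderID, label), both lists sorted by WorkOrderID.
work_orders = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
routings = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]

merged = list(merge_join(work_orders, routings, key=lambda row: row[0]))
print(merged)
```

Each input is read once from start to end, so the join itself is cheap; the expense is in providing sorted inputs, either from an index or from a Sort operator.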
Hash Join
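Pseudo-code:
for each row R1 in the build table
    begin
        calculate hash value on R1 join key(s)
        insert R1 into the appropriate hash bucket
    end
for each row R2 in the probe table
    begin
        calculate hash value on R2 join key(s)
        for each row R1 in the corresponding hash bucket
            if R1 joins with R2
                return (R1, R2)
    end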
Hash Join
SELECT FirstName, LastName, EmailAddress
FROM Person.Person
INNER JOIN Person.EmailAddress ON
Person.EmailAddress.BusinessEntityID =
Person.Person.BusinessEntityID
WHERE Person.EmailAddress.ModifiedDate > CAST('2005-08-01'
AS DATETIME);
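The build and probe phases of an in-memory hash join can be sketched in Python with a dictionary as the hash table. The function name and the sample rows (including the names and e-mail addresses) are hypothetical; only the column names come from the query above.

```python
from collections import defaultdict


def hash_join(build, probe, build_key, probe_key):
    """Inner hash join: build a hash table on one input, then probe it with the other."""
    # Build phase: place every build row in the bucket for its join key.
    buckets = defaultdict(list)
    for r1 in build:
        buckets[build_key(r1)].append(r1)
    # Probe phase: look up each probe row's key and emit the matches.
    for r2 in probe:
        for r1 in buckets.get(probe_key(r2), ()):
            yield (r1, r2)


# Hypothetical sample rows.
persons = [
    {"BusinessEntityID": 1, "FirstName": "Ana", "LastName": "Lopez"},
    {"BusinessEntityID": 2, "FirstName": "Ben", "LastName": "Ito"},
]
emails = [
    {"BusinessEntityID": 1, "EmailAddress": "ana@example.com"},
    {"BusinessEntityID": 1, "EmailAddress": "ana.lopez@example.com"},
    {"BusinessEntityID": 3, "EmailAddress": "nobody@example.com"},
]


def by_id(row):
    """Join key shared by both inputs."""
    return row["BusinessEntityID"]


joined = list(hash_join(persons, emails, by_id, by_id))
for person, email in joined:
    print(person["FirstName"], person["LastName"], email["EmailAddress"])
```

The smaller input is normally chosen as the build side so that the hash table is more likely to fit in memory.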
Hash Join
There are three types of hash joins:
• In-memory
  • Build phase: the hash table fits completely in memory
• Grace
  • Build phase: the hash table does not fit completely in memory and spills to
    disk (worktable in tempdb)
• Recursive
  • Build phase: the hash table is very large and requires many levels of merge
    joins and hash partitioning
The term hash bailout is sometimes used to describe grace hash joins or
recursive hash joins.
Seek Operators
•Index Seek
Index Spool
Sort
Stream Aggregation
Other Considerations
• Eliminate Spools
• Rewinds and Rebinds
Join Optimization
1. Limit the number of joins
2. Limit the number of rows to be joined
3. Create indexes on join columns
4. Join on mostly unique columns
5. Join on columns with the same data type
6. Avoid using SELECT *
Join Optimization (continued)
• Avoid negative logic, such as !=, <>, NOT (…)
  • This introduces additional contention, because it often results in
    evaluation of each row (index scans)
• A LIKE pattern with a leading wildcard almost always causes a table scan
  • If you must use LIKE, make the first character a literal
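Why a leading wildcard forces a scan can be illustrated with a sorted Python list standing in for an index. This is an analogy only, not SQL Server's implementation, and the function names and sample values are hypothetical. A pattern such as 'Smi%' has a literal prefix that a binary search can locate, while '%son' gives the search nothing to start from.

```python
import bisect


def prefix_seek(sorted_values, prefix):
    """Seek analogue: binary-search to the first candidate, then read the matching run."""
    start = bisect.bisect_left(sorted_values, prefix)
    matches = []
    for value in sorted_values[start:]:
        if not value.startswith(prefix):
            break           # sorted order guarantees no later value can match
        matches.append(value)
    return matches


def suffix_scan(values, suffix):
    """Scan analogue: sort order does not help, so every value is inspected."""
    return [value for value in values if value.endswith(suffix)]


# Hypothetical last names, sorted as an index on the column would be.
names = sorted(["Adams", "Alvarez", "Smith", "Smithson", "Smythe", "Zhang"])

print(prefix_seek(names, "Smi"))   # corresponds to LIKE 'Smi%'
print(suffix_scan(names, "son"))   # corresponds to LIKE '%son'
```

The prefix search touches only the rows near the match, while the suffix search reads the whole list regardless of how many rows qualify.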
Stored Procedure Optimization
• Use SET NOCOUNT ON
  • Prevents sending the DONE_IN_PROC message for each statement in a stored
    procedure
  • The @@ROWCOUNT function is still updated
• Always validate parameters early in the code
• Return only the columns required in a SELECT (avoid the construction SELECT *)
• Beware of widely varying parameter inputs, which can lead to parameter
sniffing issues
User Defined Function (UDF) Optimization
• These seemingly harmless constructs can be truly detrimental to performance,
because they are called for every row of the result set
• Options to replace UDFs include:
  • Considering inline expressions for simple functions
  • Considering derived tables if possible