Parallel Query Execution in SQL Server
Craig Freedman
Software Design Engineer
SQL Server Query Team
Outline
• Why parallel query execution?
• Basics of parallel query execution
• Degree of parallelism (DOP)
• The parallelism operator
• Parallel query plan examples
• What to look out for …
• Q&A
Why Parallel Query Execution?
• Improves response time for big queries
• Completely transparent to the end user
• Appropriate for data warehouse workloads
(small numbers of big queries)
• Parallelism is not free
– Adds overhead to query execution
– May reduce overall server throughput
• Inappropriate for OLTP workloads
(large numbers of small queries)
Scalability Terminology
            Data Size    # CPUs       Response Time
Speed Up    Same Data    More CPUs    Reduced RT
Scale Up    More Data    More CPUs    Same RT
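One common way to put numbers on these two terms is as ratios of response times. The sketch below assumes that convention; the function names and the measurements are hypothetical and are not from the presentation.

```python
def speed_up(rt_one_cpu: float, rt_n_cpus: float) -> float:
    """Same data, more CPUs: how many times faster the query finishes."""
    return rt_one_cpu / rt_n_cpus


def scale_up(rt_small: float, rt_large: float) -> float:
    """More data, more CPUs: 1.0 means response time stayed the same."""
    return rt_small / rt_large


# Hypothetical response times, in seconds.
print(speed_up(80.0, 20.0))  # same data, 1 CPU vs. 4 CPUs
print(scale_up(80.0, 80.0))  # 1x data on 1 CPU vs. 4x data on 4 CPUs
```

With these made-up numbers, the first call reports a fourfold speed up and the second reports ideal scale up.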
Advantages:
• Excellent scalability: threads operate independently
• Easy to adjust the degree of parallelism (DOP)
Parallelism Example
[Figure: rows are split between two threads; Thread 1 aggregates the even values and Thread 2 aggregates the odd values]
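The evens/odds split in this example can be sketched in a few lines. This is only an illustration of the idea, not SQL Server's implementation: the helper names, the use of Python threads, and the input values are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, dop):
    """Route each row by value modulo dop (evens and odds when dop == 2)."""
    parts = [[] for _ in range(dop)]
    for value in rows:
        parts[value % dop].append(value)
    return parts


def count_by_value(rows):
    """Serial aggregate, equivalent to COUNT(*) ... GROUP BY value."""
    counts = {}
    for value in rows:
        counts[value] = counts.get(value, 0) + 1
    return counts


def parallel_count(rows, dop=2):
    parts = partition(rows, dop)
    with ThreadPoolExecutor(max_workers=dop) as pool:
        partials = list(pool.map(count_by_value, parts))
    # Each value lives in exactly one partition, so the partial
    # results never overlap and can simply be combined.
    result = {}
    for partial in partials:
        result.update(partial)
    return result


rows = [2, 9, 8, 5, 6, 1, 2, 3, 5, 4]  # illustrative input
print(parallel_count(rows))
```

Because every occurrence of a value goes to the same thread, each thread runs an ordinary serial aggregate and never needs to coordinate with the other.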
NOT Pipeline Parallelism
[Figure: Scan and Hash Aggregate drawn as pipeline stages on Thread 1 and Thread 2, processing rows 1 through 8]
Parallel Query Plan Examples
Parallel Scan
• Most operators are unaware of parallelism;
parallel scan is an exception
• Threads work together to scan entire table
• Server dynamically assigns pages or rows to threads
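Dynamic assignment of pages to threads can be pictured as a shared dispenser that every scan thread asks for its next page. The sketch below assumes that model; `PageSupplier` and `scan_worker` are hypothetical names, and the "table" is just a list of in-memory pages.

```python
import threading


class PageSupplier:
    """Hands the next unread page number to whichever thread asks."""

    def __init__(self, page_count):
        self._next = 0
        self._page_count = page_count
        self._lock = threading.Lock()

    def next_page(self):
        with self._lock:
            if self._next >= self._page_count:
                return None  # table exhausted
            page = self._next
            self._next += 1
            return page


def scan_worker(supplier, pages, out):
    # Keep asking for pages until none are left; a fast thread
    # simply ends up reading more pages than a slow one.
    while (page_no := supplier.next_page()) is not None:
        out.extend(pages[page_no])


pages = [[p * 10 + r for r in range(10)] for p in range(8)]  # 8 pages x 10 rows
supplier = PageSupplier(len(pages))
outputs = [[] for _ in range(4)]  # one output per thread (DOP 4)
threads = [
    threading.Thread(target=scan_worker, args=(supplier, pages, out))
    for out in outputs
]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_rows = sorted(row for out in outputs for row in out)
print(len(all_rows))
```

Handing out work on demand, rather than fixing each thread's share up front, means the scan as a whole reads every page exactly once regardless of how unevenly the threads progress.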
Parallel Join
Parallel Scan!
Inner Side Parallelism
• Normally, the inner side of a nested loops join runs serially
• Exception:
– Single outer row and no correlated parameters
– May in some rare cases yield parallelism on the inner side
of a nested loops join
IF EXISTS (SELECT * FROM T1 WHERE T1.A = 0)
PRINT 'Hello World!'
[Plan diagram callouts: One Row!; Serial Join]
Create Bitmap
[Plan diagram callouts: No Exchange!; Parallel Scan]
Merging Exchange
• Needed for parallel query with ORDER BY
• Also used by merge join and stream aggregate
• Risk of intra-query parallel deadlocks!
SELECT * FROM T1 ORDER BY T1.A
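An order-preserving merge of per-thread sorted streams can be sketched with Python's standard `heapq.merge`. This only models the idea of a merging exchange; the function name and the sample streams are assumptions for illustration.

```python
import heapq


def merging_exchange(sorted_inputs, key=lambda row: row):
    """Merge already-sorted per-thread streams into one sorted stream."""
    return heapq.merge(*sorted_inputs, key=key)


# Each list stands in for the sorted output of one producer thread.
thread_outputs = [
    [1, 4, 9],
    [2, 3, 10],
    [5, 6, 7, 8],
]
merged = list(merging_exchange(thread_outputs))
print(merged)
```

To emit the next row, a merge like this must look at the next row of every input that is not yet exhausted. That need to wait on each input is what distinguishes a merging exchange from one that passes rows along in arrival order.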
Parallel Aggregation
• Hash rows that belong to the same group
to the same aggregate instance
Hash Partition
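Routing rows by a hash of the grouping key might look like the sketch below. It uses `zlib.crc32` as a stand-in hash function and made-up rows; neither reflects what SQL Server actually uses.

```python
import zlib
from collections import defaultdict


def hash_partition(rows, key_index, dop):
    """Send each row to one of dop consumers based on its grouping key."""
    consumers = [[] for _ in range(dop)]
    for row in rows:
        key = str(row[key_index]).encode("utf-8")
        consumers[zlib.crc32(key) % dop].append(row)
    return consumers


def sum_by_key(rows):
    """Per-consumer aggregate, equivalent to SUM(amount) ... GROUP BY key."""
    totals = defaultdict(int)
    for key, amount in rows:
        totals[key] += amount
    return dict(totals)


rows = [("WA", 10), ("OR", 5), ("WA", 7), ("CA", 3), ("OR", 1), ("CA", 4)]
consumers = hash_partition(rows, key_index=0, dop=4)
results = [sum_by_key(part) for part in consumers]

# Every group lands on exactly one consumer, so the per-consumer
# results are already final and only need to be gathered.
final = {key: total for partial in results for key, total in partial.items()}
print(final)
```

Note that with only three distinct keys and four consumers, at least one consumer receives no rows at all, which hints at the skew problem that arises when the number of groups is small.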
Local/Global (Partial) Aggregation
• If number of groups is small, data may not hash partition well
and may lead to skew problems
• Instead split into two aggregates:
– Local aggregate computes partial results for all groups
– Global aggregate computes final results
• Also eliminates exchange below the local aggregate
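The two-step split can be sketched as follows: each thread counts whatever rows it happens to read, and a final step adds the partial counts together. The function names and sample rows are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def local_aggregate(rows):
    """Partial COUNT(*) per group over the rows one thread scanned."""
    return Counter(rows)


def global_aggregate(partials):
    """Add the partial counts together to get the final counts."""
    final = Counter()
    for partial in partials:
        final.update(partial)  # Counter.update adds counts
    return dict(final)


# Rows as each thread happened to read them; no repartitioning first.
thread_rows = [
    ["Y", "N", "Y", "Y"],
    ["N", "N", "Y"],
    ["Y", "Y", "N", "N", "N"],
]
with ThreadPoolExecutor(max_workers=len(thread_rows)) as pool:
    partials = list(pool.map(local_aggregate, thread_rows))

final_counts = global_aggregate(partials)
print(final_counts)
```

With only two groups, hash partitioning could keep at most two threads busy; here all three threads do useful work, and only one small partial result per group per thread has to be combined at the end.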
[Diagram callouts: "Merging blocks without at least one row from each input"; "No rows flow through here"]