
Lecture 18

This document provides an overview of the steps involved in processing a database query. It describes the logical and physical query plans, including parsing, optimization, and execution. It also discusses topics like pipelined execution and intermediate result materialization.

Uploaded by

Hai Kim Sreng

Introduction to Database Systems

CSE 444
Lecture 18: Query Processing Overview

CSE 444 - Spring 2009


Where We Are
• We are learning how a DBMS executes a query
– How can a DBMS execute a query so fast?

• Lectures 15-16: Data storage, indexing, physical tuning
• Lecture 17: Relational algebra (we will finish it today)
• Lecture 18: Overview of query processing steps
– Includes a description of how queries are executed
• Lecture 19: Operator algorithms
• Lecture 20: Overview of query optimization



Outline for Today
• Steps involved in processing a query
– Logical query plan
– Physical query plan
– Query execution overview

• Readings: Section 15.1 of the book
– Query processing steps
– Query execution using the iterator model
– An introduction to the next lecture on operator algorithms

Query Evaluation Steps
SQL query
  ↓
Parse & Rewrite Query
  ↓
Select Logical Plan (query optimization)
  ↓ logical plan
Select Physical Plan (query optimization)
  ↓ physical plan
Query Execution
  ↓
Disk

Example Database Schema
Supplier(sno,sname,scity,sstate)
Part(pno,pname,psize,pcolor)
Supplies(sno,pno,price)

View: Suppliers in Seattle


CREATE VIEW NearbySupp AS
SELECT sno, sname
FROM Supplier
WHERE scity='Seattle' AND sstate='WA'



Example Query
Find the names of all suppliers in Seattle
who supply part number 2

SELECT sname
FROM NearbySupp
WHERE sno IN ( SELECT sno
FROM Supplies
WHERE pno = 2 )



Steps in Query Evaluation
• Step 0: Admission control
– User connects to the db with username, password
– User sends query in text format
• Step 1: Query parsing
– Parses query into an internal format
– Performs various checks using catalog
• Correctness, authorization, integrity constraints
• Step 2: Query rewrite
– View rewriting, flattening, etc.



Rewritten Version of Our Query
Original query:
SELECT sname
FROM NearbySupp
WHERE sno IN ( SELECT sno
FROM Supplies
WHERE pno = 2 )

Rewritten query:
SELECT S.sname
FROM Supplier S, Supplies U
WHERE S.scity='Seattle' AND S.sstate='WA'
AND S.sno = U.sno
AND U.pno = 2;
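
The effect of view rewriting and flattening can be checked on a small database. The sketch below uses Python's built-in sqlite3 module as a stand-in DBMS (an assumption; the lecture does not name a system), with sample rows invented for illustration. Table names follow the queries above. Note that the flattened join returns the same rows as the IN form only when each (sno, pno) pair occurs at most once in Supplies; otherwise the join can produce duplicates.

```python
import sqlite3

# In-memory SQLite database, used only as a convenient stand-in DBMS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Supplier(sno INTEGER, sname TEXT, scity TEXT, sstate TEXT);
CREATE TABLE Supplies(sno INTEGER, pno INTEGER, price REAL);
CREATE VIEW NearbySupp AS
  SELECT sno, sname FROM Supplier
  WHERE scity='Seattle' AND sstate='WA';
-- Hypothetical sample rows, invented for this example.
INSERT INTO Supplier VALUES
  (1, 'Acme', 'Seattle', 'WA'),
  (2, 'Bolt', 'Seattle', 'WA'),
  (3, 'Cog', 'Portland', 'OR');
INSERT INTO Supplies VALUES (1, 2, 10.0), (2, 5, 7.5), (3, 2, 9.0);
""")

# Original query: goes through the view and uses a nested subquery.
original = conn.execute("""
SELECT sname FROM NearbySupp
WHERE sno IN (SELECT sno FROM Supplies WHERE pno = 2)
""").fetchall()

# Rewritten query: view expanded, subquery flattened into a join.
rewritten = conn.execute("""
SELECT S.sname
FROM Supplier S, Supplies U
WHERE S.scity='Seattle' AND S.sstate='WA'
  AND S.sno = U.sno AND U.pno = 2
""").fetchall()

print(original, rewritten)
```

Both queries select only the Seattle supplier that supplies part 2 in this sample data.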
Continue with Query Evaluation
• Step 3: Query optimization
– Find an efficient query plan for executing the query
– We will spend a whole lecture on this topic
• A query plan is
– Logical query plan: an extended relational algebra tree
– Physical query plan: with additional annotations at each node
• Access method to use for each relation
• Implementation to use for each relational operator



Extended Algebra Operators
• Union ∪, intersection ∩, difference -
• Selection σ
• Projection π
• Join ⋈
• Duplicate elimination δ
• Grouping and aggregation γ
• Sorting τ
• Rename ρ
Logical Query Plan
π sname
  σ scity='Seattle' ∧ sstate='WA' ∧ pno=2
    ⋈ sno = sno
      Supplier
      Supplies
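
A logical plan is just a relational algebra expression, so it can be evaluated bottom-up with three small functions. The following is a minimal sketch, assuming tables are held as Python lists of dictionaries; the helper names (join, select, project) and the sample rows are hypothetical and chosen only for illustration.

```python
# Hypothetical sample data, invented for this example.
supplier = [
    {"sno": 1, "sname": "Acme", "scity": "Seattle", "sstate": "WA"},
    {"sno": 2, "sname": "Bolt", "scity": "Seattle", "sstate": "WA"},
    {"sno": 3, "sname": "Cog", "scity": "Portland", "sstate": "OR"},
]
supplies = [
    {"sno": 1, "pno": 2, "price": 10.0},
    {"sno": 2, "pno": 5, "price": 7.5},
    {"sno": 3, "pno": 2, "price": 9.0},
]


def join(left, right, cond):
    """Join: concatenate every pair of rows that satisfies cond."""
    return [{**l, **r} for l in left for r in right if cond(l, r)]


def select(rows, pred):
    """Selection (sigma): keep the rows that satisfy pred."""
    return [row for row in rows if pred(row)]


def project(rows, fields):
    """Projection (pi): keep only the listed fields (no duplicate elimination)."""
    return [{f: row[f] for f in fields} for row in rows]


# The plan is evaluated from the leaves up: join, then select, then project.
plan_result = project(
    select(
        join(supplier, supplies, lambda s, u: s["sno"] == u["sno"]),
        lambda t: t["scity"] == "Seattle" and t["sstate"] == "WA" and t["pno"] == 2,
    ),
    ["sname"],
)
print(plan_result)
```

This evaluates each operator on complete inputs, which is the simplest reading of the tree; it says nothing yet about how operators are implemented or scheduled.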



Query Block
• Most optimizers operate on individual query blocks
• A query block is an SQL query with no nesting
– Exactly one
• SELECT clause
• FROM clause
– At most one
• WHERE clause
• GROUP BY clause
• HAVING clause
Typical Plan for Block (1/2)
SELECT-PROJECT-JOIN query:

...
π fields
  σ selection condition
    ⋈ join condition
      ⋈ join condition
        R
        S
      …

Typical Plan For Block (2/2)
σ having condition
  γ fields, sum/count/min/max(fields)
    π fields
      σ selection condition
        ⋈ join condition
          …
          …

How about Subqueries?
SELECT Q.name
FROM Person Q
WHERE Q.age > 25
AND NOT EXISTS (
SELECT *
FROM Purchase P
WHERE P.buyer = Q.name
AND P.price > 100 )



How about Subqueries?
SELECT Q.name
FROM Person Q
WHERE Q.age > 25
AND NOT EXISTS (
SELECT *
FROM Purchase P
WHERE P.buyer = Q.name
AND P.price > 100 )

Plan (set difference of two branches):

−
  π name
    σ age>25
      Person
  π name
    σ price>100
      ⋈ buyer=name
        Purchase
        Person
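
The difference-based plan can be compared against the nested query on a small database. This is a sketch under stated assumptions: sqlite3 is used as a stand-in DBMS, the sample rows are invented, and name is taken to be a key of Person (EXCEPT removes duplicates, so the two forms could differ if names repeated).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person(name TEXT PRIMARY KEY, age INTEGER);
CREATE TABLE Purchase(buyer TEXT, price REAL);
-- Hypothetical sample rows, invented for this example.
INSERT INTO Person VALUES ('Ann', 30), ('Bob', 40), ('Cy', 20);
INSERT INTO Purchase VALUES ('Ann', 150), ('Bob', 50), ('Cy', 200);
""")

# Nested form: correlated NOT EXISTS subquery.
nested = conn.execute("""
SELECT Q.name
FROM Person Q
WHERE Q.age > 25
  AND NOT EXISTS (SELECT * FROM Purchase P
                  WHERE P.buyer = Q.name AND P.price > 100)
""").fetchall()

# Unnested form: people over 25, minus people with a purchase over 100.
unnested = conn.execute("""
SELECT name FROM Person WHERE age > 25
EXCEPT
SELECT Q.name FROM Purchase P, Person Q
WHERE P.buyer = Q.name AND P.price > 100
""").fetchall()

print(nested, unnested)
```

Each branch of the EXCEPT is a single query block, which is the form most optimizers work on.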


Physical Query Plan
• Logical query plan with extra annotations

• Access path selection for each relation


– Use a file scan or use an index

• Implementation choice for each operator

• Scheduling decisions for operators
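
Many systems can display the access path they chose. As one concrete illustration, SQLite offers the EXPLAIN QUERY PLAN statement; the sketch below (SQLite via Python's sqlite3 module is an assumption, as are the index name and the helper access_path) shows the choice for the same query changing from a file scan to an index lookup once an index on pno exists. The exact wording of the plan text differs between SQLite versions, so the example only prints it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Supplies(sno INTEGER, pno INTEGER, price REAL)")
query = "SELECT sno FROM Supplies WHERE pno = 2"


def access_path(connection, sql):
    """Return SQLite's textual plan; the last column of each row is the detail."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[-1] for row in rows)


# Without an index, the only access path is a scan of the whole table.
before = access_path(conn, query)

# With an index on pno, the optimizer can pick an index lookup instead.
conn.execute("CREATE INDEX idx_supplies_pno ON Supplies(pno)")
after = access_path(conn, query)

print("before:", before)
print("after: ", after)
```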


Physical Query Plan
π sname (on the fly)
  σ scity='Seattle' ∧ sstate='WA' ∧ pno=2 (on the fly)
    ⋈ sno = sno (nested loop)
      Supplier (file scan)
      Supplies (file scan)

Final Step in Query Processing
• Step 4: Query execution
– How to synchronize operators?
– How to pass data between operators?

• Approach: iterator interface, with either
– Pipelined execution, or
– Intermediate result materialization



Iterator Interface
• Each operator implements iterator interface
• Interface has only three methods
• open()
– Initializes operator state
– Sets parameters such as selection condition
• get_next()
– Operator invokes get_next() recursively on its inputs
– Performs processing and produces an output tuple
• close()
– Cleans up operator state
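
The three-method interface can be sketched in a few small classes. This is an illustrative sketch, not the implementation of any particular DBMS: the class names, the use of None as the end-of-input signal, and the sample rows are all assumptions made for this example.

```python
class Scan:
    """Leaf operator: returns the rows of an in-memory table one at a time."""
    def __init__(self, rows):
        self.rows, self.pos = rows, 0
    def open(self):
        self.pos = 0
    def get_next(self):
        if self.pos >= len(self.rows):
            return None  # end of input
        row = self.rows[self.pos]
        self.pos += 1
        return row
    def close(self):
        self.pos = 0

class Select:
    """Pulls rows from its child until one satisfies the predicate."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def open(self):
        self.child.open()
    def get_next(self):
        row = self.child.get_next()
        while row is not None and not self.pred(row):
            row = self.child.get_next()
        return row
    def close(self):
        self.child.close()

class Project:
    """Keeps only the listed fields of each row."""
    def __init__(self, child, fields):
        self.child, self.fields = child, fields
    def open(self):
        self.child.open()
    def get_next(self):
        row = self.child.get_next()
        return None if row is None else {f: row[f] for f in self.fields}
    def close(self):
        self.child.close()

class NestedLoopJoin:
    """For each outer row, rescans the inner input looking for matches."""
    def __init__(self, outer, inner, cond):
        self.outer, self.inner, self.cond = outer, inner, cond
        self.current = None
    def open(self):
        self.outer.open()
        self.inner.open()
        self.current = self.outer.get_next()
    def get_next(self):
        while self.current is not None:
            row = self.inner.get_next()
            while row is not None:
                if self.cond(self.current, row):
                    return {**self.current, **row}
                row = self.inner.get_next()
            # Inner input exhausted: rewind it and advance the outer row.
            self.inner.close()
            self.inner.open()
            self.current = self.outer.get_next()
        return None
    def close(self):
        self.outer.close()
        self.inner.close()

# Hypothetical sample data, invented for this example.
supplier = [{"sno": 1, "sname": "Acme", "scity": "Seattle", "sstate": "WA"},
            {"sno": 2, "sname": "Bolt", "scity": "Seattle", "sstate": "WA"},
            {"sno": 3, "sname": "Cog", "scity": "Portland", "sstate": "OR"}]
supplies = [{"sno": 1, "pno": 2}, {"sno": 2, "pno": 5}, {"sno": 3, "pno": 2}]

plan = Project(
    Select(NestedLoopJoin(Scan(supplier), Scan(supplies),
                          lambda s, u: s["sno"] == u["sno"]),
           lambda t: t["scity"] == "Seattle" and t["sstate"] == "WA"
           and t["pno"] == 2),
    ["sname"])

plan.open()
result = []
row = plan.get_next()
while row is not None:  # the root is pulled until it reports end of input
    result.append(row)
    row = plan.get_next()
plan.close()
print(result)
```

Only the root is called by the driver loop; every other get_next() call happens inside an operator, on its own inputs.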
Pipelined Execution
• Applies parent operator to tuples directly as they are produced by child operators
• Benefits
– No operator synchronization issues
– Saves cost of writing intermediate data to disk
– Saves cost of reading intermediate data from disk
– Good resource utilization on a single processor
• This approach is used whenever possible



Pipelined Execution
π sname (on the fly)
  σ scity='Seattle' ∧ sstate='WA' ∧ pno=2 (on the fly)
    ⋈ sno = sno (nested loop)
      Supplier (file scan)
      Supplies (file scan)

Intermediate Tuple Materialization
• Writes the results of an operator to an intermediate table on disk
• No direct benefit, but necessary for some operator implementations
– When an operator needs to examine the same tuples multiple times



Intermediate Tuple Materialization

π sname (on the fly)
  ⋈ sno = sno (sort-merge join)
    σ scity='Seattle' ∧ sstate='WA' (scan: write to T1)
      Supplier (file scan)
    σ pno=2 (scan: write to T2)
      Supplies (file scan)
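
A sort-merge join shows why materialization is sometimes unavoidable: sorting cannot emit its first tuple until it has seen its whole input. The sketch below is a simplified illustration, assuming Python lists stand in for the on-disk intermediate tables T1 and T2; the function name and sample rows are hypothetical.

```python
def sort_merge_join(left, right, key):
    """Join two inputs on `key` by sorting both and merging the sorted runs."""
    # Sorting needs the complete input, so both sides are fully materialized
    # here (in a real system these would be the on-disk tables T1 and T2).
    t1 = sorted(left, key=lambda row: row[key])
    t2 = sorted(right, key=lambda row: row[key])
    out, i, j = [], 0, 0
    while i < len(t1) and j < len(t2):
        if t1[i][key] < t2[j][key]:
            i += 1
        elif t1[i][key] > t2[j][key]:
            j += 1
        else:
            value = t1[i][key]
            # Find the run of t2 rows sharing this key value.
            j_end = j
            while j_end < len(t2) and t2[j_end][key] == value:
                j_end += 1
            # Each matching t1 row re-reads that run: the same tuples are
            # examined multiple times, which is why they must be stored.
            while i < len(t1) and t1[i][key] == value:
                for row in t2[j:j_end]:
                    out.append({**t1[i], **row})
                i += 1
            j = j_end
    return out


# Hypothetical sample data, invented for this example.
supplier = [{"sno": 3, "sname": "Cog", "scity": "Portland", "sstate": "OR"},
            {"sno": 1, "sname": "Acme", "scity": "Seattle", "sstate": "WA"},
            {"sno": 2, "sname": "Bolt", "scity": "Seattle", "sstate": "WA"}]
supplies = [{"sno": 3, "pno": 2}, {"sno": 2, "pno": 5}, {"sno": 1, "pno": 2}]

# Selections are applied first, as in the plan; their outputs feed the join.
seattle = [s for s in supplier if s["scity"] == "Seattle" and s["sstate"] == "WA"]
part2 = [u for u in supplies if u["pno"] == 2]

joined = sort_merge_join(seattle, part2, "sno")
names = [{"sname": row["sname"]} for row in joined]  # final projection
print(names)
```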
Next Time
• Algorithms for physical op. implementations

• How to find a good query plan?
