Query Processing

Database System Concepts - 7th Edition 15.1 ©Silberschatz, Korth and Sudarshan
Query Processing

Silberschatz, Korth, Sudarshan,

Database System Concepts,
7th edition, 2011

Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
We mainly focus on the optimization phase

Basic Steps in Query Processing (cont.)
• Parser and translator
  • Translates the (SQL) query into relational algebra
  • The parser checks syntax (e.g., correct relation and operator names)
• Evaluation engine
  • The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query
• Optimizer (in a nutshell; more details in the next slides)
  • Chooses the most efficient implementation to execute the query
  • Produces equivalent relational algebra expressions
  • Annotates them with instructions (algorithms): query execution plan (QEP)
  • Estimates the cost of each equivalent QEP, according to a given cost model
  • Chooses the “best” QEP

Basic Steps: Optimization
• 1st level of optimization: an SQL query has many equivalent relational algebra expressions
  • $\sigma_{salary<75000}(\Pi_{salary}(instructor))$ and $\Pi_{salary}(\sigma_{salary<75000}(instructor))$ are equivalent
  • They both correspond to
    SELECT salary
    FROM instructor
    WHERE salary < 75000
• 2nd level of optimization: a relational algebra operation can be evaluated using one of several different algorithms
  • e.g., block nested-loop join vs. merge join; file scan vs. index scan
• Input of optimization: a query in the form of an algebra expression
• Output of optimization: the “best” annotated relational algebra expression, specifying a detailed evaluation strategy (query evaluation plan or query execution plan, QEP) that answers the input query
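
The equivalence of the two algebra expressions for the salary query can be checked on a toy relation. The sketch below is only an illustration: the rows of `instructor` and the helper names `select` and `project` are assumptions made for this example, and projection is given set semantics (duplicates removed), as in relational algebra.

```python
# Toy relation; the rows are made-up illustration data.
instructor = [
    {"name": "Srinivasan", "salary": 65000},
    {"name": "Wu", "salary": 90000},
    {"name": "Mozart", "salary": 40000},
    {"name": "Einstein", "salary": 95000},
]


def select(rows, predicate):
    """Relational selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, attributes):
    """Relational projection with set semantics (duplicates are dropped)."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def below_75000(row):
    return row["salary"] < 75000


# Project first, then select.
plan_a = select(project(instructor, ["salary"]), below_75000)
# Select first, then project.
plan_b = project(select(instructor, below_75000), ["salary"])

print(plan_a == plan_b)  # True
```

Both plans return the same tuples, but they do different amounts of work on the way there, which is the reason an optimizer has something to choose from.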

Basic Steps: Optimization (Cont.)
• Different query evaluation plans have different costs
  • The user is not expected to specify least-cost plans
• Query optimization: among all equivalent QEPs, choose the one with the lowest cost
  • Cost is estimated using statistical information from the database catalog
    • number of tuples in relations, tuple sizes, number of distinct values for a given attribute, etc.
• We study... (Chapter 15⋆: evaluation of QEPs)
  • How to measure query costs (establish a cost model)
  • Algorithms for evaluating relational algebra operations and their cost
  • How to combine algorithms for individual operations in order to evaluate a complex expression (QEP)
• ... and (Chapter 16⋆: choosing the best QEP)
  • How to optimize queries, that is, how to find a QEP with the lowest estimated cost

⋆ Silberschatz, Korth, and Sudarshan, Database System Concepts, 7th ed.

How to measure query costs
(cost model)

These slides are a modified version of the slides provided with the book
Database System Concepts, 6th Ed.
(however, chapter numbering refers to the 7th Ed.)
©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use
The original version of the slides is available at: https://www.db-book.com/

Measures of Query Cost
Response time (wall-clock time needed to execute a plan) depends on several factors
• system configuration
  • amount of buffer dedicated in RAM (a.k.a. memory, main memory)
  • whether or not indices are (partially) stored permanently in the buffer
• runtime conditions
  • amount of free buffer at the time the plan is executed
  • content of the buffer at the time the plan is executed
  • parameters, embedded in queries, which are resolved at runtime only
    SELECT salary
    FROM instructor
    WHERE salary < $a
    where $a is a variable provided by the application (user)

Thus
1. cost models (like ours) focus on resource consumption rather than response time (optimizers minimize resource consumption rather than response time)
2. different optimizers may make different assumptions (parameters): every theoretical analysis must be recast with the actual parameters used by the concrete system (optimizer) to which the analysis is going to be applied
Measures of Query Cost (Cont.)
• Query cost (total elapsed time for answering a query) is measured in terms of different resources
  • disk access (I/O operations on disk)
  • CPU usage
  • (network communication for distributed DBMSs; later in this course)
• Typically disk access is the predominant cost, and is also relatively easy to estimate. It is measured by taking into account
  • Number of seeks (number of random I/O accesses)
  • Number of blocks read
  • Number of blocks written
    • Writing a block is generally assumed to cost twice as much as reading one (data is read back after being written to ensure the write was successful)
VERY IMPORTANT!!!
- “disk” refers to the permanent drive for file storage, hard disk, secondary memory, permanent memory
- “memory” refers to volatile storage for data, RAM, main memory, buffer
Within each line, these terms are all used as synonyms

This has so far been the accepted choice for measuring query costs (cost model).
New technologies, namely faster drives (solid-state drives, SSDs) and cheaper (thus bigger) RAM, might lead to different cost models (e.g., based also on CPU usage or RAM I/O operations)
Measures of Query Cost (Cont.)
• We ignore the difference between writing and reading; we just consider
  • $t_S$: time for one seek
  • $t_T$: time to transfer one block
  • Example: cost for $b$ block transfers plus $S$ seeks
    $b \cdot t_T + S \cdot t_S$
• Values of $t_T$ and $t_S$ must be calibrated for the specific disk system
  • Typical values (2018): $t_S$ = 4 ms, $t_T$ = 0.1 ms
  • Some DBMSs perform seeks and block transfers during installation to estimate average values
• We ignore CPU costs for simplicity
  • Real systems usually do take CPU cost into account
• We do not include the cost of writing output to disk in our cost formulae
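
The cost model above fits in one small function. This is a minimal sketch: the function name `io_cost_ms` and the sample workload are assumptions made for illustration, and the defaults are the typical 2018 values quoted on the slide, which a real system would calibrate itself.

```python
T_SEEK_MS = 4.0      # t_S: time for one seek (typical value quoted above)
T_TRANSFER_MS = 0.1  # t_T: time to transfer one block (typical value quoted above)


def io_cost_ms(block_transfers, seeks, t_transfer=T_TRANSFER_MS, t_seek=T_SEEK_MS):
    """Estimated disk cost b * t_T + S * t_S, in milliseconds.

    CPU cost and the cost of writing the output are ignored,
    as in the model described above.
    """
    return block_transfers * t_transfer + seeks * t_seek


# Hypothetical workload: 100 block transfers and 2 seeks.
print(f"{io_cost_ms(100, 2):.1f} ms")  # 18.0 ms
```

With these values one seek costs as much as 40 block transfers, so plans that avoid random accesses tend to win.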

Algorithms for evaluating relational
algebra operations

Physical organization of records

• At the physical level, records are stored (on permanent disks) in files (managed and organized by the filesystem)
• We assume files are organized according to sequential file organization
  • i.e., a file is stored in contiguous blocks, with records ordered according to some attribute(s), not necessarily the primary key
• Other file organization techniques exist (e.g., B+-tree file organization), leading to different formulas for cost estimates

Selection Operation
• File scan (relation scan without indices)
  PROs: can be applied to any file, regardless of its ordering, availability of indices, nature of the selection operation, etc.
  CONs: it is slow
• Algorithm A1 (linear search). Retrieve and scan each file block and test all records to see whether they satisfy the selection condition
  • $b_r$ denotes the number of blocks containing records from relation $r$
  • Cost estimate? (selection on a generic, non-key attribute)
    • cost = $b_r$ block transfers + 1 seek = $t_S + b_r \cdot t_T$
    We assume blocks are stored contiguously, so 1 seek operation is enough (the disk head does not need to move to seek the next block)
  • Selection on a key attribute. Cost estimate?
    • stop on finding the record
    • cost (on average) = $(b_r / 2)$ block transfers + 1 seek = $t_S + (b_r / 2) \cdot t_T$
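
The two A1 estimates can be compared numerically. The sketch below assumes the typical values $t_S$ = 4 ms and $t_T$ = 0.1 ms and a hypothetical relation of 10,000 blocks; the function name `linear_scan_cost_ms` is made up for this example.

```python
T_SEEK_MS = 4.0      # t_S, typical value quoted in these slides
T_TRANSFER_MS = 0.1  # t_T, typical value quoted in these slides


def linear_scan_cost_ms(blocks_in_relation, key_equality=False):
    """A1 cost: one seek plus b_r transfers.

    For an equality test on a key the scan stops at the first match,
    so on average only b_r / 2 blocks are transferred.
    """
    transfers = blocks_in_relation / 2 if key_equality else blocks_in_relation
    return T_SEEK_MS + transfers * T_TRANSFER_MS


full_scan = linear_scan_cost_ms(10_000)                     # hypothetical b_r
early_stop = linear_scan_cost_ms(10_000, key_equality=True)
print(f"non-key: {full_scan:.1f} ms, key (average): {early_stop:.1f} ms")
# non-key: 1004.0 ms, key (average): 504.0 ms
```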

• Index scan (relation scan using an index)
  • the selection condition must be on the search key of the index
  • $h_i$: height of the B+-tree (number of accesses needed to traverse the index before accessing the data)
  • A2 (primary index, equality on key). Retrieve a single record that satisfies the corresponding equality condition. Cost?
    • cost = $(h_i + 1) \cdot (t_T + t_S)$
  • A3 (primary index, equality on nonkey). Retrieve multiple records. Cost?
    • Let $b$ = number of blocks containing matching records
    • Records will be on consecutive blocks
    • cost = $h_i \cdot (t_T + t_S) + t_S + t_T \cdot b$
    • There is a mistake in the 6th ed. of the book⋆ (Fig. 12.3): the “$t_S$” summand is omitted

⋆ Silberschatz, Korth, and Sudarshan, Database System Concepts, 6th ed.

• A4 (secondary index, equality on key). Cost?
  • Equal to A2
  • cost = $(h_i + 1) \cdot (t_T + t_S)$
• A4 (secondary index, equality on nonkey)
  • Retrieve multiple records. Cost?
  • each of the $n$ matching records may be on a different block
  • cost = $(h_i + n) \cdot (t_T + t_S)$
    – Can be very expensive! Can be worse than a file scan
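
The index-scan formulas for A2, A3 and A4 can be put side by side. This is a sketch under assumed numbers: the index height of 3, the 50 matching blocks and the 500 matching records are hypothetical, and the function names are chosen only for this example.

```python
T_SEEK_MS = 4.0      # t_S, typical value quoted in these slides
T_TRANSFER_MS = 0.1  # t_T, typical value quoted in these slides


def index_key_cost_ms(index_height):
    """A2, and A4 on a key: traverse h_i index levels, then fetch one data block."""
    return (index_height + 1) * (T_TRANSFER_MS + T_SEEK_MS)


def primary_index_nonkey_cost_ms(index_height, matching_blocks):
    """A3: traverse the index, then one seek and b consecutive block transfers."""
    traversal = index_height * (T_TRANSFER_MS + T_SEEK_MS)
    return traversal + T_SEEK_MS + T_TRANSFER_MS * matching_blocks


def secondary_index_nonkey_cost_ms(index_height, matching_records):
    """A4 on a nonkey: each of the n records may need its own seek and transfer."""
    return (index_height + matching_records) * (T_TRANSFER_MS + T_SEEK_MS)


HEIGHT = 3  # hypothetical B+-tree height
print(f"A2: {index_key_cost_ms(HEIGHT):.1f} ms")
print(f"A3, b = 50: {primary_index_nonkey_cost_ms(HEIGHT, 50):.1f} ms")
print(f"A4 nonkey, n = 500: {secondary_index_nonkey_cost_ms(HEIGHT, 500):.1f} ms")
```

In this example the secondary-index lookup for 500 records comes out roughly a hundred times more expensive than the primary-index lookup over 50 consecutive blocks, because every record pays a seek.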

• Selections of the form $\sigma_{A \le V}(r)$ or $\sigma_{A \ge V}(r)$ can be implemented by using
  • a linear file scan,
  • or indices, in the following ways:
• A5 (primary index, comparison)
  • $\sigma_{A \ge V}(r)$
    • use the index to find the first tuple $\ge v$ and scan the relation sequentially from there
    • RECALL: $b$ is the number of blocks containing matching records
    • Equal to A3: cost = $h_i \cdot (t_T + t_S) + t_S + t_T \cdot b$
  • $\sigma_{A \le V}(r)$
    • just scan the relation sequentially until the first tuple $> v$; do not use the index
    • Similar to A1 (file scan, equality on key): cost = $t_S + b \cdot t_T$

• A6 (secondary index, comparison). Cost?
  • For $\sigma_{A \ge V}(r)$, use the index to find the first index entry $\ge v$ and scan the index sequentially from there, to find pointers to records
  • For $\sigma_{A \le V}(r)$, just scan the leaf pages of the index, finding pointers to records, until the first entry $> v$
  • In either case, retrieve the records that are pointed to
    • requires an I/O for each record
    • Equal to A4, equality on nonkey: cost = $(h_i + n) \cdot (t_T + t_S)$
    • A linear file scan may be cheaper
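
How quickly a secondary index loses to a linear scan can be estimated from the two formulas. The sketch below assumes the typical values $t_S$ = 4 ms and $t_T$ = 0.1 ms, a hypothetical relation of 10,000 blocks and a hypothetical index height of 3; the helper `max_records_favoring_index` is made up for this example.

```python
T_SEEK_MS = 4.0      # t_S, typical value quoted in these slides
T_TRANSFER_MS = 0.1  # t_T, typical value quoted in these slides


def linear_scan_cost_ms(blocks_in_relation):
    """A1: one seek plus b_r block transfers."""
    return T_SEEK_MS + blocks_in_relation * T_TRANSFER_MS


def secondary_index_cost_ms(index_height, matching_records):
    """A6 (same as A4 on a nonkey): one seek and one transfer per record."""
    return (index_height + matching_records) * (T_TRANSFER_MS + T_SEEK_MS)


def max_records_favoring_index(index_height, blocks_in_relation):
    """Largest n for which the secondary index is strictly cheaper than A1."""
    scan_cost = linear_scan_cost_ms(blocks_in_relation)
    n = 0
    while secondary_index_cost_ms(index_height, n + 1) < scan_cost:
        n += 1
    return n


# Hypothetical relation of 10,000 blocks with an index of height 3.
print(max_records_favoring_index(3, 10_000))  # 241
```

Under these assumptions the index pays off only when at most 241 records match, however many records the 10,000 blocks actually hold.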

• A9 (conjunctive selection by intersection of identifiers)
  • Applies if the indices hold pointers to records (rather than the actual records); this is our assumption so far anyway
  • Scan the indices but do not access records; just collect sets of pointers (one per index)
  • Compute the intersection, and then access the records. Cost?
    • Cost: cost of scanning all indices plus cost of accessing the records
  • Optimization: sort the pointers in the intersection and then access the records in sorted order. Advantages:
    • no block is accessed twice (2 records in the same block are retrieved together)
    • some seek time is saved as blocks are transferred in sorted order (disk-arm movement is minimized)
• A10 (disjunctive selection by union of identifiers)
  • If ALL conditions can be checked through some index, then similar to A9
    • Scan the indices but do not access records; just collect sets of pointers (one per index)
    • Compute the union, and then access the records (in sorted order)
    • Cost: cost of scanning all indices plus cost of accessing the records
  • If even one condition has no associated index, then use A1 (linear scan)
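
The pointer handling in A9 and A10 can be sketched with plain sets. This is an illustration under assumptions, not how any particular DBMS does it: record pointers are modeled as hypothetical (block number, slot number) pairs, and the two pointer sets stand in for the results of two index scans.

```python
from itertools import groupby

# Hypothetical pointers returned by two index scans, as (block, slot) pairs.
pointers_from_first_index = {(7, 0), (2, 1), (7, 3), (5, 2)}
pointers_from_second_index = {(7, 3), (2, 1), (9, 0), (7, 0)}


def fetch_plan(pointer_sets, combine):
    """Combine the pointer sets, sort them, and group the slots by block.

    Sorting means each block is visited once, in increasing block order.
    """
    combined = combine(*pointer_sets)
    ordered = sorted(combined)  # sorts by block number, then by slot number
    return [
        (block, [slot for _, slot in group])
        for block, group in groupby(ordered, key=lambda pointer: pointer[0])
    ]


both = [pointers_from_first_index, pointers_from_second_index]
conjunctive = fetch_plan(both, set.intersection)  # A9
disjunctive = fetch_plan(both, set.union)         # A10

print(conjunctive)  # [(2, [1]), (7, [0, 3])]
print(disjunctive)  # [(2, [1]), (5, [2]), (7, [0, 3]), (9, [0])]
```

Block 7 holds two matching records and appears once in each plan, which is the "no block is accessed twice" advantage of accessing records in sorted order.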