DBMS Unit5 Lecture1

Unit 5 of the Database Management System covers query processing and optimization, detailing the steps involved in extracting data from a database, including parsing, optimization, and evaluation. It emphasizes the importance of query cost estimation, which is based on factors such as disk accesses and CPU time, and discusses various algorithms for executing queries efficiently. The document also outlines different methods for selection operations using indices and the impact of these methods on query performance.

Uploaded by

mhjbinisha12


Database Management System

Unit 5: Query Processing and Optimization

Lecture 1
Outline
• Query Processing
• Query Cost Estimation
• Query Operations
Query Processing
• Query processing refers to the range of activities involved in
extracting data from a database
• The activities include
• translation of queries in high-level database languages into expressions that
can be used at the physical level of the file system,
• a variety of query-optimizing transformations,
• and actual evaluation of queries
Query Processing …
• Query processing is a stepwise process: translation of the query into a form usable at the
physical level of the file system, query optimization, and actual execution of the query to
get the result
• It requires the basic concepts of relational algebra and file structure
• The actual updating and retrieval of data is performed through various “low-level” operations.
• Examples of such operations for a relational DBMS are relational algebra
operations such as project, join, select, Cartesian product, etc.
Basic Steps in Query Processing [1]
1. Parsing and translation
2. Optimization
3. Evaluation
Basic Steps in Query Processing [2]
• Parsing and translation
• Translate the query into its internal form and then into relational algebra
• Parser checks syntax and verifies relations
• Optimization
• Amongst all equivalent evaluation plans choose the one with lowest cost
• Cost is estimated using statistical information from the database catalog, such as
the number of tuples in each relation, size of tuples, etc.
• Evaluation
• The query-execution engine takes a query-evaluation plan, executes that plan, and
returns the answers to the query
Evaluation Plans [1]
• A relational algebra expression may have many equivalent expressions
• Consider a query
select salary
from instructor
where salary < 75000
This query can be translated into either of the following relational-algebra
expressions:
• E.g., σ_{salary<75000}(Π_{salary}(instructor)) is equivalent to
Π_{salary}(σ_{salary<75000}(instructor))
Evaluation Plans [2]
• Each relational algebra operation can be evaluated using one of several different
algorithms
• For example, to implement the preceding selection, every tuple in instructor
can be searched to find tuples with salary less than 75000
• If a B+ tree index is available on the attribute salary, the index can be used
instead to locate the tuples
• Correspondingly, a relational-algebra expression can be evaluated in many
ways
Evaluation Plans [3]
• Fully specifying how to evaluate a query requires both
• to specify the relational algebra expression and
• to annotate it with instructions specifying how to evaluate each operation
• Annotations may state the algorithm to be used for a specific
operation or the particular index or indices to use
• A relational-algebra operation annotated with instructions on how to
evaluate it is called an evaluation primitive
Evaluation Plans [4]
• A sequence of primitive operations that can be used
to evaluate a query is a query-execution plan or
query-evaluation plan
• Annotated expression specifying detailed evaluation
strategy
• E.g.:
• Use an index on salary to find instructors with
salary < 75000,
• Or perform complete relation scan and discard
instructors with salary ≥ 75000
Fig. A Query Evaluation plan
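An evaluation plan can be pictured as a tree of operations, each carrying its annotation. The following is a minimal sketch, not the representation any real DBMS uses: PlanNode, describe, and the annotation strings are hypothetical names chosen for the two example plans above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlanNode:
    operation: str                      # relational-algebra operation or relation name
    annotation: str                     # how to evaluate it (algorithm, index to use)
    child: Optional["PlanNode"] = None  # input of this operation, if any


def describe(node: PlanNode, depth: int = 0) -> List[str]:
    """Render the plan top-down, one operation per line."""
    lines = ["  " * depth + f"{node.operation} [{node.annotation}]"]
    if node.child is not None:
        lines.extend(describe(node.child, depth + 1))
    return lines


# Plan using an index on salary.
index_plan = PlanNode(
    "project salary", "keep only the salary attribute",
    child=PlanNode(
        "select salary < 75000", "use index on salary",
        child=PlanNode("instructor", "stored relation"),
    ),
)

# Plan using a complete relation scan.
scan_plan = PlanNode(
    "project salary", "keep only the salary attribute",
    child=PlanNode(
        "select salary < 75000", "linear scan, discard salary >= 75000",
        child=PlanNode("instructor", "stored relation"),
    ),
)

print("\n".join(describe(index_plan)))
print("\n".join(describe(scan_plan)))
```

The two plans share the same relational-algebra expression and differ only in the annotation on the selection.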
Basic Steps: Optimization
• Query Optimization:
• Amongst all equivalent evaluation plans, choose the one with lowest cost
• Cost is estimated using statistical information from the database catalog
• e.g. number of tuples in each relation, size of tuples, etc
• To Learn
• To measure query costs
• Algorithms for evaluating relational algebra operations
• To combine algorithms for individual operations in order to evaluate a complete expression
• To optimize queries: how to find an evaluation plan with lowest estimated cost
Measures of Query Cost
• Cost is generally measured as total elapsed time for answering query
• Many factors contribute to time cost
• disk accesses, CPU, or even network communication
• Typically disk access is the predominant cost, and is also relatively easy to
estimate.
• Measured by taking into account
• Number of seeks * average-seek-cost
• Number of blocks read * average-block-read-cost
• Number of blocks written * average-block-write-cost
• Cost to write a block is greater than cost to read a block
• data is read back after being written to ensure that the write was successful
Measures of Query Cost (Cont.)
• For simplicity we just use the number of block transfers from disk and the
number of seeks as the cost measures
• tT – time to transfer one block
• tS – time for one seek
• Cost for b block transfers plus S seeks
b * tT + S * tS
• We ignore CPU costs for simplicity
• Real systems do take CPU cost into account
• Cost of writing output to disk is not included in the cost formula
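The cost formula b * tT + S * tS can be written as a one-line function. This is a sketch under assumptions: query_cost is a hypothetical name, and the timing values (100 and 4000 microseconds) are illustrative inputs, expressed as integers to keep the arithmetic exact.

```python
def query_cost(block_transfers, seeks, t_T, t_S):
    """Estimated time for b block transfers plus S seeks: b * t_T + S * t_S.

    CPU cost and the cost of writing the output are ignored, as in the
    simplified model described above.
    """
    return block_transfers * t_T + seeks * t_S


# Illustrative values in microseconds: t_T = 100, t_S = 4000.
cost_us = query_cost(block_transfers=100, seeks=1, t_T=100, t_S=4000)
print(cost_us)  # 100 * 100 + 1 * 4000 = 14000 microseconds
```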
Measures of Query Cost (Cont.)
• tT – time to transfer one block
• tS – time for one seek
• tS and tT depend on where data is stored;
• with 4 KB blocks:
• High end magnetic disk: tS = 4 msec and tT = 0.1 msec
• SSD: tS = 20-90 microsec and tT = 2-10 microsec for 4KB
• Costs of algorithms depend on the size of the buffer in main memory, as having
more memory reduces need for disk access
• Thus memory size should be a parameter while estimating cost; often use worst case
estimates
• The cost estimate of algorithm A is referred to as EA
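To see why seeks matter so much, the device figures above can be plugged into the cost formula for a sequential read (one seek) and a scattered read (one seek per block) of the same number of blocks. This is a rough sketch: the block count of 1000 is an arbitrary choice, and the SSD row assumes the upper end of the quoted ranges as a worst case.

```python
# Times in microseconds for 4 KB blocks, taken from the figures above.
# The SSD entry uses the upper end of each range (an assumption).
DEVICES_US = {
    "magnetic disk": {"t_S": 4000, "t_T": 100},
    "SSD (upper end)": {"t_S": 90, "t_T": 10},
}


def cost_us(blocks, seeks, t_T, t_S):
    """b * t_T + S * t_S, in microseconds."""
    return blocks * t_T + seeks * t_S


BLOCKS = 1000  # arbitrary illustrative size

results = {}
for name, p in DEVICES_US.items():
    sequential = cost_us(BLOCKS, 1, p["t_T"], p["t_S"])      # one seek, then stream
    scattered = cost_us(BLOCKS, BLOCKS, p["t_T"], p["t_S"])  # one seek per block
    results[name] = (sequential, scattered)
    print(f"{name}: sequential {sequential} us, scattered {scattered} us")
```

On the magnetic disk the scattered read comes out roughly 40 times slower than the sequential one, and on the SSD roughly 10 times slower, so the number of seeks is often the term worth minimizing.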
Catalog Information for Cost Estimation
• nr : number of tuples in relation r.
• br : number of blocks containing tuples of r.
• sr : size of a tuple of r in bytes.
• fr : blocking factor of r — i.e., the number of tuples of r that fit into one block.
• V(A, r): number of distinct values that appear in r for attribute A;
same as the size of Π_A(r).
• SC(A, r): selection cardinality of attribute A of relation r; average number
of records that satisfy equality on A.
• If tuples of r are stored together physically in a file, then: br = ⌈nr / fr⌉
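The catalog quantities combine with simple arithmetic. The sketch below rests on stated assumptions: records do not span blocks (so fr is block size divided by sr, rounded down), values of A are uniformly distributed (so SC(A, r) is nr divided by V(A, r)), and all numbers are made up for illustration. The function names are hypothetical.

```python
import math


def blocking_factor(block_size, s_r):
    """f_r: tuples per block, assuming a tuple never spans two blocks."""
    return block_size // s_r


def blocks_needed(n_r, f_r):
    """b_r: blocks holding r when its tuples are stored together."""
    return math.ceil(n_r / f_r)


def selection_cardinality(n_r, distinct_values):
    """SC(A, r): average matches per value, assuming a uniform distribution."""
    return n_r / distinct_values


# Illustrative figures: 4096-byte blocks, 100-byte tuples, 10000 tuples,
# 50 distinct values of A.
f_r = blocking_factor(4096, 100)
b_r = blocks_needed(10000, f_r)
sc = selection_cardinality(10000, 50)
print(f_r, b_r, sc)
```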
Selection Operation
• File scan
• search algorithms that locate and retrieve records that fulfill a selection condition.
• Algorithm A1 (linear search)
• Scan each file block and test all records to see whether they satisfy the selection condition
• Cost estimate = br block transfers + 1 seek
Cost = br * tT + tS
• If selection is on a key attribute, can stop on finding record
• Average case, cost = (br/2) block transfers + 1 seek
Cost = (br/2) * tT + tS
• Linear search can be applied regardless of selection condition or ordering of
records in the file, or availability of indices
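Linear search can be sketched over a file modelled as a list of blocks, counting how many blocks are read. This is an illustration only: the file layout, the record format, and the linear_search name are assumptions, and the single initial seek is not modelled.

```python
def linear_search(blocks, predicate, on_key=False):
    """Scan blocks in order; return (matching records, blocks read).

    With on_key=True the scan stops at the first match, since a key
    value can appear in at most one record.
    """
    matches = []
    blocks_read = 0
    for block in blocks:
        blocks_read += 1  # one block transfer
        for record in block:
            if predicate(record):
                matches.append(record)
                if on_key:
                    return matches, blocks_read
    return matches, blocks_read


# A toy file of 4 blocks with 2 records each; records are (id, value) pairs.
blocks = [
    [(1, "a"), (2, "b")],
    [(3, "c"), (4, "d")],
    [(5, "e"), (6, "f")],
    [(7, "g"), (8, "h")],
]

full_scan = linear_search(blocks, lambda r: r[0] % 2 == 0)
key_lookup = linear_search(blocks, lambda r: r[0] == 3, on_key=True)
print(full_scan)   # reads all 4 blocks
print(key_lookup)  # stops after the second block
```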
Selections Using Indices
• Index scan – search algorithms that use an index
• selection condition must be on search-key of index.
• A2 (primary index, equality on key). Retrieve a single record that satisfies the corresponding
equality condition
• Cost = (hi + 1) * (tT + tS)
• Where, hi denotes the height of the index. Index lookup traverses the height of the tree plus
one I/O to fetch the record
• Each of the I/O operations requires a seek and a block transfer
• A3 (primary index, equality on nonkey) Retrieve multiple records.
• Records will be on consecutive blocks
• Let b = number of blocks containing matching records
• Cost = hi * (tT + tS) + tS + tT * b
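The A2 and A3 cost formulas translate directly into code. The function names are hypothetical, and the inputs (index height 3, tT = 100 and tS = 4000 microseconds, 5 matching blocks) are illustrative assumptions.

```python
def cost_a2(h_i, t_T, t_S):
    """Primary index, equality on key: (h_i + 1) * (t_T + t_S)."""
    return (h_i + 1) * (t_T + t_S)


def cost_a3(h_i, b, t_T, t_S):
    """Primary index, equality on nonkey: h_i * (t_T + t_S) + t_S + t_T * b.

    The b matching blocks are consecutive, so they cost one seek in total.
    """
    return h_i * (t_T + t_S) + t_S + t_T * b


T_T, T_S = 100, 4000  # microseconds, illustrative
print(cost_a2(3, T_T, T_S))     # 4 * 4100 = 16400
print(cost_a3(3, 5, T_T, T_S))  # 3 * 4100 + 4000 + 500 = 16800
```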
Selections Using Indices ..
• A4 (secondary index, equality on nonkey).
• Retrieve a single record if the search-key is a candidate key
• Cost = (hi + 1) * (tT + tS)
• This case is similar to primary index
• Retrieve multiple records if search-key is not a candidate key
• each of n matching records may be on a different block
• Cost = (hi + n) * (tT + tS)
• Can be very expensive!
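How expensive A4 can get is easiest to see by comparing it with a plain linear scan. The sketch below uses illustrative assumptions (index height 3, a relation of 250 blocks, tT = 100 and tS = 4000 microseconds) and hypothetical function names; it looks for the smallest number of matching records at which the secondary index loses to the scan.

```python
def cost_linear_scan(b_r, t_T, t_S):
    """A1: b_r block transfers plus one seek."""
    return b_r * t_T + t_S


def cost_a4_nonkey(h_i, n, t_T, t_S):
    """A4, search key not a candidate key: (h_i + n) * (t_T + t_S)."""
    return (h_i + n) * (t_T + t_S)


T_T, T_S = 100, 4000  # microseconds, illustrative
H_I, B_R = 3, 250     # index height and relation size in blocks, illustrative

scan = cost_linear_scan(B_R, T_T, T_S)

# Smallest n for which the index lookup costs more than the full scan.
break_even = next(
    n for n in range(1, 10_000) if cost_a4_nonkey(H_I, n, T_T, T_S) > scan
)
print(scan, break_even)
```

With these numbers the index already loses once five records match, because every match may cost its own seek.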
Selections Involving Comparisons
• Can implement selections of the form σ_{A≤V}(r) or σ_{A≥V}(r) by using
• a linear file scan,
• or by using indices in the following ways:
• A5 (primary index, comparison). (Relation is sorted on A)
• For σ_{A≥V}(r) use index to find first tuple ≥ v and scan relation sequentially from there
• For σ_{A≤V}(r) just scan relation sequentially till first tuple > v; do not use index
• Identical to the case of A3, equality on nonkey
• A6 (secondary index, comparison).
• For σ_{A≥V}(r) use index to find first index entry ≥ v and scan index sequentially from
there, to find pointers to records.
• For σ_{A≤V}(r) just scan leaf pages of index finding pointers to records, till first entry > v
• Identical to the case of A4, equality on nonkey
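The A5 strategy can be sketched on a relation that is sorted on A. Here a sorted Python list stands in for the sorted file and bisect_left stands in for the index lookup; both are simplifying assumptions, and the function names and salary values are made up.

```python
from bisect import bisect_left


def select_ge(sorted_values, v):
    """A >= v: locate the first value >= v, then scan sequentially from there."""
    start = bisect_left(sorted_values, v)  # plays the role of the index lookup
    return sorted_values[start:]


def select_le(sorted_values, v):
    """A <= v: scan from the beginning until the first value > v; no index used."""
    result = []
    for value in sorted_values:
        if value > v:
            break
        result.append(value)
    return result


salaries = [40000, 60000, 75000, 75000, 90000]  # sorted on the search attribute
print(select_ge(salaries, 75000))
print(select_le(salaries, 75000))
```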
Implementation of Complex Selections
• Conjunction: σ_{θ1 ∧ θ2 ∧ ... ∧ θn}(r)
• A7 (conjunctive selection using one index).
• Select a combination of θi and algorithms A1 through A6 that results in the least cost for
σ_{θi}(r).
• Test other conditions on tuple after fetching it into memory buffer.
• A8 (conjunctive selection using composite index).
• Use appropriate composite (multiple-key) index if available.
• A9 (conjunctive selection by intersection of identifiers).
• Requires indices with record pointers.
• Use corresponding index for each condition, and take intersection of all the obtained sets of
record pointers.
• Then fetch records from file
• If some conditions do not have appropriate indices, apply test in memory.
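A9 can be sketched with Python sets standing in for sets of record pointers. Everything here is an illustrative assumption: the records, the build_index helper, and the choice of which attributes are indexed (dept and salary, but not rank).

```python
from collections import defaultdict


def build_index(records, attribute):
    """Map each attribute value to the set of record identifiers holding it."""
    index = defaultdict(set)
    for rid, record in records.items():
        index[record[attribute]].add(rid)
    return index


# Record identifier -> record; invented data.
records = {
    0: {"dept": "CS", "salary": 60000, "rank": "assistant"},
    1: {"dept": "CS", "salary": 90000, "rank": "full"},
    2: {"dept": "EE", "salary": 60000, "rank": "assistant"},
    3: {"dept": "CS", "salary": 60000, "rank": "full"},
}
dept_index = build_index(records, "dept")
salary_index = build_index(records, "salary")

# Query: dept = 'CS' AND salary = 60000 AND rank = 'full'
# Step 1: intersect the identifier sets from the two available indices.
rids = dept_index["CS"] & salary_index[60000]
# Step 2: fetch only the records that survived the intersection.
fetched = [records[rid] for rid in sorted(rids)]
# Step 3: rank has no index, so test that condition in memory.
result = [r for r in fetched if r["rank"] == "full"]
print(sorted(rids), result)
```

Only two records are fetched from the file, although each index on its own matches three.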
Algorithms for Complex Selections
• Disjunction: σ_{θ1 ∨ θ2 ∨ ... ∨ θn}(r).
• A10 (disjunctive selection by union of identifiers).
• Applicable if all conditions have available indices.
• Otherwise use linear scan.
• Use corresponding index for each condition, and take union of all the obtained sets of record
pointers.
• Then fetch records from file
• Negation: σ_{¬θ}(r)
• Use linear scan on file
• If very few records satisfy ¬θ, and an index is applicable to θ
• Find satisfying records using index and fetch from file
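A10 and its fallback can be sketched in the same style. The code below is an assumption-laden illustration: conditions are restricted to equality tests, sets stand in for record pointers, and disjunctive_select, build_index, and the data are hypothetical.

```python
from collections import defaultdict


def build_index(records, attribute):
    """Map each attribute value to the set of record identifiers holding it."""
    index = defaultdict(set)
    for rid, record in records.items():
        index[record[attribute]].add(rid)
    return index


def disjunctive_select(records, indices, conditions):
    """conditions is a list of (attribute, value) equality tests joined by OR.

    Returns (sorted record identifiers, strategy used).
    """
    if all(attr in indices for attr, _ in conditions):
        rids = set()
        for attr, value in conditions:
            rids |= indices[attr].get(value, set())  # union of identifiers
        return sorted(rids), "union of identifiers"
    # At least one condition has no index, so every record must be examined anyway.
    rids = [
        rid for rid, rec in records.items()
        if any(rec[attr] == value for attr, value in conditions)
    ]
    return sorted(rids), "linear scan"


records = {
    0: {"dept": "CS", "salary": 60000, "rank": "assistant"},
    1: {"dept": "CS", "salary": 90000, "rank": "full"},
    2: {"dept": "EE", "salary": 60000, "rank": "assistant"},
    3: {"dept": "CS", "salary": 60000, "rank": "full"},
}
indices = {
    "dept": build_index(records, "dept"),
    "salary": build_index(records, "salary"),
}

indexed = disjunctive_select(records, indices, [("dept", "EE"), ("salary", 90000)])
unindexed = disjunctive_select(records, indices, [("dept", "EE"), ("rank", "full")])
print(indexed)
print(unindexed)
```

The second query falls back to a linear scan because rank has no index, which matches the rule that the union approach applies only when all conditions have available indices.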