DBMS Unit - 7

Uploaded by

Rudram Kshatri

Unit 7- Query Processing & Optimization

Subject Code: 303105203

Prof. S.W.Thakare
Assistant Professor,
Computer Science & Engineering
Topics
• Query processing
• Steps in query processing
• Measures of query cost
• Selection operation
• Evaluation of expressions
• Query optimization
• Transformation of relational expressions
• Cost-based optimization approach
Query Processing
• Query processing is the process of converting a high-level query (such as SQL) into a low-level form that the machine can understand and execute, so that it performs the action requested by the user.

• It is used to extract data from the database. Fetching the data takes three steps:

1. Parsing and Translation
2. Optimization
3. Evaluation
Steps in Query Processing
Figure: flow of a query through the system.
• SQL query → Parser and translator → Relational algebra expression
• Relational algebra expression → Optimizer → Execution plan
• Execution plan → Evaluation engine → Query result
• Parser: checks the syntax of the query and verifies attribute names and relation names.
• Translator: translates the query into its internal form (relational algebra).
• Optimizer: chooses the best execution plan, using the statistics about data held in the database catalog.
• Evaluation engine: executes the query-evaluation plan on the data and returns the output.
Step in Query Processing
1. Parsing and Translation:

• SQL is suitable for humans.
• Relational algebra is suitable for the system.
• The first step in query processing is to convert the SQL query into a relational algebra expression.
• Parsing (Parser):
  • Checks the syntax
  • Checks the schema elements (attribute names and relation names)
• Translation (Translator):
  • Parse tree → Relational algebra
Step in Query Processing
2. Optimization (Optimizer):

• Generates query evaluation plans for all the possible options.
• Selects the best query evaluation plan to evaluate the query.
• Query evaluation plan = query tree + algorithms (the algorithm chosen for each operation in the tree)

Step in Query Processing
3. Evaluation Engine:

• Evaluates the query plan selected by the optimizer and fetches the data from the database.
Measures of Query cost
• Query cost is the total time taken by a statement/query to execute and fetch data from the database.

• Some factors are:

• Communication cost:
  • Applicable to distributed/parallel systems.

• CPU cycles:
  • Difficult to calculate.
  • CPU speed improves at a much faster rate than disk speed.
Measures of Query cost
• Disk access:
  • Dominates the total time to execute a query.

• Disk access cost is measured by:
  • Number of seeks
  • Number of blocks read
  • Number of blocks written

Note: Generally, the cost of writing a block is greater than the cost of reading it.
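The three disk-access factors above can be combined into one estimate. The following is a minimal sketch, assuming the common textbook model in which each seek and each block transfer has a fixed average time. The function name and all timing values are illustrative assumptions, not figures from these slides.

```python
def disk_access_cost_ms(seeks, blocks_read, blocks_written=0,
                        seek_ms=4.0, read_ms=0.1, write_ms=0.2):
    """Estimated disk time in milliseconds for one operation.

    All timing defaults are assumed values for illustration; write_ms is
    set higher than read_ms to reflect the note that writing costs more.
    """
    return (seeks * seek_ms
            + blocks_read * read_ms
            + blocks_written * write_ms)


# Example: one seek followed by a sequential read of 100 blocks.
scan_cost = disk_access_cost_ms(seeks=1, blocks_read=100)
```

With these assumed timings a single seek costs as much as reading 40 blocks, which is why the number of seeks is counted separately from the number of blocks.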

Selection operation
• File scan: a search algorithm that locates and retrieves the records in a file that satisfy a selection condition.

• Symbol for the selection operator: σ (sigma)

• Syntax: σ condition (Relation)

• Searching algorithms for the selection operation:

1. Linear Search (A1)
2. Binary Search (A2)
Selection operation
1. Linear Search (A1):
• This algorithm scans all the blocks of the file and tests every record to determine whether it satisfies the selection condition.
• Cost(A1) = BR (worst case), where BR denotes the number of blocks in the file.
• If the condition is on a key (primary key) attribute, the system can stop searching as soon as the desired record is found.
• Cost(A1) = BR/2 (average case)
• If the condition is on a non-key attribute, multiple blocks may contain matching records, so the cost of scanning those blocks has to be added to the estimate.
• Linear search is generally slower than binary search, but it can be applied to any file.
Selection operation
2. Binary Search (A2):
• Applicable when the file (relation) is ordered on attribute A (primary index) and the selection condition is on A.
• Cost(A2) = log2(BR)

• This is faster than linear search.
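The two search algorithms can be compared by counting how many blocks each one reads. This is a rough sketch, assuming a file modelled as a list of non-empty blocks, each a sorted list of key values. The function names and the in-memory model are illustrative assumptions.

```python
def linear_search_blocks(blocks, key):
    """A1: read blocks in order; return (found, blocks_read)."""
    for blocks_read, block in enumerate(blocks, start=1):
        if key in block:
            return True, blocks_read   # key attribute: stop at first match
    return False, len(blocks)          # worst case: all BR blocks are read


def binary_search_blocks(blocks, key):
    """A2: binary search over a file ordered on the search attribute."""
    lo, hi, blocks_read = 0, len(blocks) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]
        blocks_read += 1               # one block access per probe
        if key < block[0]:
            hi = mid - 1
        elif key > block[-1]:
            lo = mid + 1
        else:
            return key in block, blocks_read
    return False, blocks_read


# A file of BR = 16 blocks holding keys 0..63, four keys per block.
FILE_BLOCKS = [[4 * i + j for j in range(4)] for i in range(16)]
linear_result = linear_search_blocks(FILE_BLOCKS, 63)
binary_result = binary_search_blocks(FILE_BLOCKS, 63)
```

Searching for the last key makes the linear scan read every block, while the binary search needs only about log2(BR) probes.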

Evaluation of expressions
• A query (expression) may contain multiple operations, which makes it more difficult to evaluate.
• To evaluate such a query, the operations have to be evaluated one by one in the proper order.
• There are two methods to evaluate an expression with multiple operations:
1. Materialization
2. Pipelining
• Figure: expression tree for ΠName ((σBalance<25000 (Account)) ⋈ Customer), executed from bottom to top. The selection σBalance<25000 on Account and the relation Customer are the leaves, the join is above them, and ΠName is at the root.
Materialization
• Materialization starts at the bottom of the expression and performs a single operation at a time.
• Each intermediate result is materialized (stored in a temporary relation) and then used as input to evaluate the next-level operation.
• The cost of materialization can be quite high, as the overall cost is computed as:
Overall cost = Sum of costs of individual operations + Cost of writing intermediate results to the disk
• Disadvantage of materialization:
  • Storing the intermediate results creates a lot of temporary relations.
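Materialized evaluation can be sketched on a small example. The sketch below assumes in-memory lists stand in for relations, uses the Account and Customer sample rows that appear elsewhere in these slides, and evaluates ΠName of the low-balance accounts joined with Customer. The function name and the "tuples written" counter are illustrative assumptions.

```python
ACCOUNT = [("A1", 30000), ("A2", 10000), ("A3", 20000), ("A4", 40000)]
CUSTOMER = [("CSE1", "A1", "Jay"), ("CSE2", "A2", "Abhi"),
            ("CSE3", "A3", "Parth"), ("CSE4", "A4", "Pratik")]


def evaluate_materialized():
    """Evaluate bottom-up, storing every intermediate result."""
    temporaries = []

    # Step 1 (bottom of the tree): selection Balance < 25000 on Account.
    selected = [a for a in ACCOUNT if a[1] < 25000]
    temporaries.append(selected)

    # Step 2: join the stored result with Customer on ANO.
    joined = [(c, a) for a in selected for c in CUSTOMER if c[1] == a[0]]
    temporaries.append(joined)

    # Step 3 (root): projection on Name gives the final result.
    names = [c[2] for c, _ in joined]

    # Tuples written to temporary relations stand in for the extra disk cost.
    tuples_written = sum(len(t) for t in temporaries)
    return names, tuples_written


names, tuples_written = evaluate_materialized()
```

Each step only starts once the previous temporary relation is complete, which is what makes the "cost of writing intermediate results" term appear in the overall cost.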
Pipelining
• In pipelining, the output of one operation is passed directly as input to the next operation, i.e. the operations form a queue.
• Because results are passed along the pipeline, the number of intermediate temporary relations is reduced.
• Performing operations in a pipeline eliminates the cost of writing and reading temporary relations.

• It can be executed in two ways:

• Demand driven (lazy evaluation)
• Producer driven (eager pipelining)
Pipelining (Contd.)
• Demand driven (lazy evaluation):
The system repeatedly requests tuples from the operation at the top of the pipeline.

• Producer driven (eager pipelining):
Operations do not wait for requests to produce tuples, but generate the tuples eagerly.
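Demand-driven pipelining maps naturally onto Python generators, where each operator produces a tuple only when the operator above asks for one. This is a minimal sketch under that assumption, using the sample Account and Customer rows from these slides. The operator names and the `scanned` log are illustrative. Producer-driven pipelining is not shown.

```python
ACCOUNT = [("A1", 30000), ("A2", 10000), ("A3", 20000), ("A4", 40000)]
CUSTOMER = [("CSE1", "A1", "Jay"), ("CSE2", "A2", "Abhi"),
            ("CSE3", "A3", "Parth"), ("CSE4", "A4", "Pratik")]

scanned = []  # records which Account rows have been read so far


def scan_account():
    for row in ACCOUNT:
        scanned.append(row[0])
        yield row


def select_low_balance(rows):
    for ano, balance in rows:
        if balance < 25000:
            yield (ano, balance)


def join_customer(account_rows):
    for ano, balance in account_rows:
        for cid, c_ano, name in CUSTOMER:
            if c_ano == ano:
                yield (cid, ano, name, balance)


def project_name(rows):
    for row in rows:
        yield row[2]


# Build the pipeline; nothing is read until the top operator is asked.
pipeline = project_name(join_customer(select_low_balance(scan_account())))

first_name = next(pipeline)           # request one tuple from the top
scanned_after_first = list(scanned)   # only the rows needed so far were read
remaining_names = list(pipeline)      # drain the rest of the pipeline
```

No temporary relation is built at any point: after the first request only accounts A1 and A2 have been scanned.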
Query Optimization
• Query optimization is the process of choosing the evaluation plan with the lowest cost from the multiple available plans.
• For example:

Customer
CID   ANO  Name
CSE1  A1   Jay
CSE2  A2   Abhi
CSE3  A3   Parth
CSE4  A4   Pratik

Account
ANO  Balance
A1   30000
A2   10000
A3   20000
A4   40000

Plan 1 (efficient plan): ΠName ((σBalance<25000 (Account)) ⋈ (Customer))
The join takes 2 records and 4 records as input.

Plan 2: ΠName (σBalance<25000 (Account ⋈ Customer))
The join takes 4 records and 4 records as input.
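The difference between the two plans above can be checked by running both on the sample data and recording how many records reach the join. This is a small sketch, assuming in-memory lists as relations and a join on ANO. The function names are illustrative.

```python
ACCOUNT = [("A1", 30000), ("A2", 10000), ("A3", 20000), ("A4", 40000)]
CUSTOMER = [("CSE1", "A1", "Jay"), ("CSE2", "A2", "Abhi"),
            ("CSE3", "A3", "Parth"), ("CSE4", "A4", "Pratik")]


def plan_select_first(account, customer):
    """Selection on Account first, then join, then projection on Name."""
    selected = [a for a in account if a[1] < 25000]
    joined = [(c, a) for a in selected for c in customer if c[1] == a[0]]
    names = [c[2] for c, _ in joined]
    return names, (len(selected), len(customer))


def plan_join_first(account, customer):
    """Join first, then selection on the join result, then projection."""
    joined = [(c, a) for a in account for c in customer if c[1] == a[0]]
    selected = [(c, a) for c, a in joined if a[1] < 25000]
    names = [c[2] for c, _ in selected]
    return names, (len(account), len(customer))


names_1, join_inputs_1 = plan_select_first(ACCOUNT, CUSTOMER)
names_2, join_inputs_2 = plan_join_first(ACCOUNT, CUSTOMER)
```

Both plans return the same names, but the first one feeds fewer records into the join, which is the more expensive operation.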
Query Optimization Approaches
• Cost Based Optimization (Exhaustive Search Optimization):
  • It first generates all possible plans and then selects the best plan from them.
  • It provides the best solution.

• Heuristic Based Optimization:

  • This technique is less expensive.
  • Some heuristic rules used to decide the optimized query execution plan are:
1. Perform selection as early as possible, to reduce the number of tuples.
2. Perform projection as early as possible, to reduce the number of attributes.
3. Perform the most restrictive selection and join operations (i.e. those with the smallest result size) before other similar operations.
Transformation of relational expressions
• Two relational algebra expressions are said to be equivalent if the two
expressions generate the same set of tuples on every legal database
instance.

• An equivalence rule says that expressions of two forms are equivalent.

• An expression of the first form can be replaced by an expression of the second form, or vice versa.
Transformation of relational expressions (Contd.)
• For example:

Customer
CID   ANO  Name
CSE1  A1   Jay
CSE2  A2   Abhi
CSE3  A3   Parth
CSE4  A4   Pratik

Account
ANO  Balance
A1   30000
A2   10000
A3   20000
A4   40000

• The following two expressions are equivalent:
ΠName ((σBalance<25000 (Account)) ⋈ (Customer))
ΠName (σBalance<25000 (Account ⋈ Customer))

• Both produce the same output:
Name
Abhi
Parth
Equivalence Rules
1. Conjunctive (combined) selection operations can be deconstructed into a sequence of individual selections. This is known as the cascade of σ.

σθ1∧θ2 (E) = σθ1 (σθ2 (E))

Example:

Customer
CID  ANO  Name    Balance
CS1  1    Jay     30000
CS2  2    Abhi    10000
CS3  3    Parth   20000
CS4  4    Pratik  40000

σANO<3 ∧ Balance<20000 (Customer) and σANO<3 (σBalance<20000 (Customer)) give the same output:

CID  ANO  Name  Balance
CS2  2    Abhi  10000
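The cascade of σ can be verified on the Customer table from this slide. The sketch below assumes relations are lists of dictionaries and that `select` is a plain filter. Both are illustrative choices, not a prescribed implementation.

```python
CUSTOMER = [
    {"CID": "CS1", "ANO": 1, "Name": "Jay", "Balance": 30000},
    {"CID": "CS2", "ANO": 2, "Name": "Abhi", "Balance": 10000},
    {"CID": "CS3", "ANO": 3, "Name": "Parth", "Balance": 20000},
    {"CID": "CS4", "ANO": 4, "Name": "Pratik", "Balance": 40000},
]


def select(predicate, relation):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def theta1(row):
    return row["ANO"] < 3


def theta2(row):
    return row["Balance"] < 20000


# One conjunctive selection ...
combined = select(lambda r: theta1(r) and theta2(r), CUSTOMER)
# ... versus a sequence of two individual selections.
cascaded = select(theta1, select(theta2, CUSTOMER))
```

Both forms return only the CS2 row, since it is the one row with ANO below 3 and a balance below 20000.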
Equivalence Rules
2. Selection operations are commutative.
σθ1 (σθ2 (E)) = σθ2 (σθ1 (E))

3. If an expression applies a sequence of projections, only the last (outermost) projection is required, so all the other projection operations can be omitted.
ΠL1 (ΠL2 (…(ΠLn (E))…)) = ΠL1 (E)

4. Selection operations can be combined with Cartesian products and theta joins.
(a) σθ (E1 × E2) = E1 ⋈θ E2
(b) σθ1 (E1 ⋈θ2 E2) = E1 ⋈θ1∧θ2 E2
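The rule that combines selection with Cartesian products and theta joins can be checked directly. This is a sketch under the assumption that relations are lists of dictionaries. The attribute names C_ANO and A_ANO are made-up renamings of ANO so that the two relations have no name clash.

```python
from itertools import product

CUSTOMER = [{"Name": "Jay", "C_ANO": "A1"}, {"Name": "Abhi", "C_ANO": "A2"}]
ACCOUNT = [{"A_ANO": "A1", "Balance": 30000},
           {"A_ANO": "A2", "Balance": 10000}]


def select(predicate, relation):
    return [row for row in relation if predicate(row)]


def cartesian(r1, r2):
    """E1 x E2: every pairing of a row of r1 with a row of r2."""
    return [{**a, **b} for a, b in product(r1, r2)]


def theta_join(r1, r2, theta):
    """Theta join: pair rows and keep the pairs that satisfy theta."""
    return [{**a, **b} for a in r1 for b in r2 if theta({**a, **b})]


def same_account(t):
    return t["C_ANO"] == t["A_ANO"]


def low_balance(t):
    return t["Balance"] < 25000


# Selection over a Cartesian product equals a theta join.
product_then_select = select(same_account, cartesian(CUSTOMER, ACCOUNT))
as_theta_join = theta_join(CUSTOMER, ACCOUNT, same_account)

# Selection over a theta join equals a join on the combined condition.
select_over_join = select(low_balance, as_theta_join)
merged_join = theta_join(CUSTOMER, ACCOUNT,
                         lambda t: low_balance(t) and same_account(t))
```

The theta-join form avoids building the full Cartesian product first, which is why an optimizer prefers it.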
Equivalence Rules
5. Theta-join operations (and natural joins) are commutative.
E1 ⋈θ E2 = E2 ⋈θ E1

6. Natural join operations are associative.
(E1 ⋈ E2) ⋈ E3 = E1 ⋈ (E2 ⋈ E3)
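Commutativity and associativity of the natural join can be tried out on small relations. This is a sketch assuming relations are lists of dictionaries. The third relation, NOMINEE, and its contents are hypothetical and exist only to give the associativity check three operands.

```python
CUSTOMER = [{"CID": "CSE1", "ANO": "A1", "Name": "Jay"},
            {"CID": "CSE2", "ANO": "A2", "Name": "Abhi"}]
ACCOUNT = [{"ANO": "A1", "Balance": 30000},
           {"ANO": "A2", "Balance": 10000}]
NOMINEE = [{"CID": "CSE1", "Nominee": "Ravi"}]  # hypothetical relation


def natural_join(r1, r2):
    """Pair rows that agree on every attribute name the inputs share."""
    out = []
    for a in r1:
        for b in r2:
            common = a.keys() & b.keys()
            if all(a[k] == b[k] for k in common):
                out.append({**a, **b})
    return out


def as_set(relation):
    """Compare relations as sets of rows, ignoring row and column order."""
    return {frozenset(row.items()) for row in relation}


# Commutativity of the join.
left_right = natural_join(CUSTOMER, ACCOUNT)
right_left = natural_join(ACCOUNT, CUSTOMER)

# Associativity of the natural join.
grouped_left = natural_join(natural_join(CUSTOMER, ACCOUNT), NOMINEE)
grouped_right = natural_join(CUSTOMER, natural_join(ACCOUNT, NOMINEE))
```

The results are compared as sets because the rules say the same tuples are produced, not that they come out in the same order.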
Equivalence Rules
7. The selection operation distributes over the theta join operation under the following two conditions:

(a) When all the attributes in θ0 involve only the attributes of one of the expressions (E1) being joined.
σθ0 (E1 ⋈θ E2) = (σθ0 (E1)) ⋈θ E2

(b) When θ1 involves only the attributes of E1 and θ2 involves only the attributes of E2.
σθ1∧θ2 (E1 ⋈θ E2) = (σθ1 (E1)) ⋈θ (σθ2 (E2))
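Case (a) is the rule an optimizer uses to push a selection below a join. The sketch below assumes a query tree encoded as nested tuples, with a condition carried as a pair of its text and the set of attributes it mentions. This encoding and the function names are illustrative assumptions.

```python
SCHEMAS = {"Account": {"ANO", "Balance"},
           "Customer": {"CID", "ANO", "Name"}}


def attrs_of(expr):
    """Attributes produced by an expression tree."""
    kind = expr[0]
    if kind == "rel":
        return set(SCHEMAS[expr[1]])
    if kind == "select":
        return attrs_of(expr[2])
    if kind == "join":
        return attrs_of(expr[1]) | attrs_of(expr[2])
    raise ValueError(f"unknown node type: {kind}")


def push_selection(expr):
    """Rewrite select(join(L, R)) when the condition uses one side only."""
    if expr[0] != "select" or expr[2][0] != "join":
        return expr
    _, cond, (_, left, right) = expr
    _, cond_attrs = cond
    if cond_attrs <= attrs_of(left):
        return ("join", ("select", cond, left), right)
    if cond_attrs <= attrs_of(right):
        # Same rule applied to the right input, using join commutativity.
        return ("join", left, ("select", cond, right))
    return expr  # the condition needs attributes from both sides


COND = ("Balance < 25000", frozenset({"Balance"}))
BEFORE = ("select", COND, ("join", ("rel", "Account"), ("rel", "Customer")))
AFTER = push_selection(BEFORE)
```

The rewrite only changes the shape of the tree. The selection then runs on Account alone, so fewer tuples reach the join.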
Equivalence Rules
8. The projection operation distributes over the theta join operation as follows:

(a) If θ involves only attributes from L1 ∪ L2:
ΠL1∪L2 (E1 ⋈θ E2) = (ΠL1 (E1)) ⋈θ (ΠL2 (E2))

(b) Consider a join E1 ⋈θ E2.
• Let L1 and L2 be sets of attributes from E1 and E2, respectively.
• Let L3 be attributes of E1 that are involved in join condition θ, but are not in L1 ∪ L2, and
• Let L4 be attributes of E2 that are involved in join condition θ, but are not in L1 ∪ L2.
ΠL1∪L2 (E1 ⋈θ E2) = ΠL1∪L2 ((ΠL1∪L3 (E1)) ⋈θ (ΠL2∪L4 (E2)))
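Case (b) can be illustrated with a join whose condition uses attributes that are not wanted in the final output. This is a sketch assuming relations are lists of dictionaries. The names C_ANO and A_ANO are made-up renamings of ANO, and the choice of L1 to L4 is only one example.

```python
CUSTOMER = [{"CID": "CSE1", "C_ANO": "A1", "Name": "Jay"},
            {"CID": "CSE2", "C_ANO": "A2", "Name": "Abhi"}]
ACCOUNT = [{"A_ANO": "A1", "Balance": 30000},
           {"A_ANO": "A2", "Balance": 10000}]


def project(attrs, relation):
    """Projection: keep only attrs and drop duplicate rows."""
    out = []
    for row in relation:
        kept = {a: row[a] for a in attrs}
        if kept not in out:
            out.append(kept)
    return out


def theta_join(r1, r2, theta):
    return [{**a, **b} for a in r1 for b in r2 if theta({**a, **b})]


def theta(t):
    return t["C_ANO"] == t["A_ANO"]


L1, L2 = ["Name"], ["Balance"]   # attributes wanted in the result
L3, L4 = ["C_ANO"], ["A_ANO"]    # attributes needed only by theta

# Left-hand side: join the full relations, then project.
lhs = project(L1 + L2, theta_join(CUSTOMER, ACCOUNT, theta))

# Right-hand side: project each input early, keeping the join attributes,
# then join, then drop the join attributes with a final projection.
rhs = project(L1 + L2, theta_join(project(L1 + L3, CUSTOMER),
                                  project(L2 + L4, ACCOUNT), theta))
```

The early projections drop CID before the join, so narrower tuples are carried through it, yet the final result is unchanged.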
Equivalence Rules
9. The set operations union and intersection are commutative.
E1 ∪ E2 = E2 ∪ E1
E1 ∩ E2 = E2 ∩ E1
Note: set difference is not commutative.

10. Set union and intersection are associative.
(E1 ∪ E2) ∪ E3 = E1 ∪ (E2 ∪ E3)
(E1 ∩ E2) ∩ E3 = E1 ∩ (E2 ∩ E3)

Equivalence Rules
11. The selection operation distributes over ∪, ∩ and −.

σθ (E1 − E2) = σθ (E1) − σθ (E2)
and similarly for ∪ and ∩ in place of −.

Also: σθ (E1 − E2) = σθ (E1) − E2
and similarly for ∩ in place of −, but not for ∪.

12. The projection operation distributes over union.

ΠL (E1 ∪ E2) = (ΠL (E1)) ∪ (ΠL (E2))
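Rules 9 to 12 concern set operations, so Python sets give a direct way to try them. This is a sketch assuming relations are sets of (ANO, Balance) tuples. The tuples beyond the slides' sample rows (A5) are made up for the example.

```python
E1 = {("A1", 30000), ("A2", 10000), ("A3", 20000)}
E2 = {("A2", 10000), ("A4", 40000)}
E3 = {("A3", 20000), ("A5", 5000)}   # A5 is a made-up row


def select(predicate, relation):
    return {t for t in relation if predicate(t)}


def project(positions, relation):
    return {tuple(t[i] for i in positions) for t in relation}


def theta(t):
    return t[1] < 25000


checks = {
    # Rule 9: union and intersection commute, difference does not.
    "union_commutes": (E1 | E2) == (E2 | E1),
    "intersection_commutes": (E1 & E2) == (E2 & E1),
    "difference_commutes": (E1 - E2) == (E2 - E1),
    # Rule 10: union and intersection are associative.
    "union_associative": ((E1 | E2) | E3) == (E1 | (E2 | E3)),
    "intersection_associative": ((E1 & E2) & E3) == (E1 & (E2 & E3)),
    # Rule 11: selection distributes over difference, in both forms.
    "select_over_difference":
        select(theta, E1 - E2) == select(theta, E1) - select(theta, E2),
    "select_left_only_difference":
        select(theta, E1 - E2) == select(theta, E1) - E2,
    # The left-only form does not hold for union.
    "select_left_only_union":
        select(theta, E1 | E2) == select(theta, E1) | E2,
    # Rule 12: projection distributes over union.
    "project_over_union":
        project((0,), E1 | E2) == project((0,), E1) | project((0,), E2),
}
```

Every entry is true except `difference_commutes` and `select_left_only_union`, matching the two exceptions the rules call out.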

Cost Based Optimization Approach
Query optimization is the process of choosing the most efficient or the most favourable way of executing an SQL statement.
It is the art and science of applying rules to rewrite the tree of operators that is invoked in a query so as to produce an optimal plan.
A plan is said to be optimal if it returns the answer in the least time or by using the least space.

Features of cost-based optimization:

• Cost-based optimization is based on the cost of the query that is to be optimized.
• The query can use a lot of paths based on the value of indexes, available sorting
Cost Based Optimization Approach
• The aim of query optimization is to choose the most efficient way of implementing the query, in the form of an algorithm, at the lowest possible cost.
• The query optimizer needs to estimate the cost of executing each algorithm so that the most suitable one can be selected for an operation. The cost of an algorithm also depends upon the cardinality of the input.
Cost Based Optimization Approach
• Cost Estimation:
To estimate the cost of the different available execution plans or execution strategies, the query tree is viewed and studied as a data structure that contains a series of basic operations which are linked in order to perform the query.
• The cost of each operation in the query depends on its selectivity, that is, the proportion of the input that forms the output.
• It is also important to know the expected cardinality of an operation's output.
• The cardinality of the output is very important because it forms the input to the next operation.
Cost Based Optimization Approach
The cost of optimization of the query depends upon the following:
1. Cardinality:
Cardinality is the number of rows returned by the operations specified in the query execution plan. The cardinality estimates must be accurate, as they strongly affect the choice among the possible execution plans.
2. Selectivity:
Selectivity refers to the proportion of rows that are selected. Whether a row of a table is selected depends on the condition: a row is selected only if it satisfies the condition.
3. Cost:
Cost refers to the estimated amount of resources, such as disk I/O, CPU time, and memory, that a plan needs in order to execute.
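Cardinality and selectivity are linked: the expected output cardinality of a selection is the input cardinality multiplied by the selectivity of its condition. This is a minimal sketch, assuming selectivity is measured on a sample of rows and then applied to a larger table. The function names and the table size of 1000 rows are illustrative assumptions.

```python
ACCOUNT_SAMPLE = [("A1", 30000), ("A2", 10000), ("A3", 20000), ("A4", 40000)]


def selectivity(relation, predicate):
    """Fraction of rows in the relation that satisfy the predicate."""
    if not relation:
        return 0.0
    matching = sum(1 for row in relation if predicate(row))
    return matching / len(relation)


def estimated_cardinality(input_rows, sel):
    """Expected number of output rows of a selection."""
    return round(input_rows * sel)


def low_balance(row):
    return row[1] < 25000


sel = selectivity(ACCOUNT_SAMPLE, low_balance)
# Assumed: the full Account table has 1000 rows with a similar distribution.
expected_rows = estimated_cardinality(1000, sel)
```

The estimate then becomes the input cardinality of the next operation in the plan, which is why an inaccurate selectivity affects every later step.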
Cost Based Optimization Approach
Cost Components of Query Execution:
The following are the cost components of the execution of a query:
1. Access cost to secondary storage:
This is the cost of searching for, reading, or writing data blocks that reside on secondary storage, especially on disk. The cost of searching for records in a file also depends upon the type of access structure that the file has.
2. Memory usage cost:
The cost of memory usage can be calculated from the number of memory buffers needed to execute the query.
3. Storage cost:
The storage cost is the cost of storing any intermediate files (files that result from processing the input but are not the final result) generated by the execution strategy for the query.
Cost Based Optimization Approach
4. Computational cost:
This is the cost of performing in-memory operations on the records within the data buffers, such as searching for records, merging records, or sorting records. This can also be called the CPU cost.

5. Communication cost:
This is the cost associated with sending the query and its results from one place to another. It also includes the cost of transferring tables and results between the various sites during query evaluation.
