Chapter 1 Query Processing and Optimization

Query processing and optimization

Uploaded by

mtaddis19

Advanced Database system 2019

Chapter 1
Query Processing and Optimization
Introduction
• This chapter discusses the techniques a DBMS uses to
process, optimize, and execute high-level queries.
• It covers the techniques used to split complex queries into multiple simple
operations, and the methods for implementing these low-level
operations.
• Query optimization techniques are used to choose an efficient
execution plan that minimizes the runtime as well as the use of
other resources, such as the number of disk I/Os and CPU time.
What is Query Processing?
• Query processing is the procedure of transforming a high-level SQL query into a correct
and efficient execution plan expressed in a low-level language.
• When a database system receives a query for update or retrieval of
information, the query goes through a series of compilation steps that
produce an execution plan.
• These steps fall into the following phases.
1. The first phase is the syntax checking phase: the system
parses the query and checks whether it follows the syntax rules.
It then matches the objects named in the query against the
views, tables, and columns listed in the system catalog. This phase is
divided into three steps: scanning, parsing, and validating.
A. Scanner: The scanner identifies the language tokens, such as
SQL keywords, attribute names, and relation names, in the
text of the query.

AUWC, School of Technology and Informatics Page 1

B. Parser: The parser checks the query syntax to determine
whether it is formulated according to the syntax rules of the
query language.
C. Validation: The query is validated by checking that all
attribute and relation names are valid and semantically
meaningful names in the schema of the particular database
being queried.

2. In the second phase, the SQL query is translated into an
algebraic expression using various rules. The
process of transforming a high-level SQL query into
relational algebraic form is called Query Decomposition.
The relational algebraic expression is then passed to the query
optimizer.
3. In the third phase, optimization is performed by substituting
equivalent expressions, depending on factors such as the
existence of certain database structures, whether or not a given
file is sorted, the presence of different indexes, and so on. The query
optimization module works in tandem with the join manager module
to improve the order in which joins are performed.
• At this stage, the cost model and several other
estimation formulas are used to rewrite the query.
• The modified query is written to use system resources so
as to give the best performance.
• The query optimizer then generates an action plan, also
called an execution plan. The action plan is converted
into query code that is finally executed by the runtime
database processor.
• Query Code Generator: It generates the code to execute
the plan.

• The cost of each candidate action plan is estimated by the optimizer, and the
optimal one is chosen for
execution. The runtime database processor has the task of running the query code,
whether in compiled or interpreted mode. If a runtime error
results, the runtime database processor generates an error message.

Figure 1: Steps in Processing a High-Level Query

What is the aim of query processing?

• To transform a query written in a high-level language, typically SQL,
into a correct and efficient execution strategy expressed in a low-level
language (implementing the relational algebra), and to execute the
strategy to retrieve the required data.
Query Analyzer

• The syntax analyzer takes the query from the user, splits it
into tokens, and analyzes the tokens (symbols) and their order
to make sure they follow the rules of the language grammar.
• If an error is found in the query submitted by the user, the query is rejected
and an error code, together with an explanation of why the query was
rejected, is returned to the user.
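The scanning, parsing, and validation steps can be illustrated with a small Python sketch. It is not how any particular DBMS is implemented: the grammar is deliberately tiny (SELECT columns FROM one relation, with an optional single comparison in WHERE), and the `CATALOG` dictionary, the token names, and the functions `scan`, `parse`, and `validate` are all hypothetical names chosen for illustration.

```python
import re

# Hypothetical system catalog: relation name -> set of attribute names.
CATALOG = {"EMPLOYEE": {"emp_id", "emp_nm", "emp_desg"}}
KEYWORDS = {"SELECT", "FROM", "WHERE"}

TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+)
  | (?P<STRING>'[^']*')
  | (?P<NAME>[A-Za-z_][A-Za-z_0-9]*)
  | (?P<OP><=|>=|<>|=|<|>)
  | (?P<PUNCT>[,*();.])
  | (?P<WS>\s+)
""", re.VERBOSE)


def scan(query):
    """Scanner: split the query text into (kind, text) tokens."""
    tokens, pos = [], 0
    while pos < len(query):
        m = TOKEN_RE.match(query, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {query[pos]!r} at position {pos}")
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind == "WS":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "NAME" and text.upper() in KEYWORDS:
            kind, text = "KEYWORD", text.upper()
        tokens.append((kind, text))
    return tokens


def parse(tokens):
    """Parser: accept only SELECT a[, b ...] FROM r [WHERE a op literal]."""
    pos = 0

    def expect(kind, text=None):
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of query")
        k, t = tokens[pos]
        if k != kind or (text is not None and t != text):
            raise SyntaxError(f"expected {text or kind}, found {t!r}")
        pos += 1
        return t

    expect("KEYWORD", "SELECT")
    columns = [expect("NAME")]
    while pos < len(tokens) and tokens[pos] == ("PUNCT", ","):
        pos += 1
        columns.append(expect("NAME"))
    expect("KEYWORD", "FROM")
    relation = expect("NAME")
    predicate = None
    if pos < len(tokens):
        expect("KEYWORD", "WHERE")
        attr, op = expect("NAME"), expect("OP")
        if pos >= len(tokens) or tokens[pos][0] not in ("NUMBER", "STRING"):
            raise SyntaxError("expected a literal after the comparison operator")
        predicate = (attr, op, tokens[pos][1])
        pos += 1
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos][1]!r}")
    return {"columns": columns, "relation": relation, "predicate": predicate}


def validate(tree, catalog=CATALOG):
    """Validation: every relation and attribute name must exist in the catalog."""
    rel = tree["relation"]
    if rel not in catalog:
        raise NameError(f"unknown relation {rel!r}")
    names = list(tree["columns"])
    if tree["predicate"]:
        names.append(tree["predicate"][0])
    for name in names:
        if name not in catalog[rel]:
            raise NameError(f"unknown attribute {name!r} in {rel}")
    return tree
```

Calling `validate(parse(scan(query)))` either returns the parsed structure or raises an error whose message plays the role of the explanation returned to the user.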
Query Decomposition
• Query decomposition aims to transform the
high-level query into a relational algebra query and to check
whether that query is syntactically and semantically correct.
Query decomposition thus starts with a high-level query and
transforms it into a query graph of low-level operations that satisfy the
query.
• The SQL query is decomposed into query blocks (low-level
operations), which form the basic units. Nested queries within a
query are therefore identified as separate query blocks.
• The query decomposer goes through five stages of processing
(query analysis, normalization, semantic analysis, simplification, and restructuring) for
decomposition into low-level operations and translation into algebraic
expressions.

Query Analysis

• During the query analysis phase, the query is syntactically analyzed
using programming language compiler techniques (a parser). A syntactically
legal query is then validated, using the system catalog, to ensure
that all data objects (relations and attributes) referred to by the
query are defined in the database.
• The type specification of the query qualifiers and result is also checked
at this stage.
• Example: SELECT emp_nm FROM EMPLOYEE WHERE emp_desg > 100
• This query will be rejected because the comparison "> 100" is
incompatible with the data type of emp_desg, which is a
variable-length character string.
• At the end of the query analysis phase, the high-level query (SQL) is
transformed into an internal representation that is more
suitable for processing. This internal representation is typically a
kind of query tree.
• A query tree is a tree data structure that corresponds to a relational algebra expression.
• A query tree is also called a relational algebra tree.
• Leaf nodes of the tree represent the base input relations
of the query.
• Internal nodes represent the result of applying an operation of the
algebra.
• The root of the tree represents the result of the query.
SELECT P.proj_no, P.dept_no, E.name, E.add, E.dob
FROM PROJECT P, DEPARTMENT D, EMPLOYEE E
WHERE P.dept_no = D.d_no AND D.mgr_id = E.emp_id AND
P.proj_loc = 'Mumbai';

• The three relations PROJECT, DEPARTMENT, and EMPLOYEE are
represented by the leaf nodes P, D, and E, while the relational algebra
operations of the expression are represented by internal tree nodes.
• The same SQL query can have many different relational algebra
expressions and hence many different query trees.
• The query parser typically generates a standard initial (canonical)
query tree.
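A query tree for the PROJECT / DEPARTMENT / EMPLOYEE query can be sketched as a small Python data structure. The sketch assumes the common textbook convention that the canonical tree applies a Cartesian product to the leaf relations, then one selection carrying the whole WHERE condition, then the projection. The `Node` class and the helper functions are illustrative names, not part of any DBMS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    op: str                          # "RELATION", "PRODUCT", "SELECT", or "PROJECT"
    arg: str = ""                    # relation name, predicate, or attribute list
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def canonical_tree():
    """Build the assumed canonical tree: product of leaves, then select, then project."""
    p = Node("RELATION", "PROJECT P")
    d = Node("RELATION", "DEPARTMENT D")
    e = Node("RELATION", "EMPLOYEE E")
    product = Node("PRODUCT", left=Node("PRODUCT", left=p, right=d), right=e)
    selection = Node(
        "SELECT",
        "P.dept_no = D.d_no AND D.mgr_id = E.emp_id AND P.proj_loc = 'Mumbai'",
        left=product,
    )
    return Node("PROJECT", "P.proj_no, P.dept_no, E.name, E.add, E.dob", left=selection)


def leaves(node):
    """Return the base relations (leaf nodes) from left to right."""
    if node is None:
        return []
    if node.op == "RELATION":
        return [node.arg]
    return leaves(node.left) + leaves(node.right)


def render(node, depth=0):
    """Return the tree as indented lines, root first."""
    if node is None:
        return []
    label = node.op + (f" [{node.arg}]" if node.arg else "")
    return ["  " * depth + label] + render(node.left, depth + 1) + render(node.right, depth + 1)


if __name__ == "__main__":
    print("\n".join(render(canonical_tree())))
```

The root is the projection (the query result), the internal nodes are the selection and the two products, and the leaves are the three base relations.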

Query Normalization

• The primary goal of the normalization phase is to avoid redundancy. The
normalization phase converts the query into a normalized form
that can be more easily manipulated.
• In the normalization phase, a set of equivalence (transformation) rules is applied so
that the projection and selection operations included in the query
are simplified to avoid redundancy.
• The projection operation corresponds to the SELECT clause of the SQL
query, and the selection operation corresponds to the predicate
found in the WHERE clause.

Semantic Analyzer

• The objective of this phase of query processing is to reduce the
number of predicates. The semantic analyzer rejects
normalized queries that are incorrectly formulated or contradictory.
• A query is incorrectly formulated if some of its components do not contribute to
the generation of the result. This happens when a join
specification is missing. A query is contradictory if its predicate cannot be satisfied
by any tuple in the relation.
• The semantic analyzer examines the relational calculus query (SQL) to
make sure it refers only to data objects (tables, columns, views,
indexes) that are defined in the database catalog. It also makes sure that
each object in the query is referenced correctly according to its data
type.
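One common textbook way to detect a missing join specification is to treat the relations of the query as graph nodes and the join predicates as edges: if the graph is not connected, some relation does not contribute to the result. The following Python sketch assumes that approach; the function name `is_connected` and the pair-based input format are illustrative choices.

```python
def is_connected(relations, join_predicates):
    """Return True if every relation is reachable through join predicates.

    relations: iterable of relation names.
    join_predicates: list of (relation_a, relation_b) pairs, one per join condition.
    """
    relations = list(relations)
    if not relations:
        return True
    adjacency = {r: set() for r in relations}
    for a, b in join_predicates:
        adjacency[a].add(b)
        adjacency[b].add(a)
    # Depth-first search starting from the first relation.
    seen, stack = {relations[0]}, [relations[0]]
    while stack:
        for neighbour in adjacency[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(relations)


# P.dept_no = D.d_no and D.mgr_id = E.emp_id link all three relations.
well_formed = is_connected(["P", "D", "E"], [("P", "D"), ("D", "E")])
# Dropping D.mgr_id = E.emp_id leaves E unconnected: a missing join specification.
missing_join = is_connected(["P", "D", "E"], [("P", "D")])
```

Here `well_formed` is True and `missing_join` is False, so the second query would be rejected as incorrectly formulated.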

Query Simplifier

• The objectives of this phase are:

• To detect redundant qualifications,
• To eliminate common sub-expressions, and
• To transform sub-graphs into semantically equivalent but
more easily and efficiently computed forms.
• Why simplify?
• Integrity constraints, view definitions, and access
restrictions are commonly introduced into the graph at this stage of analysis,
so the query must be simplified as much as possible.
• Integrity constraints define conditions that must hold for all
states of the database, so any query that contradicts an integrity
constraint is void and can be rejected without accessing
the database.
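As a minimal sketch of this idea, assume that both the integrity constraint and the query predicate restrict a single numeric attribute to a closed range. Comparing the two ranges is then enough to detect a contradiction (the query can be rejected without touching the data) or a redundant qualification (the predicate is already implied by the constraint). The `Range` class, the `classify` function, and the salary limits are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    low: float = -math.inf    # inclusive lower bound
    high: float = math.inf    # inclusive upper bound

    def intersect(self, other):
        return Range(max(self.low, other.low), min(self.high, other.high))

    def is_empty(self):
        return self.low > self.high

    def contains(self, other):
        return self.low <= other.low and other.high <= self.high


def classify(predicate, constraint):
    """Compare a query predicate with an integrity constraint on the same attribute."""
    if predicate.intersect(constraint).is_empty():
        return "contradictory"    # no tuple can satisfy both: reject the query
    if predicate.contains(constraint):
        return "redundant"        # the constraint already guarantees the predicate
    return "keep"


# Hypothetical constraint: 0 <= salary <= 100000 (whole-number salaries assumed).
SALARY_CONSTRAINT = Range(0, 100000)

negative_salary = classify(Range(high=-1), SALARY_CONSTRAINT)    # salary < 0
non_negative = classify(Range(low=0), SALARY_CONSTRAINT)         # salary >= 0
high_earners = classify(Range(low=50000), SALARY_CONSTRAINT)     # salary >= 50000
```

Real simplifiers work on general predicates over many attributes; the single-attribute range is only the smallest case that shows the reasoning.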

Query Restructuring

• In the final stage of query decomposition, the query can be
restructured to give a more efficient implementation. Transformation
rules are used to convert one relational algebra expression into an
equivalent form that is more efficient.
• The query can now be regarded as a relational algebra program,
consisting of a series of operations on relations.

Query Optimization

• The primary goal of query optimization is to choose an efficient
execution strategy for processing a query. The query optimizer
attempts to minimize the use of certain resources (mainly the number
of I/Os and CPU time) by selecting the best execution plan (access
plan).

• Query optimization starts during the validation phase, when the system
checks that the user has appropriate privileges. An action plan is
then generated to perform the query.
• The inputs to the query optimizer are:
• The relational algebra query tree generated by the query simplifier module
of the query decomposer.
• Estimation formulas used to determine the cardinality of the
intermediate result tables.
• A cost model.
• Statistical data from the database catalog.
• The output of the query optimizer is the execution plan in the form of an
optimized relational algebra query.
• A query typically has many possible execution strategies, and the
process of choosing a suitable one for processing a query is known as
query optimization.

The basic issues in Query Optimization

• How to use available indexes?

• How to use memory to accumulate information and perform intermediate
steps such as sorting?
• How to determine the order in which joins should be performed?

Objective of query optimization

• The term query optimization does not mean that the optimal
(best) strategy is always chosen as the execution plan. The result is just a reasonably efficient
strategy for executing the query.
• The decomposed query block of SQL is translated into an equivalent
extended relational algebra expression and then optimized.

Techniques for Query Optimization

1. The first technique is based on heuristic rules for ordering
the operations in a query execution strategy.
2. The second technique involves systematically estimating
the cost of the different execution strategies and
choosing the execution plan with the lowest cost.
3. The third technique is semantic query optimization. It is
used in combination with the heuristic query
transformation rules. It uses constraints specified on the
database schema, such as unique attributes and other more
complex constraints, to modify one query into
another query that is more efficient to execute.

Heuristic Rules

• Heuristic rules are used as an optimization technique to
modify the internal representation of a query. Usually, this
representation takes the form of a query tree or query graph data structure,
and the rules are applied to improve its performance.

• One of the main heuristic rules is to apply SELECT operations before
applying JOIN or other binary operations. This is because the
size of the file resulting from a binary operation such as JOIN is usually
a multiplicative function of the sizes of the input files.
• The main idea is to reduce the size of intermediate results. This includes
performing
• SELECT operations to reduce the number of tuples, and
• PROJECT operations to reduce the number of attributes.
• SELECT and PROJECT reduce the size of a file and hence
should be applied before a JOIN or other binary operation. A heuristic
query optimizer transforms the initial (canonical) query tree into a final
query tree using equivalence transformation rules. This final query
tree is efficient to execute.
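The rule "apply SELECT before JOIN" can be sketched as a tree rewrite in Python. The sketch makes a simplifying assumption: every selection predicate refers to exactly one relation, recorded in the `relation` field. The classes `Relation`, `Join`, and `Select` and the function `push_down` are illustrative and do not come from any particular optimizer.

```python
from dataclasses import dataclass


@dataclass
class Relation:
    name: str


@dataclass
class Join:
    left: object
    right: object
    condition: str


@dataclass
class Select:
    relation: str      # the single relation the predicate refers to (assumption)
    predicate: str
    child: object


def relations_in(node):
    """Return the set of base relation names below a node."""
    if isinstance(node, Relation):
        return {node.name}
    if isinstance(node, Select):
        return relations_in(node.child)
    return relations_in(node.left) | relations_in(node.right)


def push_down(node):
    """Move each single-relation selection below the joins above its relation."""
    if isinstance(node, Relation):
        return node
    if isinstance(node, Join):
        return Join(push_down(node.left), push_down(node.right), node.condition)
    # node is a Select: first optimize what lies beneath it.
    child = push_down(node.child)
    if isinstance(child, Join):
        if node.relation in relations_in(child.left):
            pushed = push_down(Select(node.relation, node.predicate, child.left))
            return Join(pushed, child.right, child.condition)
        if node.relation in relations_in(child.right):
            pushed = push_down(Select(node.relation, node.predicate, child.right))
            return Join(child.left, pushed, child.condition)
    return Select(node.relation, node.predicate, child)


# Initial tree: both selections sit above the join of Staff and Branch.
initial = Select(
    "Staff", "position = 'Manager'",
    Select(
        "Branch", "city = 'London'",
        Join(Relation("Staff"), Relation("Branch"), "Staff.branchNo = Branch.branchNo"),
    ),
)
final = push_down(initial)
```

In `final`, the join is at the root and each selection sits directly above its own relation, so the join works on the reduced inputs. Predicates that span several relations would need additional handling that this sketch leaves out.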
Example of query optimization: Identify all managers who work in a
London branch.
SQL:
SELECT * FROM Staff s, Branch b WHERE s.branchNo = b.branchNo AND
s.position = 'Manager' AND b.city = 'London';
This results in these equivalent relational algebra expressions:
1. σ (position='Manager') ∧ (city='London') ∧ (Staff.branchNo=Branch.branchNo) (Staff × Branch)
2. σ (position='Manager') ∧ (city='London') (Staff ⋈ Staff.branchNo=Branch.branchNo Branch)
3. [σ (position='Manager') (Staff)] ⋈ Staff.branchNo=Branch.branchNo [σ (city='London') (Branch)]
Assume:
• 1000 tuples in Staff
• 50 managers
• 50 tuples in Branch
• 5 London branches
• No indexes or sort keys
• All temporary results are written back to disk (memory is small)
• Tuples are accessed one at a time (not in blocks)
Query 1 (Bad)

• Requires (1000 + 50) disk accesses to read the Staff and Branch
relations
• Creates a temporary relation holding the Cartesian product (1000 * 50 tuples), which takes (1000 * 50) disk accesses to write
• Requires (1000 * 50) disk accesses to read the temporary relation and
test the predicate
Total Work = (1000 + 50) + 2 * (1000 * 50) = 101,050 I/O
operations

Query 2 (Better)

• Again requires (1000 + 50) disk accesses to read Staff and
Branch
• Joins Staff and Branch on branchNo, producing 1000 tuples (1
employee : 1 branch), which takes 1000 disk accesses to write
• Requires 1000 disk accesses to read the joined relation and check the
predicate
Total Work = (1000 + 50) + 2 * (1000) = 3,050 I/O operations
About 33 times fewer I/O operations than Query 1 (101,050 / 3,050 ≈ 33)
Query 3 (Best)

• Read the Staff relation to find the managers (1000 reads)
• Create a 50-tuple relation (50 writes)
• Read the Branch relation to find the London branches (50 reads)
• Create a 5-tuple relation (5 writes)
• Join the reduced relations and check the predicate (50 + 5 reads)
Total Work = 1000 + 2 * (50) + 5 + (50 + 5) = 1,160 I/O
operations
About 87 times fewer I/O operations than Query 1 (101,050 / 1,160 ≈ 87)
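The arithmetic for the three strategies can be checked with a few lines of Python. The functions below simply restate the counting rules of this example (one disk access per tuple, every temporary result written once and read once); their names are illustrative, and this is not a general cost model.

```python
STAFF, BRANCH, MANAGERS, LONDON = 1000, 50, 50, 5


def cost_query1(staff, branch):
    """Cartesian product first, then one selection over the product."""
    product = staff * branch
    return (staff + branch) + 2 * product        # read inputs, write + read product


def cost_query2(staff, branch):
    """Join first (one branch per staff member), then the selection."""
    joined = staff
    return (staff + branch) + 2 * joined         # read inputs, write + read join result


def cost_query3(staff, branch, managers, london):
    """Selections first, then join the two reduced relations."""
    reads = staff + branch                       # scan both base relations
    writes = managers + london                   # store the reduced relations
    join_reads = managers + london               # read them back for the join
    return reads + writes + join_reads


costs = {
    "Query 1": cost_query1(STAFF, BRANCH),
    "Query 2": cost_query2(STAFF, BRANCH),
    "Query 3": cost_query3(STAFF, BRANCH, MANAGERS, LONDON),
}
best = min(costs, key=costs.get)                 # strategy with the lowest cost
ratios = {name: costs["Query 1"] / cost for name, cost in costs.items()}
```

Picking the minimum of `costs` mirrors the cost-based technique described earlier: estimate each strategy, then choose the cheapest one.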
Summary of Heuristics for Algebraic Optimization:

1. The main heuristic is to apply first the operations that reduce the size
of intermediate results.
2. Perform select operations as early as possible to reduce the number of
tuples and perform project operations as early as possible to reduce
the number of attributes. (This is done by moving select and project
operations as far down the tree as possible.)
3. The select and join operations that are most restrictive should be
executed before other similar operations. (This is done by reordering
the leaf nodes of the tree among themselves and adjusting the rest of
the tree appropriately.)
