Chapter 1
Query Processing and Optimization
Introduction
In this chapter we shall discuss the techniques used by a DBMS to
process, optimize, and execute high-level queries.
The chapter also covers the techniques used to split complex queries
into multiple simple operations, and the methods for implementing
these low-level operations.
Query optimization techniques are used to choose an efficient
execution plan, one that minimizes runtime and the use of other
resources such as the number of disk I/Os and CPU time.
What is Query Processing?
Query processing is the procedure of transforming a high-level SQL
query into a correct and efficient execution plan expressed in a
low-level language.
When a database system receives a query for update or retrieval of
information, the query goes through a series of compilation steps
that produce an execution plan.
These steps are grouped into phases.
1. The first phase is the syntax checking phase: the system
parses the query and checks whether it follows the syntax rules.
It then matches the objects named in the query with the views,
tables, and columns listed in the system tables. This phase is
divided into three steps: scanning, parsing, and validating.
A. Scanner: The scanner identifies the language tokens such as
SQL Keywords, attribute names, and relation names in the
text of the query.
The syntax analyzer takes the query from the user, breaks it
into tokens, and analyzes the tokens (symbols) and their order
to make sure they follow the rules of the language grammar.
If an error is found in the query submitted by the user, the query is
rejected, and an error code together with an explanation of why the
query was rejected is returned to the user.
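The scanning and parsing steps above can be sketched in a few lines of
Python. This is a minimal illustration only, assuming a hypothetical
SELECT-only dialect with the grammar
SELECT columns FROM table [WHERE column op value]; the names scan,
parse, KEYWORDS, and TOKEN_RE are invented for this example and do not
come from any real DBMS.

```python
import re

# Hypothetical token classes for a tiny SELECT-only dialect (assumption).
KEYWORDS = {"SELECT", "FROM", "WHERE"}
TOKEN_RE = re.compile(
    r"(?P<ident>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<symbol>[,=<>])"
)


def scan(query):
    """Scanner: split the query text into (kind, text) tokens."""
    tokens, pos = [], 0
    while pos < len(query):
        if query[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(query, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {query[pos]!r} at position {pos}"
            )
        text = match.group()
        if match.lastgroup == "ident" and text.upper() in KEYWORDS:
            tokens.append(("keyword", text.upper()))
        else:
            tokens.append((match.lastgroup, text))
        pos = match.end()
    return tokens


def parse(tokens):
    """Parser: check the token order against the toy grammar."""
    pos = 0

    def expect(kind, text=None):
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError(f"unexpected end of query, expected {text or kind}")
        tok_kind, tok_text = tokens[pos]
        if tok_kind != kind or (text is not None and tok_text != text):
            raise SyntaxError(f"expected {text or kind}, found {tok_text!r}")
        pos += 1
        return tok_text

    expect("keyword", "SELECT")
    columns = [expect("ident")]
    while pos < len(tokens) and tokens[pos] == ("symbol", ","):
        pos += 1
        columns.append(expect("ident"))
    expect("keyword", "FROM")
    table = expect("ident")
    condition = None
    if pos < len(tokens):
        expect("keyword", "WHERE")
        left = expect("ident")
        op = expect("symbol")
        if op not in {"=", "<", ">"}:
            raise SyntaxError(f"expected a comparison operator, found {op!r}")
        if pos >= len(tokens) or tokens[pos][0] not in ("ident", "number"):
            raise SyntaxError("expected a column name or number after the operator")
        condition = (left, op, tokens[pos][1])
        pos += 1
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos][1]!r}")
    return {"columns": columns, "table": table, "where": condition}


# A well-formed query is accepted; a malformed one is rejected with a reason.
print(parse(scan("SELECT name, salary FROM employee WHERE salary > 1000")))
try:
    parse(scan("SELECT name employee"))
except SyntaxError as error:
    print("rejected:", error)
```

The validating step (matching names against the system tables) is not
shown here; it would need a catalog of known tables and columns.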
Query Decomposition
The aims of query decomposition are to transform the high-level
query into a relational algebra query and to check whether that
query is syntactically and semantically correct.
Thus query decomposition starts with a high-level query and
transforms it into a query graph of low-level operations that
satisfies the query.
The SQL query is decomposed into query blocks (low-level
operations), which form the basic unit. Hence nested queries within a
query are identified as separate query blocks.
The query decomposer goes through five stages of processing to
decompose the query into low-level operations and translate it into
algebraic expressions.
1. Query Analysis
2. Query Normalization
3. Semantic Analyzer
4. Query Simplifier
5. Query Restructuring
Query Optimization
The term query optimization does not mean that the execution plan is
always an optimal (best) strategy. The plan is just a reasonably
efficient strategy for executing the query.
Each decomposed SQL query block is translated into an equivalent
extended relational algebra expression and then optimized.
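To see what an optimizer's chosen strategy looks like in practice, the
sketch below asks SQLite for its plan using Python's standard sqlite3
module. SQLite is used here only as a convenient stand-in for a DBMS;
the employee table and the index name idx_employee_dept are
hypothetical. The exact wording of the plan differs between SQLite
versions, so no particular output is assumed.

```python
import sqlite3

# In-memory database with a hypothetical employee table and index.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER
    );
    CREATE INDEX idx_employee_dept ON employee(dept_id);
    """
)

# EXPLAIN QUERY PLAN returns the strategy SQLite chose, without running the query.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM employee WHERE dept_id = 3"
).fetchall()

# The last column of each row is a human-readable description of one plan step.
for row in plan_rows:
    print(row[-1])
```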
Heuristic Rules
[Figure: Query 1 (Bad) compared with Query 2 (Better)]
1. The main heuristic is to apply first the operations that reduce the size
of intermediate results.
2. Perform select operations as early as possible to reduce the number of
tuples and perform project operations as early as possible to reduce
the number of attributes. (This is done by moving select and project
operations as far down the tree as possible.)
3. The select and join operations that are most restrictive should be
executed before other similar operations. (This is done by reordering
the leaf nodes of the tree among themselves and adjusting the rest of
the tree appropriately.)
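The effect of the heuristics above on intermediate result size can be
illustrated with a small Python sketch. The relations employee(emp_id,
name, dept_id) and department(dept_id, dept_name), their contents, and
the helper functions select, join, and project are all hypothetical;
relations are modeled as plain lists of tuples and the join is a simple
nested loop.

```python
# Hypothetical toy relations (assumption): 1000 employees spread over 10 departments.
employees = [(i, f"emp{i}", i % 10) for i in range(1000)]
departments = [(d, f"dept{d}") for d in range(10)]


def select(rows, predicate):
    """Selection: keep only the tuples that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def join(left, right, left_col, right_col):
    """Nested-loop equijoin on left[left_col] == right[right_col]."""
    return [l + r for l in left for r in right if l[left_col] == r[right_col]]


def project(rows, columns):
    """Projection: keep only the listed column positions."""
    return [tuple(row[c] for c in columns) for row in rows]


# Plan 1: join first, then select on the department name, then project.
joined = join(employees, departments, 2, 0)
plan1 = project(select(joined, lambda r: r[4] == "dept3"), [1])

# Plan 2: select on the department first, then join, then project.
filtered = select(departments, lambda r: r[1] == "dept3")
joined_small = join(employees, filtered, 2, 0)
plan2 = project(joined_small, [1])

print("plan 1 intermediate tuples:", len(joined))
print("plan 2 intermediate tuples:", len(filtered) + len(joined_small))
print("same answer:", sorted(plan1) == sorted(plan2))
```

Both plans return the same names, but the second builds far smaller
intermediate results because the selection is applied before the join.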