What Is Query Processing?

Uploaded by

getachew worku
1/8/2021

Chapter One: Query Processing and Optimization


ADBMS 2021, Friday, January 8, 2021

• Outline
  • Translating SQL Queries into Relational Algebra
  • Basic Algorithms for Executing Query Operations
  • Using Heuristics in Query Optimization
  • Using Selectivity and Cost Estimates in Query Optimization
  • Semantic Query Optimization

What is Query Processing?

• Query processing comprises the steps required to transform a high-level SQL query into a correct and “efficient” strategy for execution and retrieval.
• The aim of query processing is to find information in one or more databases and deliver it to the user quickly and efficiently.
• Query processing can be divided into four main phases:
  • Decomposition
  • Optimization
  • Code generation
  • Execution

Query Processing?

high-level user query → query processor → low-level data manipulation commands

Phases of Query Processing

Query Decomposition

• Query decomposition is the process of transforming a high-level query into a relational algebra query and checking that the query is syntactically and semantically correct.
• Typical stages in query decomposition are:
  • Analysis: detect and reject “incorrect” queries.
  • Normalization: convert the query into a normalized form. The WHERE predicate is converted to conjunctive (∧) or disjunctive (∨) normal form.
  • Lexical and syntactic analysis:
    • check validity (similar to compilers)
    • check for attributes and relations
    • type checking on the qualification
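To make the normalization stage concrete, the two normal forms can be written out as below. The symbols p_ij stand for simple predicates of the WHERE clause; this notation is an assumption made for illustration and does not come from the slides. The last line shows the distribution step that turns a small conjunctive form into a disjunctive one.

```latex
\begin{aligned}
\text{CNF:}\quad & (p_{11} \lor p_{12} \lor \dots \lor p_{1n}) \land \dots \land (p_{m1} \lor p_{m2} \lor \dots \lor p_{mn}) \\
\text{DNF:}\quad & (p_{11} \land p_{12} \land \dots \land p_{1n}) \lor \dots \lor (p_{m1} \land p_{m2} \land \dots \land p_{mn}) \\
\text{Example:}\quad & (p_1 \lor p_2) \land p_3 \;\Leftrightarrow\; (p_1 \land p_3) \lor (p_2 \land p_3)
\end{aligned}
```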


Query processing…

• Semantic analysis: reject normalized queries that are incorrectly formulated or contradictory. A query is incorrect if some of its components do not contribute to generating the result.
• Simplification: detect redundant qualifications.
• Query restructuring: rearrange the nodes so that the most restrictive condition is executed first.

Query processing…

• Query processing: execute transactions on behalf of the query and print the result. Steps in query processing: [figure]

Relational Algebra

• Algebra: a language based on operators and a domain of values
• Domain: set of relations
• Basic operators: select, project, union, set difference, Cartesian product
• Derived operators: set intersection, division, join
• Procedural: a relational expression specifies a query by describing an algorithm (the sequence in which operators are applied) for determining the result of the expression

Cont.…

• Example:
    SELECT Customer name
    FROM Customer, Invoice
    WHERE region = 'Kansas City' AND Amount > 1000
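The operators listed for relational algebra can be sketched in a few lines of Python. This is a minimal illustration, assuming relations are modeled as lists of dictionaries; the attribute names (`cust_id`, `inv_cust`, `name`, `amount`), the sample rows, and the join condition are hypothetical and are not given in the slides.

```python
# Relations are modeled as lists of dicts (one dict per tuple).
# All attribute names and rows below are hypothetical.

def select(relation, predicate):
    """Select: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Project: keep only the named attributes and drop duplicate tuples."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

def product(left, right):
    """Cartesian product: pair every left tuple with every right tuple."""
    return [{**l, **r} for l in left for r in right]

def join(left, right, condition):
    """Join as a derived operator: a product followed by a selection."""
    return select(product(left, right), condition)

customer = [
    {"cust_id": 1, "name": "Abebe", "region": "Kansas City"},
    {"cust_id": 2, "name": "Sara", "region": "Denver"},
]
invoice = [
    {"inv_cust": 1, "amount": 1500},
    {"inv_cust": 1, "amount": 200},
    {"inv_cust": 2, "amount": 3000},
]

# The sequence of operator applications is what makes the algebra procedural.
matched = join(customer, invoice, lambda t: t["cust_id"] == t["inv_cust"])
wanted = select(matched, lambda t: t["region"] == "Kansas City" and t["amount"] > 1000)
names = project(wanted, ["name"])
print(names)
```

Defining `join` through `product` and `select` mirrors the split between basic and derived operators; a real system would use a dedicated join algorithm instead.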


Query processing…

• Query optimization is one of the main means by which modern database systems achieve their performance advantages.
• Given a request for data manipulation or retrieval, an optimizer chooses an optimal plan for evaluating the request from among the many alternative strategies, i.e. there are many ways (access paths) of accessing the desired file/record.
• The optimizer tries to select the most efficient (cheapest) access path for accessing the data.

Cont.…

• Approaches to Query Optimization
• Heuristics approach: uses knowledge of the characteristics of the relational algebra operations and the relationships between the operators to optimize the query.
• Thus the heuristic approach to optimization makes use of:
  • Properties of individual operators
  • Association between operators
• Query tree: a graphical representation of the operators, relations, attributes and predicates, and of the processing sequence during query processing.

Cont.

Query tree is composed of three main parts:
• The leaves: the base relations used for processing the query/extracting the required information
• The root: the final result/relation produced as output by the operations on the relations used for query processing
• Nodes: intermediate results or relations before reaching the final result
• The sequence of execution of operations in a query tree starts from the leaves, continues through the intermediate nodes, and ends at the root.

Cont.

• The properties of each operation and the association between operators are analyzed using a set of rules called TRANSFORMATION RULES.
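The leaf-to-root execution order of a query tree can be sketched as a post-order traversal. The `Node` class and `evaluate` function below are hypothetical names introduced for illustration, and the sample rows are invented; only the relation and attribute names EMP, ASG, ENO, ENAME and DUR are borrowed from the slides' example queries.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Node:
    """One node of a query tree; leaves carry a base relation."""
    label: str
    operation: Optional[Callable[..., list]] = None   # None for a leaf
    children: List["Node"] = field(default_factory=list)
    relation: Optional[list] = None                   # set only on leaves

def evaluate(node, trace):
    """Evaluate children first, then the node itself (leaves -> root)."""
    if not node.children:
        trace.append(node.label)
        return node.relation
    inputs = [evaluate(child, trace) for child in node.children]
    trace.append(node.label)
    return node.operation(*inputs)

# Hypothetical sample rows.
emp = [{"ENO": 1, "ENAME": "A. Kebede"}, {"ENO": 2, "ENAME": "S. Tadesse"}]
asg = [{"ENO": 1, "DUR": 40}, {"ENO": 2, "DUR": 12}]

join = Node(
    "join",
    lambda l, r: [{**a, **b} for a in l for b in r if a["ENO"] == b["ENO"]],
    [Node("EMP", relation=emp), Node("ASG", relation=asg)],
)
select = Node("select", lambda rel: [t for t in rel if t["DUR"] > 37], [join])
root = Node("project", lambda rel: [{"ENAME": t["ENAME"]} for t in rel], [select])

trace = []
result = evaluate(root, trace)
print(trace)   # order in which nodes were processed
print(result)  # relation produced at the root
```

The `trace` list records the processing sequence, so it shows the leaves being visited before the intermediate nodes and the root last.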


Cont.

• Process for heuristic optimization
  1. The parser of a high-level query generates an initial internal representation.
  2. Heuristic rules are applied to optimize the internal representation.
  3. A query execution plan is generated to execute groups of operations based on the access paths available on the files involved in the query.
• The main heuristic is to apply first the operations that reduce the size of intermediate results.
  • E.g. apply SELECT and PROJECT operations before applying the JOIN or other binary operations.

Cont.

• Query tree:
  – A tree data structure that corresponds to a relational algebra expression.
  – It represents the input relations of the query as leaf nodes of the tree, and the relational algebra operations as internal nodes.
• An execution of the query tree consists of executing an internal node operation whenever its operands are available and then replacing that internal node by the relation that results from executing the operation.
• Query graph:
  – A graph data structure that corresponds to a relational calculus expression.
  – It does not indicate an order in which to perform the operations.
  – There is only a single graph corresponding to each query.
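The effect of the main heuristic (apply SELECT before JOIN) can be seen by comparing two plans on the same data. This is a rough sketch with invented data: 100 hypothetical employees, one assignment each, and a nested-loop join written only for illustration. Counting the tuple pairs the join has to examine stands in for the size of the intermediate work.

```python
def join_on(left, right, key):
    """Nested-loop equijoin on a shared attribute name."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

# Hypothetical data: every tenth assignment has DUR = 48, the rest DUR = 12.
emp = [{"ENO": i, "ENAME": f"emp{i}"} for i in range(100)]
asg = [{"ENO": i, "DUR": 48 if i % 10 == 0 else 12} for i in range(100)]

# Plan A: join first, then select.
joined = join_on(emp, asg, "ENO")
plan_a = [t for t in joined if t["DUR"] > 37]
pairs_a = len(emp) * len(asg)            # pairs compared by the join

# Plan B: select first, then join (selection pushed down the tree).
filtered_asg = [t for t in asg if t["DUR"] > 37]
plan_b = join_on(emp, filtered_asg, "ENO")
pairs_b = len(emp) * len(filtered_asg)   # pairs compared by the join

print(len(joined), len(filtered_asg))    # intermediate result sizes
print(pairs_a, pairs_b)
```

Both plans return the same tuples; only the amount of intermediate work differs, which is the point of moving selections down the tree.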


Cont.

• Example:
    SELECT ENAME
    FROM EMP, ASG
    WHERE EMP.ENO = ASG.ENO
    AND DUR > 37

Cont.

    SELECT ENAME, RESP
    FROM EMP, ASG, PROJ
    WHERE EMP.ENO = ASG.ENO
    AND ASG.PNO = PROJ.PNO
    AND PNAME = "CAD/CAM"
    AND DUR ≥ 36
    AND TITLE = "Programmer"
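Semantic analysis rejects queries whose components do not contribute to the result. One common way to test this, assumed here rather than stated in the slides, is to treat the relations as nodes and the join predicates as edges of a query graph and to check that the graph is connected. The function name `is_connected` is hypothetical; the relation names come from the three-relation example query.

```python
def is_connected(relations, join_edges):
    """Return True if every relation is reachable through join predicates."""
    adjacency = {r: set() for r in relations}
    for a, b in join_edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    start = next(iter(relations))
    seen, stack = {start}, [start]
    while stack:                          # depth-first traversal
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == set(relations)

relations = ["EMP", "ASG", "PROJ"]
# Edges from EMP.ENO = ASG.ENO and ASG.PNO = PROJ.PNO.
joins = [("EMP", "ASG"), ("ASG", "PROJ")]

connected = is_connected(relations, joins)
# Same query with the second join predicate left out: PROJ is isolated.
missing_join = is_connected(relations, joins[:1])
print(connected, missing_join)
```

With both join predicates present the graph is connected; dropping `ASG.PNO = PROJ.PNO` leaves PROJ unrelated to the rest of the query, which is the kind of query this check would flag.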

Restructuring

• Convert relational calculus to relational algebra
• Make use of query trees

    SELECT ENAME
    FROM EMP, ASG, PROJ
    WHERE EMP.ENO = ASG.ENO
    AND ASG.PNO = PROJ.PNO
    AND ENAME ≠ "J. Doe"
    AND PNAME = "CAD/CAM"
    AND (DUR = 12 OR DUR = 24)

Equivalent query [figure]
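As a worked illustration of restructuring, the query on EMP, ASG and PROJ can be written as a relational algebra expression and then rewritten into an equivalent one with the selections pushed toward the leaves. The sketch assumes that ENAME belongs to EMP, DUR to ASG and PNAME to PROJ; the slides do not list the schemas, so this attribute placement is an assumption.

```latex
\begin{aligned}
\text{Direct translation:}\quad
& \pi_{\text{ENAME}}\Big(
  \sigma_{\text{ENAME} \neq \text{``J. Doe''} \,\land\, \text{PNAME} = \text{``CAD/CAM''} \,\land\, (\text{DUR} = 12 \,\lor\, \text{DUR} = 24)}
  \big(\text{EMP} \bowtie_{\text{ENO}} \text{ASG} \bowtie_{\text{PNO}} \text{PROJ}\big)\Big) \\[4pt]
\text{Equivalent query:}\quad
& \pi_{\text{ENAME}}\Big(
  \sigma_{\text{ENAME} \neq \text{``J. Doe''}}(\text{EMP})
  \bowtie_{\text{ENO}}
  \big(\sigma_{\text{DUR} = 12 \,\lor\, \text{DUR} = 24}(\text{ASG})
  \bowtie_{\text{PNO}}
  \sigma_{\text{PNAME} = \text{``CAD/CAM''}}(\text{PROJ})\big)\Big)
\end{aligned}
```

Both expressions return the same names; the second applies each selection to a single base relation before any join, so the joins operate on smaller inputs.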


Cont.

Example 2:
    SELECT LNAME
    FROM EMPLOYEE, WORKS_ON, PROJECT
    WHERE PNAME = 'AQUARIUS' AND PNUMBER = PNO AND ESSN = SSN
    AND BDATE > '1957-12-31';

Construct the query tree.
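One possible starting point for Example 2 is sketched below as algebra expressions, each of which corresponds to a query tree. It assumes that PNAME and PNUMBER belong to PROJECT, PNO and ESSN to WORKS_ON, and SSN, BDATE and LNAME to EMPLOYEE; the slides do not state the schema, so this placement is an assumption. The first line is the initial (canonical) form, the second applies the heuristics of moving selections down and replacing products with joins.

```latex
\begin{aligned}
\text{Initial:}\quad
& \pi_{\text{LNAME}}\Big(
  \sigma_{\text{PNAME} = \text{'AQUARIUS'} \,\land\, \text{PNUMBER} = \text{PNO} \,\land\, \text{ESSN} = \text{SSN} \,\land\, \text{BDATE} > \text{'1957-12-31'}}
  \big((\text{EMPLOYEE} \times \text{WORKS\_ON}) \times \text{PROJECT}\big)\Big) \\[4pt]
\text{After heuristics:}\quad
& \pi_{\text{LNAME}}\Big(
  \big(\sigma_{\text{PNAME} = \text{'AQUARIUS'}}(\text{PROJECT})
  \bowtie_{\text{PNUMBER} = \text{PNO}} \text{WORKS\_ON}\big)
  \bowtie_{\text{ESSN} = \text{SSN}}
  \sigma_{\text{BDATE} > \text{'1957-12-31'}}(\text{EMPLOYEE})\Big)
\end{aligned}
```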

Summary of Heuristics for Algebraic Optimization:

1. The main heuristic is to apply first the operations that reduce the size of intermediate results.
2. Perform select operations as early as possible to reduce the number of tuples, and perform project operations as early as possible to reduce the number of attributes. (This is done by moving select and project operations as far down the tree as possible.)
3. The select and join operations that are most restrictive should be executed before other similar operations. (This is done by reordering the leaf nodes of the tree among themselves and adjusting the rest of the tree appropriately.)


Using Selectivity and Cost Estimates in Query Optimization

• Cost-based query optimization:
  • Estimate and compare the costs of executing a query using different execution strategies and choose the strategy with the lowest cost estimate. (Compare to heuristic query optimization.)
• Issues
  • Cost function
  • Number of execution strategies to be considered
• Cost Components for Query Execution
  1. Access cost to secondary storage
  2. Storage cost
  3. Computation cost
  4. Memory usage cost
  5. Communication cost

Cost Estimation Approach to Query Optimization

• The main idea is to minimize the cost of processing a query. The cost function is comprised of:
  • I/O cost + CPU processing cost + communication cost + storage cost
• These components might have different weights in different processing environments.
• The DBMS will use information stored in the system catalogue for the purpose of estimating cost.
• The main target of query optimization is to minimize the size of the intermediate relation. The size affects the cost of:
  • Disk access
  • Data transportation
  • Storage space in primary memory
  • Writing to disk
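The weighted cost function described above can be sketched as follows. Everything concrete in this example is hypothetical: the `PlanEstimate` class, the two plan names, the per-component estimates, and the weight values for a fast and a slow network were invented to show how the cheapest strategy can change when the component weights change.

```python
from dataclasses import dataclass

@dataclass
class PlanEstimate:
    """Estimated cost components of one execution strategy (made-up units)."""
    name: str
    io: float
    cpu: float
    communication: float
    storage: float

def total_cost(plan, weights):
    """Weighted sum: I/O + CPU + communication + storage cost."""
    return (weights["io"] * plan.io
            + weights["cpu"] * plan.cpu
            + weights["communication"] * plan.communication
            + weights["storage"] * plan.storage)

def cheapest(plans, weights):
    """Pick the strategy with the lowest estimated total cost."""
    return min(plans, key=lambda p: total_cost(p, weights))

plans = [
    PlanEstimate("scan less, ship more", io=500, cpu=50, communication=400, storage=10),
    PlanEstimate("scan more, ship less", io=800, cpu=50, communication=20, storage=10),
]

# Hypothetical weights for two processing environments.
fast_network = {"io": 1.0, "cpu": 0.1, "communication": 0.1, "storage": 0.05}
slow_network = {"io": 1.0, "cpu": 0.1, "communication": 5.0, "storage": 0.05}

best_fast = cheapest(plans, fast_network)
best_slow = cheapest(plans, slow_network)
print(best_fast.name)
print(best_slow.name)
```

In a real DBMS the per-plan estimates would be derived from statistics in the system catalogue rather than typed in by hand.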

Semantic Query Optimization

• Semantic Query Optimization:
  • Uses constraints specified on the database schema in order to modify one query into another query that is more efficient to execute.
• Consider the following SQL query:
    SELECT E.LNAME, M.LNAME
    FROM EMPLOYEE E, EMPLOYEE M
    WHERE E.SUPERSSN = M.SSN AND E.SALARY > M.SALARY
• Explanation:
  • Suppose that we had a constraint on the database schema that stated that no employee can earn more than his or her direct supervisor.
  • If the semantic query optimizer checks for the existence of this constraint, it need not execute the query at all because it knows that the result of the query will be empty. Techniques known as theorem proving can be used for this purpose.

Basic Algorithms for Executing Query Operations

• ALGORITHMS FOR SELECT AND JOIN OPERATIONS
• Examples:


Cont.

• Search Methods for Simple Selection:
  – S1 Linear search (brute force):
    • Retrieve every record in the file, and test whether its attribute values satisfy the selection condition.
  – S2 Binary search:
    • If the selection condition involves an equality comparison on a key attribute on which the file is ordered, binary search (which is more efficient than linear search) can be used.
  – S3 Using a primary index to retrieve a single record:
    • If the selection condition involves an equality comparison on a key attribute with a primary index, use the primary index to retrieve the record.
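The three search methods S1 to S3 can be compared on a small in-memory "file". This is a simplified sketch: the records and function names are hypothetical, the file is a Python list ordered on SSN, and a plain dictionary stands in for the primary index (a real primary index is an ordered structure over disk blocks, which this does not model).

```python
from bisect import bisect_left

def linear_search(records, attr, value):
    """S1: test every record against the selection condition."""
    return [r for r in records if r[attr] == value]

def binary_search(keys, records, value):
    """S2: equality on the ordering key; keys[i] is the key of records[i]."""
    i = bisect_left(keys, value)
    if i < len(keys) and keys[i] == value:
        return records[i]
    return None

def build_index(records, key_attr):
    """S3 stand-in: map each key value to the record's position in the file."""
    return {r[key_attr]: pos for pos, r in enumerate(records)}

# Hypothetical file, ordered on the key attribute SSN.
employees = [
    {"SSN": 101, "LNAME": "Abebe"},
    {"SSN": 205, "LNAME": "Kebede"},
    {"SSN": 333, "LNAME": "Sara"},
    {"SSN": 410, "LNAME": "Tesfaye"},
]
ssn_keys = [r["SSN"] for r in employees]   # built once, reused by S2
ssn_index = build_index(employees, "SSN")  # built once, reused by S3

by_name = linear_search(employees, "LNAME", "Sara")   # non-key attribute: S1
by_key = binary_search(ssn_keys, employees, 205)      # ordered key: S2
position = ssn_index.get(410)                         # indexed key: S3
by_index = employees[position] if position is not None else None
print(by_name, by_key, by_index)
```

Linear search works for any condition but touches every record, while S2 and S3 apply only to equality comparisons on the key attribute.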