
QUERY Processing and Relational Algebra

Query processing involves three steps: 1) generating a query tree from the SQL query, 2) generating multiple query execution plans, and 3) generating query plan code for evaluation. The query optimizer selects the most efficient plan by estimating the cost of executing each plan based on factors like disk accesses and CPU time. Relational algebra and calculus form the basis for relational query languages like SQL and allow for optimization of queries through alternative representations and execution strategies.

Uploaded by

EDWIN RIOBA
Copyright
© All Rights Reserved

QUERY PROCESSING AND QUERY OPTIMIZATION

What is Query Processing?

• It is a three-step process that transforms a high-level query (SQL) into an equivalent and more efficient lower-level query (in relational algebra).
Query

• A query is the statement written by the user in a high-level language, using SQL.
Parser & Translator

[Diagram: Query → Parser & Translator]

• Parser: checks the syntax and verifies the relations that the query references.
• Translator: translates the query into an equivalent relational algebra expression.

Example:
SQL> select name from customer;

RA: Πname(customer)
Relational Algebra

[Diagram: Query → Parser & Translator → Relational Algebra]

• It is the query converted from SQL into algebraic form by the translator.

• Example:
SQL> SELECT ENAME FROM EMP, ASG
     WHERE EMP.ENO = ASG.ENO AND
     DUR > 37;

RA: 1) ΠENAME(σDUR>37 ∧ EMP.ENO=ASG.ENO(EMP × ASG))
    2) ΠENAME(EMP ⋈ENO (σDUR>37(ASG)))
Optimizer

[Diagram: Query → Parser & Translator → Relational Algebra → Optimizer]

• It selects the equivalent expression that has the lowest estimated cost.

Example:
1) ΠENAME(σDUR>37 ∧ EMP.ENO=ASG.ENO(EMP × ASG))
2) ΠENAME(EMP ⋈ENO (σDUR>37(ASG)))

The optimizer will select expression 2, as it avoids the expensive and large intermediate Cartesian product and is therefore typically better.
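
As a rough illustration of why the second expression is usually cheaper, the sketch below evaluates both expressions over tiny in-memory tables and counts the rows each one materializes before projecting. The sample rows and the function names `plan_cartesian` and `plan_join` are made up for this example; a real optimizer works from estimates and does not run both plans.

```python
# Toy instances; the rows are invented for illustration only.
EMP = [(1, "Ann"), (2, "Bob"), (3, "Cy")]      # (ENO, ENAME)
ASG = [(1, 40), (2, 12), (3, 38), (3, 50)]     # (ENO, DUR)


def plan_cartesian(emp, asg):
    """Expression 1: build EMP x ASG, then select, then project ENAME."""
    product = [(e, a) for e in emp for a in asg]
    selected = [(e, a) for e, a in product if a[1] > 37 and e[0] == a[0]]
    return {e[1] for e, _ in selected}, len(product)


def plan_join(emp, asg):
    """Expression 2: select on ASG first, then join on ENO, then project ENAME."""
    long_asg = [a for a in asg if a[1] > 37]
    joined = [(e, a) for e in emp for a in long_asg if e[0] == a[0]]
    return {e[1] for e, _ in joined}, len(joined)


names1, size1 = plan_cartesian(EMP, ASG)   # size1: rows in the Cartesian product
names2, size2 = plan_join(EMP, ASG)        # size2: rows in the join result
print(names1 == names2, size1, size2)
```

Both functions return the same set of names, but the first one materializes every EMP/ASG pair while the second keeps only the matching pairs. On realistic table sizes that gap grows with the product of the two cardinalities.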
Comparison of two relational queries

Query tree for ΠENAME(σDUR>37 ∧ EMP.ENO=ASG.ENO(EMP × ASG)), root first:
ΠENAME
  σDUR>37 ∧ EMP.ENO=ASG.ENO
    EMP × ASG

Query tree for ΠENAME(EMP ⋈ENO (σDUR>37(ASG))), root first:
ΠENAME
  ⋈ENO
    EMP
    σDUR>37(ASG)
Statistical Data

[Diagram: Query → Parser & Translator → Relational Algebra → Optimizer, with Statistical Data]

• A statistical database is a database used for statistical analysis purposes.
• It is an OLAP (Online Analytical Processing) system rather than an OLTP (Online Transaction Processing) system.
Evaluation Plan

[Diagram: Query → Parser & Translator → Relational Algebra → Optimizer → Evaluation Plan, with Statistical Data]

• A relational algebra operation annotated with instructions on how to evaluate it is called an evaluation primitive.
• A sequence of primitive operations that can be used to evaluate a query is a query evaluation plan.
EVALUATION & DATA

[Diagram: Query → Parser & Translator → Relational Algebra → Optimizer → Evaluation Plan → Evaluation, with Statistical Data and Data]

• The evaluation engine takes the evaluation plan and applies it to the data.
• The information on which the query has to be performed is called data.
OUTPUT

[Diagram: Query → Parser & Translator → Relational Algebra → Optimizer → Evaluation Plan → Evaluation → Output, with Statistical Data and Data]

• After the plan is evaluated on the data, the processed information is shown as the output.
Diagram of Query Processing

Query → Parser & Translator → Relational Algebra → Optimizer → Evaluation Plan → Evaluation → Output

Statistical Data feeds the Optimizer; Data feeds the Evaluation step.
Measures of Query Cost

The cost of query evaluation can be measured in terms of different resources, including:
• disk accesses
• CPU time to execute a query
• in a distributed or parallel database system, the cost of communication
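
One hedged way to picture how these resources combine is a weighted sum, sketched below. The `PlanCost` class, the two unit-cost constants, and the plan figures are all hypothetical values chosen for illustration; real systems calibrate such weights and estimate the counts from catalog statistics.

```python
from dataclasses import dataclass


@dataclass
class PlanCost:
    disk_accesses: int    # number of block reads and writes
    cpu_seconds: float    # CPU time to execute the query
    messages: int = 0     # communication; only relevant in distributed/parallel systems


# Hypothetical unit costs, not taken from any real system.
SECONDS_PER_DISK_ACCESS = 0.01
SECONDS_PER_MESSAGE = 0.05


def estimated_seconds(cost: PlanCost) -> float:
    """Combine the three resource measures into one time estimate."""
    return (cost.disk_accesses * SECONDS_PER_DISK_ACCESS
            + cost.cpu_seconds
            + cost.messages * SECONDS_PER_MESSAGE)


# Made-up figures for two candidate plans of the same query.
plans = {
    "cartesian": PlanCost(disk_accesses=1200, cpu_seconds=0.8),
    "join": PlanCost(disk_accesses=150, cpu_seconds=0.2),
}
cheapest = min(plans, key=lambda name: estimated_seconds(plans[name]))
print(cheapest)
```

With these assumed numbers the disk term dominates, which is the usual reason disk accesses are treated as the primary cost measure on a single machine.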
Query Optimization

• It is the process of selecting the most efficient query-evaluation plan from among the many strategies usually possible for processing a given query, especially if the query is complex.
• There are many different ways to get an answer for a given query. The result is the same in all cases.
• A DBMS strives to process the query in the most efficient way (in terms of time) to produce the answer.
• Cost = time needed to get all answers.
Query Optimization (continued)
• Query optimization: A single query can be executed through different algorithms or rewritten in different forms and structures. Hence the question of query optimization arises: which of these forms or pathways is optimal? The query optimizer attempts to determine the most efficient way to execute a given query by considering the possible query plans.
• Importance: The goal of query optimization is to reduce the system resources required to fulfill a query and, ultimately, to provide the user with the correct result set faster.
  ◦ It provides the user with faster results, which makes the application seem faster to the user.
  ◦ It allows the system to service more queries in the same amount of time, because each request takes less time than an unoptimized query.
  ◦ It reduces wear on the hardware (e.g. disk drives) and allows the server to run more efficiently (e.g. lower power consumption, less memory usage).
Steps for Query Optimization

• Query optimization involves three steps, namely:
  ◦ query tree generation
  ◦ plan generation
  ◦ query plan code generation
Relational Algebra
Relational Query Languages
Query languages: Allow manipulation and retrieval of data from a database.
Relational model supports simple, powerful QLs:
◦ Strong formal foundation based on logic.
◦ Allows for much optimization.

Query Languages != programming languages!


◦ QLs not expected to be “Turing complete”.
◦ QLs not intended to be used for complex calculations.
◦ QLs support easy, efficient access to large data sets.
Formal Relational Query Languages
Two mathematical Query Languages form the basis for “real” languages
(e.g. SQL), and for implementation:
◦ Relational Algebra: More operational, very useful for representing execution plans.
◦ Relational Calculus: Lets users describe what they want, rather than how to compute it. (Non-operational, declarative.)

Understanding algebra and calculus is key to understanding SQL and query processing!
Preliminaries
A query is applied to relation instances, and the result of a query is also a
relation instance.
◦ Schemas of input relations for a query are fixed (but query will run regardless of
instance!)
◦ The schema for the result of a given query is also fixed! Determined by definition of
query language constructs.

Positional vs. named-field notation:


◦ Positional notation easier for formal definitions, named-field notation more
readable.
◦ Both used in Relational Algebra and SQL
Example Instances

“Sailors” and “Reserves” relations for our examples. We’ll use positional or named-field notation, and assume that names of fields in query results are ‘inherited’ from names of fields in query input relations.

R1
sid  bid  day
22   101  10/10/96
58   103  11/12/96

S1
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

S2
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0
Relational Algebra

Basic operations:
◦ Selection (σ): Selects a subset of rows from a relation.
◦ Projection (Π): Deletes unwanted columns from a relation.
◦ Cross-product (×): Allows us to combine two relations.
◦ Set-difference (−): Tuples in reln. 1, but not in reln. 2.
◦ Union (∪): Tuples in reln. 1 or in reln. 2.

Additional operations:
◦ Intersection, join, division, renaming: Not essential, but (very!) useful.

Since each operation returns a relation, operations can be composed! (Algebra is “closed”.)
Projection

Deletes attributes that are not in the projection list.
Schema of result contains exactly the fields in the projection list, with the same names that they had in the (only) input relation.
Projection operator has to eliminate duplicates! (Why??)
◦ Note: real systems typically don’t do duplicate elimination unless the user explicitly asks for it. (Why not?)

Πsname,rating(S2)
sname   rating
yuppy   9
lubber  8
guppy   5
rusty   10

Πage(S2)
age
35.0
55.5
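
The duplicate-elimination point can be seen in a small sketch over the S2 instance. The helper names `project` and `project_bag` are illustrative only: the first follows the set semantics of relational algebra, the second mimics what a system does when it skips duplicate elimination.

```python
# The S2 instance from the example slides, as (sid, sname, rating, age) tuples.
FIELDS = ("sid", "sname", "rating", "age")
S2 = [
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
]


def project(rows, fields, wanted):
    """Set-based projection: the result is a set, so duplicate tuples collapse."""
    idx = [fields.index(f) for f in wanted]
    return {tuple(row[i] for i in idx) for row in rows}


def project_bag(rows, fields, wanted):
    """Bag-based projection: one output row per input row, duplicates kept."""
    idx = [fields.index(f) for f in wanted]
    return [tuple(row[i] for i in idx) for row in rows]


ages = project(S2, FIELDS, ["age"])
ages_bag = project_bag(S2, FIELDS, ["age"])
print(sorted(ages), len(ages_bag))
```

Three sailors share the age 35.0, so the set version keeps two tuples while the bag version keeps four. Removing duplicates needs extra work such as sorting or hashing, which is one reason real systems skip it unless asked.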
Selection

Selects rows that satisfy the selection condition.
No duplicates in result! (Why?)
Schema of result identical to schema of (only) input relation.
Result relation can be the input for another relational algebra operation! (Operator composition.)

σrating>8(S2)
sid  sname  rating  age
28   yuppy  9       35.0
58   rusty  10      35.0

Πsname,rating(σrating>8(S2))
sname  rating
yuppy  9
rusty  10
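
Operator composition can be sketched in a few lines, here using positional notation (fields addressed by index) on the S2 instance. The function names `select` and `project` are assumptions for this example, not part of any library.

```python
# The S2 instance as a set of (sid, sname, rating, age) tuples.
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}


def select(rows, predicate):
    """Selection: keep the rows that satisfy the predicate; the schema is unchanged."""
    return {row for row in rows if predicate(row)}


def project(rows, positions):
    """Projection by field position; returning a set removes duplicates."""
    return {tuple(row[i] for i in positions) for row in rows}


# Selection with rating > 8; rating is the field at position 2.
high_rated = select(S2, lambda row: row[2] > 8)

# Composition: the result of the selection is the input of the projection.
names_and_ratings = project(high_rated, (1, 2))
print(sorted(names_and_ratings))
```

Because `select` returns the same kind of object it receives, its result can be passed straight into `project`, which is what the algebra being closed means in practice.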
Union, Intersection, Set-Difference

All of these operations take two input relations, which must be union-compatible:
◦ Same number of fields.
◦ ‘Corresponding’ fields have the same type.
What is the schema of result?

S1 ∪ S2
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
44   guppy   5       35.0
28   yuppy   9       35.0

S1 − S2
sid  sname   rating  age
22   dustin  7       45.0

S1 ∩ S2
sid  sname   rating  age
31   lubber  8       55.5
58   rusty   10      35.0
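
Since relations are sets of tuples, the three operations map directly onto Python's built-in set operators, as the sketch below shows for S1 and S2. The `union_compatible` helper is a simplified, hypothetical check that inspects the tuples themselves; a real system compares the declared schemas instead.

```python
# The S1 and S2 instances as sets of (sid, sname, rating, age) tuples.
S1 = {
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
}
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}


def union_compatible(r, s):
    """Instance-level check: same number of fields and matching field types."""
    def schema(rel):
        return {tuple(type(value) for value in row) for row in rel}
    return len(schema(r)) == 1 and schema(r) == schema(s)


assert union_compatible(S1, S2)

union = S1 | S2          # tuples in S1 or in S2
intersection = S1 & S2   # tuples in both S1 and S2
difference = S1 - S2     # tuples in S1 but not in S2
print(len(union), sorted(intersection), sorted(difference))
```

The result of each operation has the same four fields as the inputs, which answers the schema question for union-compatible relations.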
Figure 14-10 Select operation

• Write the relational algebra expression to achieve the following:

Figure 14-11 Project operation

• Write the relational algebra expression to achieve the following:
