Chapter 13: Query Processing
Query Processing
• What is Query Processing
• Measures of Query Cost
• Selection Operation
• Sorting
• Join Operation
What is Query Processing?
• Query processing: Activities involved in
extracting data from a database.
– Translation of queries in high-level DB languages
into expressions that can be used at physical level
of file system.
– Includes query optimization and query evaluation.
• Three basic steps:
1. Parsing and Translation
2. Optimization
3. Evaluation
Parsing and translation
• Translate the query into its internal form.
– This is then translated into relational algebra.
• Parser checks syntax, verifies relations.
• A relational algebra expression may have many
equivalent expressions
– E.g., $\sigma_{balance<2500}(\Pi_{balance}(account))$ is
equivalent to
$\Pi_{balance}(\sigma_{balance<2500}(account))$
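The equivalence above can be checked on a small instance. The following is a minimal sketch, assuming a hypothetical `account` relation stored as a list of dicts; the helper names `select`, `project`, and `as_set` are illustrative and not part of any DBMS API.

```python
# Hypothetical sample data; the attribute names follow the slides' account relation.
account = [
    {"account_number": "A-101", "branch_name": "Downtown", "balance": 500},
    {"account_number": "A-215", "branch_name": "Mianus", "balance": 700},
    {"account_number": "A-102", "branch_name": "Perryridge", "balance": 2500},
    {"account_number": "A-305", "branch_name": "Round Hill", "balance": 3500},
    {"account_number": "A-222", "branch_name": "Redwood", "balance": 700},
]

def select(predicate, relation):
    """Selection: keep only the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(attrs, relation):
    """Projection with set semantics: keep only attrs and drop duplicates."""
    seen = set()
    result = []
    for t in relation:
        key = tuple(t[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append({a: t[a] for a in attrs})
    return result

def as_set(relation):
    """Order-independent view of a relation, used only for comparison."""
    return {tuple(sorted(t.items())) for t in relation}

def low_balance(t):
    return t["balance"] < 2500

# Selection applied after projection.
plan_a = select(low_balance, project(["balance"], account))
# Projection applied after selection.
plan_b = project(["balance"], select(low_balance, account))

print(as_set(plan_a) == as_set(plan_b))
```

Both orders produce the same set of tuples here because the selection predicate only uses the attribute that the projection keeps.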
Parsing and translation (cont.)
• Each relational algebra operation can be
evaluated using one of several different
algorithms
• Correspondingly, a relational-algebra
expression can be evaluated in many ways.
• Evaluation plan: an annotated expression
specifying a detailed evaluation strategy.
– e.g., can use an index on balance to find
accounts with balance < 2500,
– or can perform complete relation scan and
discard accounts with balance ≥ 2500
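The two strategies above can be sketched side by side. This is an illustration under stated assumptions: the `account` data is hypothetical, and a sorted list searched with `bisect` stands in for a real ordered index such as a B+-tree. The "examined" counts are a rough proxy for work done, not a real I/O cost.

```python
import bisect

# Hypothetical (account_number, balance) tuples; not data from the slides.
account = [
    ("A-101", 500),
    ("A-215", 700),
    ("A-102", 2500),
    ("A-305", 3500),
    ("A-222", 700),
]

def scan_plan(relation, limit):
    """Complete relation scan: look at every tuple, discard balance >= limit."""
    examined = 0
    result = []
    for row in relation:
        examined += 1
        if row[1] < limit:
            result.append(row)
    return result, examined

def build_balance_index(relation):
    """Sorted balances plus tuple positions, standing in for an ordered index."""
    entries = sorted((balance, pos) for pos, (_, balance) in enumerate(relation))
    keys = [balance for balance, _ in entries]
    positions = [pos for _, pos in entries]
    return keys, positions

def index_plan(relation, index, limit):
    """Index plan: binary-search the cut-off, then fetch only matching tuples."""
    keys, positions = index
    cut = bisect.bisect_left(keys, limit)  # first entry with balance >= limit
    result = [relation[pos] for pos in positions[:cut]]
    return result, cut

index = build_balance_index(account)
scan_rows, scan_examined = scan_plan(account, 2500)
index_rows, index_examined = index_plan(account, index, 2500)

print(sorted(scan_rows) == sorted(index_rows), scan_examined, index_examined)
```

Both plans return the same accounts; they differ only in how many tuples they touch, which is the kind of difference an evaluation plan makes explicit.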
Query Optimization
• Alternative ways of evaluating a given query
– Equivalent expressions
– Different algorithms for each operation
Query Optimization (cont.)
• An evaluation plan defines exactly what algorithm is used for each
operation, and how the execution of the operations is coordinated.
Query Optimization (cont.)
• Amongst all equivalent evaluation plans
choose the one with lowest cost.
– Cost is estimated using statistical information
from the database catalog
• e.g. number of tuples in each relation, size of
tuples, etc.
• How to measure query costs
• How to optimize queries, that is, how to find an
evaluation plan with lowest estimated cost
Query Optimization (cont.)
• Estimation of plan cost based on:
– Statistical information about relations.
Examples:
• number of tuples, number of distinct values for
an attribute
– Statistics estimation for intermediate results
• to compute cost of complex expressions
– Cost formulae for algorithms, computed
using statistics
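As a rough illustration of how catalog statistics feed a cost formula, the sketch below estimates the result size of an equality selection as the number of tuples divided by the number of distinct values of the attribute. This assumes values are uniformly distributed, which is a common simplification and not something the slides state. The class `RelationStats`, its field names, and the numbers are all hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class RelationStats:
    """Hypothetical catalog entry for one relation."""
    num_tuples: int         # number of tuples in the relation
    tuples_per_block: int   # how many tuples fit in one disk block
    distinct_values: dict   # attribute name -> number of distinct values

    @property
    def num_blocks(self):
        """Blocks occupied by the relation, assuming tuples are packed together."""
        return math.ceil(self.num_tuples / self.tuples_per_block)

def estimate_equality_selection(stats, attribute):
    """Estimated tuples matching attribute = constant, assuming uniform values."""
    return stats.num_tuples / stats.distinct_values[attribute]

# Made-up statistics for illustration only.
stats = RelationStats(
    num_tuples=10000,
    tuples_per_block=20,
    distinct_values={"branch_name": 50},
)

scan_cost_blocks = stats.num_blocks
matching_tuples = estimate_equality_selection(stats, "branch_name")
result_blocks = math.ceil(matching_tuples / stats.tuples_per_block)

print(scan_cost_blocks, matching_tuples, result_blocks)
```

The estimated size of this intermediate result can then be fed into the cost formula of the next operation in the expression.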
Query Optimization (cont.)
• Cost difference between evaluation
plans for a query can be enormous
– E.g. seconds vs. days in some cases
• Steps in cost-based query optimization
– Generate logically equivalent expressions
using equivalence rules
– Annotate resultant expressions to get
alternative query plans
– Choose the cheapest plan based on
estimated cost
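The three steps can be sketched as a toy enumeration. Everything here is an assumption made for illustration: the expression labels, the algorithm names, and the cost numbers in `HYPOTHETICAL_COST` are invented, and a real optimizer would derive costs from catalog statistics rather than a lookup table.

```python
from itertools import product

# Step 1: logically equivalent expressions (labels only, for illustration).
equivalent_expressions = ["select-then-project", "project-then-select"]

# Step 2: candidate algorithms used to annotate the selection operation.
selection_algorithms = ["linear scan", "index lookup"]

# Invented cost estimates. A pair that is absent is treated as not applicable;
# this toy assumes an index on the base relation cannot be used once the
# relation has been projected into an intermediate result.
HYPOTHETICAL_COST = {
    ("select-then-project", "linear scan"): 500,
    ("select-then-project", "index lookup"): 40,
    ("project-then-select", "linear scan"): 650,
}

# Annotated expressions are the alternative query plans.
plans = [
    plan
    for plan in product(equivalent_expressions, selection_algorithms)
    if plan in HYPOTHETICAL_COST
]

# Step 3: choose the plan with the lowest estimated cost.
best_plan = min(plans, key=lambda plan: HYPOTHETICAL_COST[plan])

print(best_plan, HYPOTHETICAL_COST[best_plan])
```

Exhaustive enumeration like this only works for tiny plan spaces; it is shown here to make the three steps concrete, not as a practical search strategy.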
Evaluation
• The query-execution engine takes a query-
evaluation plan, executes that plan, and
returns the answers to the query.
• The parsed execution plan of a previously
executed SQL statement is stored in the
shared pool (a portion of memory, or
buffer).
– If a new SQL statement (query) is exactly the same
string as one already in the shared pool, there is no
need to call the optimizer and recompute the
execution plan for that statement.
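The reuse rule above can be sketched as a cache keyed on the exact SQL text. This is a minimal illustration, not how any particular DBMS implements its shared pool; `PlanCache` and `toy_optimizer` are hypothetical names, and the "plan" is just a string.

```python
class PlanCache:
    """Caches execution plans keyed by the exact SQL string."""

    def __init__(self, optimizer):
        self._optimizer = optimizer
        self._plans = {}
        self.optimizer_calls = 0

    def get_plan(self, sql):
        # Only an exact string match counts as a hit.
        if sql not in self._plans:
            self.optimizer_calls += 1
            self._plans[sql] = self._optimizer(sql)
        return self._plans[sql]

def toy_optimizer(sql):
    """Stand-in for parsing and optimization; returns a placeholder plan."""
    return f"PLAN[{sql}]"

cache = PlanCache(toy_optimizer)
first = cache.get_plan("SELECT * FROM account WHERE balance < 2500")
second = cache.get_plan("SELECT * FROM account WHERE balance < 2500")
# Same query, different letter case: a different string, so the cache misses.
third = cache.get_plan("select * from account where balance < 2500")

print(cache.optimizer_calls)
```

Because the key is the raw string, a query that differs only in letter case or whitespace is treated as new in this sketch and triggers another optimizer call.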
Transformation of Relational
Expressions
• Two relational algebra expressions are said
to be equivalent if the two expressions
generate the same set of tuples on every
legal database instance
– Note: order of tuples is irrelevant
• An equivalence rule says that expressions
of two forms are equivalent
– Can replace expression of first form by second,
or vice versa
Equivalence Rules
1. Conjunctive selection operations can
be deconstructed into a sequence of
individual selections.
$\sigma_{\theta_1 \wedge \theta_2}(E) = \sigma_{\theta_1}(\sigma_{\theta_2}(E))$
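Rule 1 can be checked on a small instance. The sketch below assumes a hypothetical relation `E` stored as a Python set of tuples and two example predicates, `theta1` and `theta2`, chosen only for illustration.

```python
def select(predicate, relation):
    """Selection over a relation represented as a set of tuples."""
    return {t for t in relation if predicate(t)}

# Hypothetical (account_number, branch_name, balance) tuples.
E = {
    ("A-101", "Downtown", 500),
    ("A-215", "Mianus", 700),
    ("A-102", "Downtown", 2500),
    ("A-305", "Round Hill", 3500),
}

def theta1(t):
    return t[2] < 2500          # balance < 2500

def theta2(t):
    return t[1] == "Downtown"   # branch_name = "Downtown"

# Left-hand side: one selection with the conjunctive predicate.
conjunctive = select(lambda t: theta1(t) and theta2(t), E)
# Right-hand side: a sequence of two individual selections.
cascaded = select(theta1, select(theta2, E))

print(conjunctive == cascaded)
```

Checking one instance does not prove the rule, which must hold on every legal database instance; it only illustrates what the two forms compute.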