DBMS Chapter 7
Query Optimization
Query Processing
Query processing refers to the range of activities involved in extracting data from a database.
The basic steps involved in processing a query are:
1. Parsing and translation
2. Optimization
3. Evaluation
[Figure: steps in query processing. Labels: Query, Parser and Translator, Relational Algebra, Optimizer, Statistics (Data Dictionary), Query Execution Plan, Evaluation Engine, Database, Data]
Example:
Suppose we have the relational algebra expression below.
- ∏customer-name(σbranch-city = ‘ktm’ ∧ balance > 1000 (branch ⋈ account ⋈ depositor))
Using rule no. 4a, we can obtain the equivalent expression below.
- ∏customer-name((σbranch-city = ‘ktm’ ∧ balance > 1000 (branch ⋈ account)) ⋈ depositor)
Using rule no. 4c, we can obtain another equivalent expression, as below.
- ∏customer-name((σbranch-city = ‘ktm’ (branch) ⋈ σbalance > 1000 (account)) ⋈ depositor)
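The equivalence of the three expressions above can be checked on a small data set. The following is a minimal sketch, assuming toy relations stored as lists of dictionaries; the rows, the attribute names other than branch-city, balance and customer-name, and the helper functions are all hypothetical and chosen only for illustration.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts (nested loop)."""
    out = []
    for a in r:
        for b in s:
            common = set(a) & set(b)
            if all(a[k] == b[k] for k in common):
                out.append({**a, **b})
    return out

def select(rel, pred):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]

def project(rel, attrs):
    """Projection with set semantics (duplicates removed)."""
    return {tuple(t[a] for a in attrs) for t in rel}

# Hypothetical sample data (not from the text)
branch = [{"branch-name": "B1", "branch-city": "ktm"},
          {"branch-name": "B2", "branch-city": "pkr"}]
account = [{"account-number": "A1", "branch-name": "B1", "balance": 1500},
           {"account-number": "A2", "branch-name": "B1", "balance": 500},
           {"account-number": "A3", "branch-name": "B2", "balance": 2000}]
depositor = [{"customer-name": "Sita", "account-number": "A1"},
             {"customer-name": "Hari", "account-number": "A2"},
             {"customer-name": "Gita", "account-number": "A3"}]

def in_ktm(t):
    return t["branch-city"] == "ktm"

def rich(t):
    return t["balance"] > 1000

def both(t):
    return in_ktm(t) and rich(t)

# Original expression: select after joining all three relations
e1 = project(select(natural_join(natural_join(branch, account), depositor), both),
             ["customer-name"])
# Selection pushed below the join with depositor
e2 = project(natural_join(select(natural_join(branch, account), both), depositor),
             ["customer-name"])
# Selection split and pushed down to the individual relations
e3 = project(natural_join(natural_join(select(branch, in_ktm), select(account, rich)),
                          depositor),
             ["customer-name"])

print(e1 == e2 == e3)
```

Agreement on one data set does not prove the equivalence rules; it only illustrates that the rewritten expressions return the same answer while building smaller intermediate results.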
Operator Tree
A relational algebra query can be represented graphically, for simplicity, by an operator tree. An operator tree
is a tree in which a leaf node is a relation stored in the database and a non-leaf node is an intermediate relation
produced by a relational algebra operator. The sequence of operations is directed from the leaves to the root, which
represents the answer to the query.
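[Figure: operator trees for the equivalence σc(E1 ⋈ E2) ≡ σc(E1) ⋈ E2]
[Figure: operator tree with σcourse-name=‘DBMS’ applied over the relations Student, Registration and Course]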
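An operator tree can be sketched as a small recursive data structure. The example below is only an illustration: the `Node` class and `evaluation_order` function are hypothetical names, and the shape of the tree (which relations are joined first) is an assumption, since only the operator and the relation names are given here. A post-order walk visits the leaves before the root, which matches the direction of evaluation described above.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                     # relation name or operator
    children: list = field(default_factory=list)   # empty for leaf relations

def evaluation_order(node):
    """Post-order walk: children first, so leaves come before the root."""
    order = []
    for child in node.children:
        order.extend(evaluation_order(child))
    order.append(node.label)
    return order

# Assumed tree shape: select over (Student join Registration) join Course
tree = Node("σ course-name='DBMS'", [
    Node("⋈", [
        Node("⋈", [Node("Student"), Node("Registration")]),
        Node("Course"),
    ]),
])

order = evaluation_order(tree)
print(order)
```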
Query Optimization
Query optimization is the process of selecting the most efficient query execution plan from among the many
strategies possible for processing a query. The query optimizer is a very important component of a database
system because the efficiency of the system depends on the performance of the optimizer.
The result of optimization is a query execution plan, which represents an execution strategy for the query.
The selected plan minimizes an objective cost function.
Steps of optimization
1. Create an initial operator (expression) tree.
2. Move select operations down the tree so that they are executed as early as possible.
3. Apply the more restrictive select operations first.
4. Replace Cartesian products by joins.
5. Create new projections wherever needed.
6. Adjust the rest of the tree accordingly.
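Step 2 of the list above (moving select operations down the tree) can be sketched in code. This is a simplified illustration, not a full optimizer: it handles only selections over joins, the `Node`, `schema`, `push_down` and `show` names are hypothetical, and the relation schemas are assumed for the example. A selection is moved below a join whenever all attributes in its condition belong to one side of that join.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                          # "rel", "select" or "join"
    attrs: frozenset = frozenset()   # rel: its schema; select: attributes in the condition
    label: str = ""                  # relation name or condition text
    children: list = field(default_factory=list)

def schema(n):
    """Attributes available in the result of node n."""
    if n.op == "rel":
        return n.attrs
    result = frozenset()
    for c in n.children:
        result |= schema(c)
    return result

def push_down(n):
    """Move each selection below a join when one join input covers its attributes."""
    n.children = [push_down(c) for c in n.children]
    if n.op == "select" and n.children[0].op == "join":
        join = n.children[0]
        for i, child in enumerate(join.children):
            if n.attrs <= schema(child):
                join.children[i] = push_down(Node("select", n.attrs, n.label, [child]))
                return join
    return n

def show(n):
    if n.op == "rel":
        return n.label
    if n.op == "select":
        return f"σ[{n.label}]({show(n.children[0])})"
    return "(" + " ⋈ ".join(show(c) for c in n.children) + ")"

# Assumed schemas for illustration
branch = Node("rel", frozenset({"branch-name", "city"}), "Branch")
account = Node("rel", frozenset({"account-number", "branch-name", "balance"}), "Account")
depositor = Node("rel", frozenset({"customer-name", "account-number"}), "Depositor")

joins = Node("join", children=[Node("join", children=[branch, account]), depositor])
initial = Node("select", frozenset({"city"}), "city='Pkr'",
               [Node("select", frozenset({"balance"}), "balance>1000", [joins])])

optimized = show(push_down(initial))
print(optimized)
```

The sketch rewrites the tree in place for brevity; a real optimizer would also weigh the remaining steps (ordering selections by restrictiveness, introducing projections) using cost estimates.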
Example 1:
The following query retrieves the names of customers of branches in the city of Pokhara whose balance is
greater than 1000.
Select customer-name from Branch, Account, Depositor where city = 'Pkr' and balance > 1000
There are a number of evaluation plans by which the above query can be processed:
1. Join the relations Branch and Account, join the result with Depositor, and then do the restriction.
2. Join the relations Branch and Account, do the restriction, and then join the result with Depositor.
3. Do the restriction, join the relations Branch and Account, and join the result with Depositor.
The query optimizer estimates the cost of each plan and chooses the best way to process the query.
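The difference between the three plans can be made concrete by counting the tuples each one materializes as intermediate results. The sketch below is a rough illustration under stated assumptions: the data is synthetic, the attribute names (`branch`, `acc`, `customer`) are hypothetical, and "cost" is simply the total number of intermediate tuples. A real optimizer estimates cost from statistics in the data dictionary instead of running every plan.

```python
def join(r, s, key):
    """Equi-join two lists of dicts on a shared attribute (nested loop, for illustration)."""
    return [{**a, **b} for a in r for b in s if a[key] == b[key]]

# Hypothetical data: 4 branches (2 in 'Pkr'), 40 accounts, one depositor per account
branch = [{"branch": f"B{i}", "city": "Pkr" if i < 2 else "Ktm"} for i in range(4)]
account = [{"acc": i, "branch": f"B{i % 4}", "balance": 500 + 100 * i} for i in range(40)]
depositor = [{"customer": f"C{i}", "acc": i} for i in range(40)]

def cond(t):
    return t["city"] == "Pkr" and t["balance"] > 1000

def plan1():
    # Join everything, then restrict
    ba = join(branch, account, "branch")
    bad = join(ba, depositor, "acc")
    result = [t for t in bad if cond(t)]
    return result, len(ba) + len(bad)

def plan2():
    # Join Branch and Account, restrict, then join with Depositor
    ba = join(branch, account, "branch")
    sel = [t for t in ba if cond(t)]
    result = join(sel, depositor, "acc")
    return result, len(ba) + len(sel)

def plan3():
    # Restrict first, then do both joins
    b = [t for t in branch if t["city"] == "Pkr"]
    a = [t for t in account if t["balance"] > 1000]
    ba = join(b, a, "branch")
    result = join(ba, depositor, "acc")
    return result, len(b) + len(a) + len(ba)

for plan in (plan1, plan2, plan3):
    result, cost = plan()
    print(plan.__name__, "rows:", len(result), "intermediate tuples:", cost)
```

All three plans return the same customers; on this data the plan that joins first and restricts last builds the most intermediate tuples.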
Let us consider the following algebraic expression:
∏customer-name(σcity=‘Pkr’ ∧ balance>1000 (Branch ⋈ Account ⋈ Depositor))
[Figure: operator trees for the expression above, with σcity=‘Pkr’ and σbalance>1000 applied over the relations Branch, Account and Depositor]
Example 2:
Suppose we are given the following table definitions, with certain records in each table.
PROJ(PNO, PNAME, BUDGET)
EMP(ENO, ENAME, TITLE)
ASG(ENO, PNO, DUR)
Write the SQL statement and RA expression for: “Find the names of employees other than Ram Thapa who worked
on the CAD/CAM project for either 1 or 2 years”. Construct the initial operator tree and the final, efficient
operator tree after applying the transformation rules.
SQL:
select ENAME from EMP, ASG, PROJ where EMP.ENO = ASG.ENO and ASG.PNO = PROJ.PNO and
ENAME != 'Ram Thapa' and PNAME = 'CAD/CAM' and (DUR = 1 or DUR = 2)
RA:
∏ENAME(σENAME ≠ ‘Ram Thapa’ ∧ PNAME = ‘CAD/CAM’ ∧ (DUR = 1 ∨ DUR = 2) (PROJ ⋈ (EMP ⋈ ASG)))
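The SQL statement can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch: the column types and every row inserted below are hypothetical sample data chosen so that each condition in the WHERE clause filters out at least one employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PROJ (PNO TEXT, PNAME TEXT, BUDGET INTEGER);
CREATE TABLE EMP  (ENO TEXT, ENAME TEXT, TITLE TEXT);
CREATE TABLE ASG  (ENO TEXT, PNO TEXT, DUR INTEGER);

-- Hypothetical sample rows (not from the text)
INSERT INTO PROJ VALUES ('P1', 'CAD/CAM', 250000), ('P2', 'Maintenance', 100000);
INSERT INTO EMP  VALUES ('E1', 'Ram Thapa', 'Engineer'),
                        ('E2', 'Sita Rai', 'Analyst'),
                        ('E3', 'Hari KC', 'Programmer'),
                        ('E4', 'Gita Lama', 'Engineer');
INSERT INTO ASG  VALUES ('E1', 'P1', 1), ('E2', 'P1', 2), ('E3', 'P1', 3), ('E4', 'P2', 1);
""")

rows = conn.execute("""
    SELECT ENAME
    FROM EMP, ASG, PROJ
    WHERE EMP.ENO = ASG.ENO
      AND ASG.PNO = PROJ.PNO
      AND ENAME != 'Ram Thapa'
      AND PNAME = 'CAD/CAM'
      AND (DUR = 1 OR DUR = 2)
""").fetchall()

names = [r[0] for r in rows]
print(names)
conn.close()
```

With these rows, Ram Thapa is excluded by name, Hari KC by duration, and Gita Lama by project, so only Sita Rai remains.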
Initial operator tree
[Figure: initial operator tree with ∏ENAME (project) at the root]
Final operator tree (a more efficient query evaluation tree, since the more selective operations are performed
first)