
Unit 2 Query Plan

The document discusses query plan generation and optimization techniques in relational databases, emphasizing heuristic-based optimization and distributed query processing. It outlines the importance of execution plans, the role of the PLAN_TABLE for output, and various SQL operations like selection, Cartesian product, and joins. Additionally, it provides examples of size estimation for different SQL operations and highlights factors affecting execution plans and costs.

Uploaded by

entrib02

EMPLOYEE

EmpID EName Salary DeptNo DateOfJoining

DEPARTMENT

DNo DName Location


Step 2 − Query Plan Generation
After the query tree is generated, a query plan is made. A query plan is an extended
query tree that includes access paths for all operations in the query tree. Access paths
specify how the relational operations in the tree should be performed.
Heuristic Based Optimization
Heuristic-based optimization applies a fixed set of rules to transform the query tree.
These algorithms have polynomial time and space complexity, which is lower than the
exponential complexity of exhaustive search-based algorithms. However, they do not
necessarily produce the best query plan.
Some of the common heuristic rules are:
• Perform select and project operations before join operations. This is done by
moving the select and project operations down the query tree, which reduces the
number of tuples available for the join.
• Perform the most restrictive select/project operations first, before the other
operations.
• Avoid cross-product operations, since they result in very large intermediate
tables.
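
The first rule can be illustrated with a small sketch. The toy rows, the Salary threshold, and the nested-loop join below are assumptions made for illustration only; they are not part of the notes and do not reflect how any particular optimizer works. The sketch counts how many tuple pairs the join has to compare when the selection is applied after the join versus before it.

```python
# Hypothetical toy data; column names follow the EMPLOYEE/DEPARTMENT schema.
employees = [{"EmpID": i, "DeptNo": i % 5, "Salary": 1000 * (i % 10)}
             for i in range(100)]
departments = [{"DNo": d, "DName": f"D{d}"} for d in range(5)]


def nested_loop_join(emps, depts):
    """Join on DeptNo = DNo; return (joined rows, tuple pairs compared)."""
    compared = 0
    joined = []
    for e in emps:
        for d in depts:
            compared += 1
            if e["DeptNo"] == d["DNo"]:
                joined.append({**e, **d})
    return joined, compared


def is_high_salary(row):
    """Illustrative selection predicate (assumed threshold)."""
    return row["Salary"] >= 8000


# Plan A: join first, then select.
joined_a, compared_a = nested_loop_join(employees, departments)
result_a = [r for r in joined_a if is_high_salary(r)]

# Plan B: select first (selection pushed down the tree), then join.
filtered = [e for e in employees if is_high_salary(e)]
result_b, compared_b = nested_loop_join(filtered, departments)

print(compared_a, compared_b, result_a == result_b)
```

Both plans return the same rows, but the plan that selects first feeds far fewer tuples into the join.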
Distributed Query Processing Architecture

Distributed Query Optimization


Distributed query optimization requires the evaluation of a large number of query trees,
each of which produces the required results of a query. This is primarily due to the
presence of a large amount of replicated and fragmented data. Because searching all of
these trees is impractical, the target is to find a good, near-optimal solution rather
than the best solution.
The main issues for distributed query optimization are:
• Optimal utilization of resources in the distributed system.
• Query trading.
• Reduction of the solution space of the query.

Execution plans can differ due to the following:

• Different Schemas
• Different Costs

Different Schemas

• The execution and explain plan happen on different databases.
• The user explaining the statement is different from the user running the statement. Two
users might be pointing to different objects in the same database, resulting in different
execution plans.
• Schema changes (usually changes in indexes) between the two operations.

Different Costs

Even if the schemas are the same, the optimizer can choose different execution plans if the costs
are different. Some factors that affect the costs include the following:

• Data volume and statistics
• Bind variable types and values
• Initialization parameters, set globally or at session level

The PLAN_TABLE Output Table

The PLAN_TABLE is automatically created as a global temporary table to hold the output of
an EXPLAIN PLAN statement for all users. PLAN_TABLE is the default sample output table
into which the EXPLAIN PLAN statement inserts rows describing execution plans. The
following statements create and populate a sample table to explain queries against:

-- Sample table used by the EXPLAIN PLAN examples
create table employees
(firstname varchar(15),
lastname varchar(20),
age number(3),
address varchar(30),
city varchar(20),
state varchar(20));

-- Insert one sample row into the same table
insert into employees
(firstname, lastname, age, address, city, state)
values ('Luke', 'Duke', 45, '2130 Boars Nest',
'Hazard Co', 'Georgia');


To explain a SQL statement, use the EXPLAIN PLAN FOR clause immediately before the
statement. For example:

EXPLAIN PLAN FOR
SELECT lastname FROM employees;

Display the plan:

SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY());

To tag the plan with a statement identifier, add SET STATEMENT_ID:

EXPLAIN PLAN
SET STATEMENT_ID = 'st1' FOR
SELECT lastname FROM employees;

Display the output for that statement:

SELECT PLAN_TABLE_OUTPUT
FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'st1', 'TYPICAL'));

https://docs.oracle.com/cd/B19306_01/server.102/b14211/ex_plan.htm#i21501

https://docs.oracle.com/cd/B19306_01/server.102/b14211/ex_plan.htm#i17492

Selection
For example, suppose the column age has a maximum value of 25 and a minimum value of 18,
the selection condition is age <= 20, and the table has 100 records in total. Assuming
the ages are spread uniformly between the minimum and the maximum, the size estimate is
calculated as

100 * (20 - 18) / (25 - 18) ≈ 29

i.e., approximately 29 records can be fetched from the table with age <= 20. If the
condition is age < 15 instead, the estimate is zero, because 15 is below the minimum
value and no records with such an age are present.
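
The calculation above can be sketched as a small function. The function name and parameters are hypothetical, and the sketch assumes values are uniformly distributed between the column minimum and maximum; optimizers that keep histograms would estimate this differently.

```python
def estimate_selection_le(n_rows, col_min, col_max, value):
    """Estimate the rows satisfying column <= value under a uniform assumption."""
    if value < col_min:
        return 0          # below the minimum: no rows qualify
    if value >= col_max:
        return n_rows     # at or above the maximum: every row qualifies
    fraction = (value - col_min) / (col_max - col_min)
    return round(n_rows * fraction)


print(estimate_selection_le(100, 18, 25, 20))  # 29
print(estimate_selection_le(100, 18, 25, 15))  # 0
```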

Cartesian Product

Suppose we have 100 records in EMP and 5 records in DEPT, and that EMP has 5 columns
and DEPT has 3 columns. Then the Cartesian product will have 100 * 5 = 500 records in
the result set, each record carrying 5 + 3 = 8 columns.
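
As a quick check of this arithmetic, here is a minimal sketch; the function name is hypothetical. The row counts multiply, and each result row carries the columns of both inputs (5 + 3 = 8 in the example).

```python
def estimate_cartesian(rows_a, cols_a, rows_b, cols_b):
    """Return (row count, column count) of the Cartesian product A x B."""
    return rows_a * rows_b, cols_a + cols_b


rows, cols = estimate_cartesian(100, 5, 5, 3)  # EMP x DEPT
print(rows, cols)  # 500 8
```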

Left Outer Join

For example, assume a left outer join between EMP and PROJECT. Let EMP have 1000
records and PROJECT have 30 records.

The left outer join between these two tables contains the normal join between them for
the matching records, plus all other records from EMP for which there is no match in
PROJECT.

Taking the worst case for the join part (every pair of records matches, as in a
Cartesian product), its size estimate is (1000 * 30) + 1000 = 31000.

In the case of a right outer join, the size estimate would be (1000 * 30) + 30 = 30030,
and in the case of a full outer join, it is (1000 * 30) + 1000 + 30 = 31030.
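
The three estimates follow the same pattern, sketched below with a hypothetical helper. It treats the join part as its worst case (rows_left * rows_right), so the values are upper bounds rather than expected sizes.

```python
def outer_join_upper_bounds(rows_left, rows_right):
    """Upper-bound size estimates for left, right and full outer joins."""
    join_part = rows_left * rows_right  # worst case: every pair matches
    return {
        "left": join_part + rows_left,
        "right": join_part + rows_right,
        "full": join_part + rows_left + rows_right,
    }


bounds = outer_join_upper_bounds(1000, 30)  # EMP and PROJECT
print(bounds)  # {'left': 31000, 'right': 30030, 'full': 31030}
```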

Projection

This is the operation of selecting particular column(s) from a table. With duplicates
eliminated, its size estimate is equal to the number of distinct values of the
column(s) present in the table.

The size estimate of selecting the column EMP_ID in the EMP table is equal to the total
number of records in EMP, since EMP_ID is the primary key. Let us assume there are only 6
different values for AGE in EMP. Then the size estimate for projecting AGE from EMP is 6.
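
A minimal sketch of this estimate on hypothetical toy rows (12 employees whose ages take only 6 different values). It counts distinct values directly, whereas a real optimizer would read the distinct count from stored statistics.

```python
def estimate_projection(rows, column):
    """Size of a duplicate-eliminating projection: count of distinct values."""
    return len({row[column] for row in rows})


# Hypothetical EMP rows: unique EMP_ID, AGE cycling through 6 values.
emp = [{"EMP_ID": i, "AGE": 20 + i % 6} for i in range(12)]

print(estimate_projection(emp, "EMP_ID"))  # 12, one per record (primary key)
print(estimate_projection(emp, "AGE"))     # 6
```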

Aggregation

This is similar to projection. Here the distinct values of a column are grouped and
their aggregate values (like count, sum, max, min, etc.) are calculated. Hence its size
estimate is equal to the number of distinct values of the column present in the table.

The size estimate for calculating the number of employees in each department is equal to
the total number of distinct departments present in the table.
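
The department example can be sketched as follows. The department numbers are made-up sample data, and collections.Counter stands in for a GROUP BY with COUNT; the result has one row per distinct department.

```python
from collections import Counter

# Hypothetical DeptNo value for each of six employees.
emp_depts = [10, 20, 10, 30, 20, 10]

# Group by department and count the employees in each group.
employees_per_dept = Counter(emp_depts)

print(dict(employees_per_dept))  # {10: 3, 20: 2, 30: 1}
print(len(employees_per_dept))   # 3 result rows, one per distinct department
```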
