Query Processing and Query Optimization Techniques

The document discusses query processing and optimization techniques in database management systems (DBMS), focusing on measures of query cost, including disk access, CPU execution time, and communication costs. It outlines various optimization strategies such as adding or checking indexes, minimizing OR conditions, and employing cost-based and heuristic optimization methods to enhance query performance. Additionally, it highlights the challenges of heuristic optimization, including potential suboptimality and the importance of accurate cost estimation.

Uploaded by

utkarx10106
Copyright
© All Rights Reserved
Query Processing and Query Optimization Techniques
Measures of Query Cost in DBMS
• Query cost is the optimizer's estimate of how long a query will take to run (relative to the total batch time).
• The optimizer tries to pick the best query plan by examining the query and the statistics on the data, trying several execution plans, and choosing the cheapest of them.
• Query cost can be measured by building a framework that generates multiple plans for a query.
• The choice is then made by comparing these plans.
Conti…
• To compute the net estimated cost of a plan, the cost of each operation in the plan is estimated and the results are combined into the net estimated cost of the query evaluation plan.
• Example: We use the number of block transfers from disk and the number of disk seeks to estimate the cost of a query evaluation plan.
• Assume the disk subsystem takes an average of tT seconds to transfer a block of data and has an average block access time (disk seek time plus rotational latency) of tS seconds.
• Then an operation that transfers b blocks and performs S seeks takes b * tT + S * tS seconds.
• tT – time to transfer one block
• tS – time for one seek
• Cost for b block transfers plus S seeks: b * tT + S * tS
• The values of tT and tS must be calibrated for the disk system in use, but typical values for high-end disks today would be tS = 4 milliseconds and tT = 0.1 milliseconds, assuming a 4-kilobyte block size and a transfer rate of 40 megabytes per second.
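The formula b * tT + S * tS can be tried out in a few lines of Python. This is only a minimal sketch: the function name `estimate_io_seconds` is hypothetical, and its defaults are simply the typical values quoted above, not measurements of any real disk.

```python
def estimate_io_seconds(b, S, tT=0.0001, tS=0.004):
    """Estimated I/O time in seconds for b block transfers and S seeks.

    The defaults are the typical values quoted in the text
    (tT = 0.1 ms, tS = 4 ms); real values must be calibrated per disk.
    """
    return b * tT + S * tS


# Hypothetical operation: 100 block transfers and 10 seeks.
cost = estimate_io_seconds(b=100, S=10)
print(f"{cost:.3f} s")
```

With these values the seeks dominate: 10 seeks cost four times as much as 100 block transfers, which is why plans that read blocks sequentially are usually estimated as cheaper.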
The cost of a query evaluation plan is estimated in terms of the following resources:
• The number of disk accesses.
• The CPU time taken to execute the query.
• The communication costs, in distributed or parallel database systems.
To estimate the cost of a query evaluation plan, we use the number of blocks transferred from disk and the number of disk seeks. In general, when estimating the cost we consider the worst case that could occur.
Query Optimization Techniques

• The process of selecting the most effective way to carry out a SQL statement is known as query optimization.
• Due to SQL’s non-procedural nature, the optimizer is permitted
to merge, rearrange, and process data in any sequence.
• Based on statistics gathered regarding the accessed data, the
database optimizes each SQL statement.
• The optimizer evaluates various access techniques, such as full
table scans or index scans, various join techniques, such as
nested loop joins and hash joins, various join orders, and
potential transformations to identify the best plan for a SQL
query.
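The choice between a full table scan and an index scan can be observed directly. The sketch below uses SQLite through Python's standard `sqlite3` module purely as a convenient stand-in; the `orders` table, its columns, and the index name are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3


def plan_details(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical table, used only for illustration.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 50, i * 1.5) for i in range(1000)],
)

query = "SELECT id, total FROM orders WHERE customer_id = 7"

before = plan_details(conn, query)  # no usable index: the table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan_details(conn, query)   # the optimizer can now search the index

print(before)
print(after)
```

The SQL statement is identical in both runs; only the available access paths changed, and the optimizer picked a different plan on its own.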
SQL Query Optimization Techniques

• Add missing indexes:- Adding missing indexes to the SQL database can improve query performance. Think of indexes as a roadmap that lets the database engine locate data quickly.
• Check for unused indexes:- Checking for unused indexes is an important aspect of SQL query optimization. While indexes speed up reads, too many or unnecessary indexes consume storage space and slow down data modification operations such as inserts, updates, and deletes.
• Avoid using multiple OR in the FILTER predicate:- Minimizing the use of multiple OR conditions in the FILTER predicate is a significant strategy in optimizing SQL queries. When numerous OR conditions are used, the database engine may have to evaluate each condition individually, which hurts query performance.

SELECT column1, column2, … FROM your_table
WHERE condition_column = 'value'
AND (another_condition_column = 'value1' OR
another_condition_column = 'value2');
• Avoid too many JOINs
• Avoid using SELECT DISTINCT
SELECT column1, column2, … FROM your_table GROUP BY column1, column2, …
• Use SELECT fields instead of SELECT *
SELECT column1, column2, … FROM your_table
• Use TOP to sample query results
SELECT TOP 10 column1, column2, … FROM your_table
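Several of these rewrites can be checked for equivalence on a small data set. The sketch below again assumes SQLite via Python's `sqlite3` module; the `status` column and the sample rows are hypothetical. Replacing a chain of ORs on one column with an IN list is one common rewrite, not the only one, and whether it is actually faster depends on the database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, for illustration only.
conn.execute("CREATE TABLE your_table (column1 TEXT, column2 TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [("a", "x", "new"), ("a", "x", "paid"), ("b", "y", "new"), ("c", "z", "void")],
)


def rows(sql):
    """Run a query and return its rows sorted, so results compare by content."""
    return sorted(conn.execute(sql).fetchall())


# Several ORs on one column can be written as a single IN list.
with_or = rows(
    "SELECT column1, status FROM your_table "
    "WHERE status = 'new' OR status = 'paid'"
)
with_in = rows(
    "SELECT column1, status FROM your_table WHERE status IN ('new', 'paid')"
)

# SELECT DISTINCT and GROUP BY on the same columns return the same rows.
distinct = rows("SELECT DISTINCT column1, column2 FROM your_table")
grouped = rows("SELECT column1, column2 FROM your_table GROUP BY column1, column2")

# TOP is SQL Server syntax; SQLite expresses the same idea with LIMIT.
sample = conn.execute("SELECT column1, column2 FROM your_table LIMIT 2").fetchall()
```

Because each pair returns the same rows, the rewrite is safe to apply; whether it pays off should be confirmed by inspecting the execution plan on the target database.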
Cost-Based Optimization

• Query optimization is the process of choosing the most efficient way of executing an SQL statement.
• Query optimization is both an art and a science: it applies rules to rewrite the tree of operators that a query invokes in order to produce an optimal plan.
• A plan is said to be optimal if it returns the answer in the least time or using the least space.
• For a given query and environment, the optimizer assigns a numerical cost to each step of a possible plan and then combines these values into a cost estimate for the plan.
• After calculating the costs of all possible plans, the optimizer chooses the plan with the lowest cost estimate. For that reason, the optimizer is sometimes referred to as the Cost-Based Optimizer. Below are some of the features of cost-based optimization:
• Cost-based optimization is based on the cost of the query to be optimized.
• The query can follow many paths, depending on the indexes, available sorting methods, constraints, etc.
• The aim of query optimization is to choose the most efficient path for implementing the query, that is, the algorithm with the lowest possible cost.
• The query optimizer must estimate the cost of executing each algorithm so that the most suitable one can be selected for an operation.
• The cost of an algorithm also depends on the cardinality of the input.
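The selection step described above (cost each step, combine the step costs, pick the cheapest plan) can be sketched as follows. The plan names and all cost figures are made up for illustration, and a real optimizer would derive the step costs from statistics rather than take them as input.

```python
def plan_cost(step_costs):
    """Combine per-step cost estimates into one estimate for the plan."""
    return sum(step_costs)


def choose_plan(plans):
    """Return (name, cost) of the candidate plan with the lowest estimate."""
    name = min(plans, key=lambda n: plan_cost(plans[n]))
    return name, plan_cost(plans[name])


# Hypothetical candidate plans with made-up per-step cost estimates.
candidates = {
    "full_scan_then_hash_join": [120.0, 45.0],
    "index_scan_then_nested_loop": [15.0, 90.0],
    "index_scan_then_hash_join": [15.0, 45.0],
}

best, cost = choose_plan(candidates)
print(best, cost)
```

Note that the chosen plan is only as good as the estimates: if a step cost is badly wrong, the optimizer can confidently pick a poor plan.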
Cost Estimation:
To estimate the cost of the available execution plans (execution strategies), the query tree is treated as a data structure containing a series of basic operations that are linked together to perform the query. The cost of a query depends on the following:
• Cardinality-
Cardinality is the number of rows returned by the operations in the query execution plan. Cardinality estimates must be accurate because they strongly affect the choice of execution plan.
• Selectivity-
Selectivity is the fraction of rows that satisfy a condition. It depends on the condition itself: the more restrictive the condition, the fewer rows are selected. The condition can be any predicate, depending on the situation.
• Cost-
Cost is the amount of resources the system spends to execute the query. It depends on the work done and the resources used, such as disk I/O, CPU time, and memory.
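Cardinality and selectivity are linked: the estimated cardinality of a filter is the input row count multiplied by the selectivity of its condition. The sketch below assumes values are uniformly distributed, which is a common simplification and often wrong for skewed data; the function names and the statistics are hypothetical.

```python
def equality_selectivity(distinct_values):
    """Selectivity of `column = constant`, assuming uniformly distributed values."""
    return 1.0 / distinct_values


def estimated_cardinality(table_rows, selectivity):
    """Estimated number of rows that satisfy a predicate."""
    return table_rows * selectivity


# Hypothetical statistics: 10,000 rows, 50 distinct values in the filtered column.
sel = equality_selectivity(50)
card = estimated_cardinality(10_000, sel)
print(sel, card)
```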
Cost Components Of Query Execution:
• Access cost to secondary storage-
This is the cost of searching for, reading, or writing data blocks that reside on secondary storage, typically disk. The cost of searching for records in a file also depends on the type of access structure the file has.
• Memory usage cost-
This is estimated from the number of memory buffers needed to execute the query.
• Storage cost-
This is the cost of storing any intermediate files (files that result from processing the input but are not the final result) generated by the execution strategy for the query.
• Computational cost-
This is the cost of performing in-memory operations on the records in the data buffers, such as searching for, merging, or sorting records. It is also called the CPU cost.
• Communication cost-
This is the cost of sending the query and its results from one site to another. It also includes the cost of transferring tables and results between sites during query evaluation.
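One simple way to hold these five components together is a small record whose total is their sum. This is a sketch under a strong assumption: every component has already been converted to the same abstract cost unit. The class name, field names, and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PlanCost:
    """One estimate per cost component, all in the same abstract unit."""
    secondary_storage: float
    memory: float
    storage: float
    computation: float
    communication: float = 0.0  # zero for a single-site (non-distributed) database

    def total(self):
        """Sum of all five components."""
        return (self.secondary_storage + self.memory + self.storage
                + self.computation + self.communication)


# Hypothetical estimates for one plan on a single-site database.
local_plan = PlanCost(secondary_storage=80.0, memory=5.0, storage=10.0, computation=12.0)
print(local_plan.total())
```

In this made-up example the secondary storage component dominates the total; in a distributed system the communication component would no longer be zero.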
Heuristic Optimization
• Database management systems (DBMS) use optimization techniques to produce better-performing queries and improve overall system performance.
• Among these techniques, heuristic optimization is a leading one. It relies on rules of thumb and simple methods, rather than complex ones, to optimize query execution plans quickly.
• In a DBMS, heuristic optimization explores the space of possible execution plans quickly and efficiently, without examining every plan exhaustively.
• Heuristic optimization rules, built on empirical knowledge and approximate algorithms, speed up this process.
• By applying heuristics, DBMS optimizers converge quickly on an efficient query execution plan while keeping the computational cost of optimization low.
Key Components of Heuristic Optimization

• 1. Cost-Based Heuristics: Cost estimation is a critical process in a DBMS, and heuristic strategies are applied to it. The cost of each candidate query execution plan is estimated from data distribution statistics, system parameters, and the characteristics of the physical machine, and the estimates guide plan selection.
• e.g., statistics-based cost estimation and cardinality estimation are commonly used to approximate the cost of each candidate plan. By favoring plans with lower estimated cost, heuristic optimization keeps resource consumption low and improves overall query performance.
• 2. Join Order Heuristics: When a query involves multiple tables, choosing a good join order is necessary to reduce query execution time. Join ordering heuristics include greedy algorithms, dynamic programming approaches, and others.
• 3. Index Selection Heuristics: Indexes accelerate query processing by making data retrieval fast. Index selection heuristics choose, from the available indexes, the ones most beneficial for executing the query.
• 4. Query Rewriting Heuristics: These heuristics rewrite and transform statements into semantically equivalent ones that can be executed more efficiently.
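A greedy join ordering heuristic, one of the approaches named above, can be sketched as follows. It starts from the smallest table and repeatedly adds the table that keeps the estimated intermediate result smallest. The table names, row counts, and join selectivities are hypothetical, and the size formula (product of row counts times predicate selectivities) is a deliberately simple estimate.

```python
def greedy_join_order(table_rows, join_selectivity):
    """Greedy heuristic for choosing a join order.

    table_rows: {table: estimated row count}
    join_selectivity: {frozenset({a, b}): selectivity of the join predicate}
    Pairs without an entry are treated as cross products (selectivity 1.0).
    Returns (join order, estimated size of the final result).
    """
    remaining = dict(table_rows)
    first = min(remaining, key=remaining.get)
    order, current_rows = [first], remaining.pop(first)

    while remaining:
        def size_after(t):
            # Apply every join predicate between t and the tables joined so far.
            sel = 1.0
            for joined in order:
                sel *= join_selectivity.get(frozenset({joined, t}), 1.0)
            return current_rows * remaining[t] * sel

        nxt = min(remaining, key=size_after)
        current_rows = size_after(nxt)
        remaining.pop(nxt)
        order.append(nxt)
    return order, current_rows


# Hypothetical statistics for three tables.
tables = {"customers": 1000, "orders": 100000, "regions": 10}
selectivities = {
    frozenset({"customers", "orders"}): 0.001,
    frozenset({"customers", "regions"}): 0.1,
}
order, final_rows = greedy_join_order(tables, selectivities)
print(order, final_rows)
```

The greedy choice is fast because it never backtracks, which is also why it can miss the best order; dynamic programming explores more orders at a higher planning cost.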
Challenges and Considerations

• While heuristic optimization offers significant advantages in speed and efficiency, it is not without challenges and considerations:

• Suboptimality: Heuristic approaches can yield query execution plans that are worse than those found by exhaustive optimization. In return, they avoid an exhaustive search and rely on broadly applicable rules, so they remain useful for many queries.

• Cost Estimation Accuracy: The outcome of heuristic optimization relies on the accuracy of the cost estimates, which may be affected by the size of the data, query complexity, and system dynamics, among other factors.

• Trade-offs: Heuristic optimization trades optimality for efficiency, so the speed of planning has to be balanced against the quality of the resulting plan.
