
Sudhansu, DBMS 3rd

The document is an assignment on Database Management Systems (DBMS) that covers query processing and optimization, evaluation of relational algebra expressions, query equivalence, and query optimization algorithms. It explains the three steps of query processing: parsing, optimization, and execution, along with methods like materialization and pipelining for evaluating relational algebra. Additionally, it discusses the significance of query optimization in enhancing performance and resource efficiency in database systems.

Uploaded by

sudhansu maurya

GIANI ZAIL SINGH CAMPUS COLLEGE

ENGINEERING & TECHNOLOGY


Bathinda, Punjab

Assignment of DBMS-3rd

Name:-sudhansu

Branch: C S E ‘B’

Batch :-2k18

Roll no : 180280073/184075

Submitted to:-
Er.Shimardeep mam

Q1: What do you mean by Query Processing and optimization?

Ans:-
Query processing covers everything needed to turn a high-level query into a
result: translating the query into low-level expressions that can be used at the
physical level of the file system, optimizing it, and actually executing it. It is
a three-step process that consists of parsing and translation, optimization, and
execution of the query submitted by the user. These steps are discussed below:

• Step-1:

Parser: During the parse call, the database converts the query into relational
algebra and performs three checks: a syntax check, a semantic check and a shared
pool check.
• Syntax check – verifies that the statement is syntactically valid SQL.
Example: SELECT * FORM employee
Here the misspelling of FROM is caught by this check.
• Semantic check – determines whether the statement is meaningful or not.
Example: a query that names a table which does not exist is rejected by this
check.

• Shared pool check – Every query is assigned a hash code computed from its
text. This check looks for that hash code in the shared pool. If the code
already exists there, the database skips the additional optimization steps
and reuses the existing plan for execution.
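
The difference between the syntax check and the semantic check can be observed with a small sketch. It uses Python's built-in sqlite3 module purely as a convenient SQL engine; SQLite has no shared pool, and the classify helper and the department table are assumptions made for this illustration.

```python
import sqlite3


def classify(sql, conn):
    """Report which parse-time check rejects `sql`, or 'ok' if it passes both."""
    try:
        conn.execute(sql)
    except sqlite3.OperationalError as exc:
        message = str(exc)
        if "syntax error" in message:
            return "syntax check failed"      # statement is not valid SQL
        if "no such table" in message:
            return "semantic check failed"    # valid SQL, but refers to a missing object
        return "other error"
    return "ok"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (id INTEGER, name TEXT)")

results = {
    "misspelled FROM": classify("SELECT * FORM employee", conn),
    "missing table": classify("SELECT * FROM employee", conn),
    "valid query": classify("SELECT * FROM department", conn),
}
```

The first statement never gets as far as looking up the table, because it is rejected as soon as the grammar is violated; the second is grammatically fine and fails only when the table name is resolved.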

Hard Parse and Soft Parse –


If a query is new and its hash code does not exist in the shared pool, the query
has to pass through additional steps; this is known as hard parsing. If the hash
code does exist, the query skips those steps and passes directly to the execution
engine; this is known as soft parsing.
Hard parse includes the following steps – optimization and row source generation.
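
The hard parse / soft parse decision can be sketched as a cache lookup keyed by a hash of the query text. This is a toy model, not how any particular database implements its shared pool; the PlanCache class, its counters and the placeholder _optimize method are hypothetical names introduced for illustration.

```python
import hashlib


class PlanCache:
    """Toy stand-in for a shared pool: maps a hash of the SQL text to a plan."""

    def __init__(self):
        self._plans = {}
        self.hard_parses = 0
        self.soft_parses = 0

    def parse(self, sql):
        key = hashlib.sha256(sql.encode("utf-8")).hexdigest()
        if key in self._plans:
            # Soft parse: the hash is already known, so reuse the stored plan.
            self.soft_parses += 1
            return self._plans[key]
        # Hard parse: run the expensive steps once and remember the result.
        self.hard_parses += 1
        plan = self._optimize(sql)
        self._plans[key] = plan
        return plan

    def _optimize(self, sql):
        # Placeholder for optimization and row source generation.
        return f"plan for: {sql}"


cache = PlanCache()
cache.parse("SELECT * FROM employee")    # new query: hard parse
cache.parse("SELECT * FROM employee")    # same text again: soft parse
cache.parse("SELECT * FROM department")  # different text: hard parse
```

After these three calls the cache has performed two hard parses and one soft parse, since only the repeated statement finds its hash already stored.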

• Step-2:

Optimizer: During the optimization stage, the database must perform a hard parse
at least once for every unique DML statement, and it performs the optimization
during this parse. The database never optimizes DDL unless it includes a DML
component, such as a subquery, that requires optimization.
Optimization is the process in which multiple execution plans that satisfy a query
are examined and the most efficient plan is selected for execution.
The optimizer uses statistics kept in the database catalog to estimate the cost of
each plan and passes the lowest-cost plan on for execution.
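
Selecting the lowest-cost plan can be sketched as follows. The candidate plans and their cost figures are made-up numbers for illustration; a real optimizer derives its estimates from statistics rather than taking them as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    description: str
    estimated_cost: float  # abstract cost units, hypothetical values


def choose_plan(candidates):
    """Return the candidate with the smallest estimated cost."""
    if not candidates:
        raise ValueError("at least one candidate plan is required")
    return min(candidates, key=lambda plan: plan.estimated_cost)


candidates = [
    Plan("full table scan, then filter", 1000.0),
    Plan("index lookup on employee.id", 12.0),
    Plan("index scan on employee.name, then filter", 240.0),
]
best = choose_plan(candidates)
```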
Row Source Generation –
The row source generator is software that receives the optimal execution plan
from the optimizer and produces an iterative execution plan that is usable by the
rest of the database. The iterative plan is a binary program that, when executed
by the SQL engine, produces the result set.

• Step-3:
Execution Engine: Finally, the execution engine runs the query and displays the
required result.

Q2: Explain evaluation of relational algebra expressions?

Ans:-
Relational algebra is a procedural query language, which takes instances of
relations as input and yields instances of relations as output. It uses
operators to perform queries, and each operator is either unary or binary.
When a query is submitted, it is first scanned, parsed and validated. An
internal representation of the query, such as a query tree or a query graph,
is then created. Alternative execution strategies are then devised for
retrieving results from the database tables.

Materialization: In this method, the expression is evaluated one relational
operation at a time, in an appropriate order. The output of each operation is
materialized in a temporary relation so that subsequent operations can use it.
The disadvantage of materialization is that these temporary relations have to
be constructed to hold the results of the evaluated operations, and unless
they are small in size they are written to disk.

Pipelining: Pipelining is an alternative to the materialization method. It
evaluates the relational operations of an expression simultaneously in a
pipeline: as soon as one operation produces output, that output is passed on
to the next operation, and the chain continues until all the relational
operations have been evaluated. Thus, there is no need to store a temporary
relation in pipelining, which often makes it cheaper than materialization.
The costs of the two approaches can therefore differ. However, each approach
performs best in different cases; for example, sorting must read all of its
input before it can emit any output, so it cannot be fully pipelined. Thus,
both approaches have their place.
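
The contrast between the two methods can be sketched in Python, with lists standing in for temporary relations and generators standing in for a pipeline. The employees data and all function names are assumptions made for this illustration.

```python
employees = [
    {"name": "Asha", "dept": "CSE", "salary": 50000},
    {"name": "Ravi", "dept": "ECE", "salary": 45000},
    {"name": "Meena", "dept": "CSE", "salary": 60000},
]


def materialized(rows):
    """Each operation runs to completion and stores its full output."""
    selected = [r for r in rows if r["dept"] == "CSE"]  # temporary relation 1
    projected = [r["name"] for r in selected]            # temporary relation 2
    return projected


def select_cse(rows):
    """Selection as a generator: emits each qualifying row as soon as it is seen."""
    for r in rows:
        if r["dept"] == "CSE":
            yield r


def project_name(rows):
    """Projection as a generator: consumes rows one at a time from its input."""
    for r in rows:
        yield r["name"]


def pipelined(rows):
    """Chain the operations; no intermediate list is ever built."""
    return project_name(select_cse(rows))


materialized_result = materialized(employees)
pipelined_result = list(pipelined(employees))
```

Both versions produce the same names; the difference is that the pipelined version holds only one row at a time between the two operations instead of a complete intermediate list.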
Q3. Define Query equivalence?
Ans:-
Two relational expressions are said to be equivalent if they generate the same
set of records on every legal database instance. When two expressions are
equivalent, we can use them interchangeably, i.e., we can use whichever of the
expressions gives better performance.

Different equivalent expressions exist for different types of operations. The
equivalence rules define how to write equivalent expressions for each of the operators.
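
One well-known equivalence rule is that a selection which refers only to attributes of one relation can be applied before a join instead of after it. The sketch below checks this on a small example; the tables, the select and join helpers and the is_cse predicate are hypothetical, and agreement on one instance only illustrates the rule rather than proving it.

```python
def select(rows, predicate):
    """Relational selection: keep the rows that satisfy the predicate."""
    return [r for r in rows if predicate(r)]


def join(left, right, key):
    """Equi-join on a shared attribute name, implemented as a simple nested loop."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def is_cse(row):
    return row["dept_name"] == "CSE"


employees = [
    {"emp": "Asha", "dept_id": 1},
    {"emp": "Ravi", "dept_id": 2},
    {"emp": "Meena", "dept_id": 1},
]
departments = [
    {"dept_id": 1, "dept_name": "CSE"},
    {"dept_id": 2, "dept_name": "ECE"},
]

# Selection applied after the join.
late = select(join(employees, departments, "dept_id"), is_cse)
# Selection pushed down onto departments before the join.
early = join(employees, select(departments, is_cse), "dept_id")
```

Both expressions return the same two records, but the second one joins against a single department row instead of filtering a three-row join result afterwards.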

Q4. Explain query optimization algorithms?

Ans:-
Query: A query is a request for information from a database.

Query Optimization: A single query can be executed through different algorithms
or rewritten in different forms and structures. Hence the question of query
optimization arises: which of these forms or pathways is the most efficient?
The query optimizer attempts to determine the most efficient way to execute a
given query by considering the possible query plans.
Importance: The goal of query optimization is to reduce the system resources
required to fulfill a query, and ultimately provide the user with the correct result set
faster.
• First, it provides the user with faster results, which makes the application
seem faster to the user.
• Secondly, it allows the system to service more queries in the same
amount of time, because each request takes less time than it would without optimization.
• Thirdly, query optimization ultimately reduces the amount of wear on the
hardware (e.g. disk drives), and allows the server to run more efficiently (e.g.
lower power consumption, less memory usage).

There are broadly two ways a query can be optimized:

• Analyze and transform equivalent relational expressions: Try to minimize
the tuple and column counts of the intermediate and final results of the
query.
• Use different algorithms for each operation: The underlying algorithms
determine how tuples are accessed from the data structures they are stored in
(indexing, hashing, data retrieval) and hence influence the number of disk and
block accesses.
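
The second approach can be illustrated by implementing the same join with two different algorithms and counting the work each one does. This is a minimal in-memory sketch: the counters are only a rough proxy for disk and block accesses, and the orders and customers tables are invented for the example.

```python
def nested_loop_join(left, right, key):
    """Compare every left row with every right row."""
    comparisons = 0
    out = []
    for l in left:
        for r in right:
            comparisons += 1
            if l[key] == r[key]:
                out.append({**l, **r})
    return out, comparisons


def hash_join(left, right, key):
    """Build a hash table on the right input, then probe it once per left row."""
    buckets = {}
    for r in right:
        buckets.setdefault(r[key], []).append(r)
    probes = 0
    out = []
    for l in left:
        probes += 1
        for r in buckets.get(l[key], []):
            out.append({**l, **r})
    return out, probes


orders = [{"order_id": i, "cust_id": i % 10} for i in range(100)]
customers = [{"cust_id": c, "name": f"customer{c}"} for c in range(10)]

nl_rows, nl_work = nested_loop_join(orders, customers, "cust_id")
hj_rows, hj_work = hash_join(orders, customers, "cust_id")
```

With 100 orders and 10 customers, the nested-loop version performs 1000 comparisons while the hash version performs 100 probes, and both return the same 100 joined rows.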

Cost-Based Optimization (also known as Cost-Based Query Optimization or the CBO
optimizer) is an optimization technique in Spark SQL that uses table statistics
to determine the most efficient query execution plan of a structured query (given
the logical query plan). Cost-based optimization is disabled by default.
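
How table statistics feed into a cost estimate can be sketched with the common textbook rule for the size of an equi-join: the product of the row counts divided by the larger number of distinct join-key values, which assumes the values are uniformly distributed. This is not Spark SQL's actual implementation; the TableStats class and the figures in it are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableStats:
    row_count: int
    distinct_join_keys: int  # number of distinct values of the join attribute


def estimate_join_rows(left, right):
    """Textbook equi-join size estimate under a uniform-distribution assumption."""
    divisor = max(left.distinct_join_keys, right.distinct_join_keys)
    return left.row_count * right.row_count // divisor


def pick_build_side(stats):
    """Choose the smaller table as the side to load into memory for a hash join."""
    return min(stats, key=lambda name: stats[name].row_count)


stats = {
    "orders": TableStats(row_count=1_000_000, distinct_join_keys=50_000),
    "customers": TableStats(row_count=50_000, distinct_join_keys=50_000),
}
estimated_rows = estimate_join_rows(stats["orders"], stats["customers"])
build_side = pick_build_side(stats)
```

Without statistics such as row counts and distinct-value counts, the optimizer would have no basis for either decision, which is why cost-based optimization depends on statistics having been collected.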
