ch2 PDF

Uploaded by

Hana Yaregal

Chapter 2

Query Processing and Optimization

12/10/2024
Query processing & optimization 1
Query Processing…
• Query processing refers to the range of activities involved in extracting data from a database.
• These activities include:
  • translation of high-level queries into low-level expressions that can be used at the physical level of the file system,
  • query optimization, and
  • actual execution of the query to get the result.

Query Processing & Optimization Process
[Figure: stages of query processing inside the DBMS. Query → Scanner → Parser → Internal representation → Optimizer (chooses among execution strategies) → Execution plan → Code generator → Runtime database processor (reads the data) → Answer.]

Query Processing…
• The scanner identifies the query tokens that appear in the text of the query, such as:
  • SQL keywords,
  • attribute names, and
  • relation names.
• The parser checks the query syntax to determine whether it is formulated according to the syntax rules (rules of grammar) of the query language.
• The query must also be validated by checking that all attribute and relation names are valid and semantically meaningful names in the schema of the particular database being queried.

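As a rough illustration of the scanner step, the sketch below splits a query string into keyword, identifier, number, and operator tokens. The keyword set, the token kinds, and the `scan` function are assumptions made for this example; they do not describe the scanner of any particular DBMS.

```python
import re

# Hypothetical, tiny keyword set; a real SQL scanner recognizes many more.
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR"}

# One token per match: a number, a (possibly qualified) name, or an operator.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_][A-Za-z_0-9.]*)|(<=|>=|<>|[=<>,*()]))")

def scan(query):
    """Return a list of (kind, text) tokens for a simple SQL query."""
    tokens = []
    pos = 0
    query = query.rstrip()
    while pos < len(query):
        match = TOKEN_RE.match(query, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        number, word, op = match.groups()
        if number is not None:
            tokens.append(("NUMBER", number))
        elif word is not None:
            kind = "KEYWORD" if word.upper() in KEYWORDS else "IDENTIFIER"
            tokens.append((kind, word))
        else:
            tokens.append(("OPERATOR", op))
        pos = match.end()
    return tokens

tokens = scan("SELECT balance FROM account WHERE balance > 2500")
```

The parser would then check that this token sequence follows the grammar, and validation would confirm that `balance` and `account` exist in the schema.
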
Query Processing…
• An internal representation of the query is then created, usually as a tree data structure called a query tree.
• The DBMS must then devise an execution strategy for retrieving the results of the query from the database files.
• A query typically has many possible execution strategies, and the process of choosing a suitable one for processing a query is known as query optimization.

Query Processing…
• The query optimizer module has the task of producing a good execution plan.
• The code generator generates the code to execute that plan.
• The runtime database processor has the task of running (executing) the query code, whether in compiled or interpreted mode, to produce the query result.
• If a runtime error results, an error message is generated by the runtime database processor.

Query Processing… Query tree
• SQL query:

Select balance
From account
Where balance > 2500

• Relational algebra expressions:
  • π_{balance}(σ_{balance>2500}(account))
  • σ_{balance>2500}(π_{balance}(account))
• Both are equivalent queries, i.e. they display the same results.

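The equivalence of the two expressions can be checked on a small example. The sketch below is only an illustration: the sample `account` rows and the helper names `select` and `project` are assumptions, and relations are modeled as lists of dictionaries.

```python
# Hypothetical sample data for the account relation.
account = [
    {"account_no": "A-101", "balance": 500},
    {"account_no": "A-102", "balance": 3000},
    {"account_no": "A-103", "balance": 2600},
]

def select(rows, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def project(rows, attrs):
    """Projection with duplicate elimination (relational algebra semantics)."""
    seen = []
    for row in rows:
        reduced = {a: row[a] for a in attrs}
        if reduced not in seen:
            seen.append(reduced)
    return seen

def over_2500(row):
    return row["balance"] > 2500

plan_a = project(select(account, over_2500), ["balance"])   # select, then project
plan_b = select(project(account, ["balance"]), over_2500)   # project, then select
```

Both orders return the same rows here because the predicate only uses `balance`, which survives the projection.
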
Generally, Basic Steps in Query Processing
There are three phases that a query passes through during the DBMS's processing of that query:
• Parsing and translation
• Optimization
• Evaluation

Steps in Query Processing
1. Parsing and translation
• The query is translated into its internal form, which is then translated into relational algebra.
• The parser checks the syntax and verifies the relations.
2. Optimization
• The query optimizer translates a relational algebra expression into an execution plan.
• A relational algebra expression may have many equivalent expressions, each of which gives rise to a different evaluation plan.
• Among all equivalent evaluation plans, choose the one with the lowest cost.

2. Optimization…
• An annotated expression specifying a detailed evaluation strategy is called an evaluation plan.
• Query optimization: among all equivalent evaluation plans, choose the one with the lowest cost. For example, these two expressions are equivalent:
  • σ_{balance<2500}(π_{balance}(account))
  • π_{balance}(σ_{balance<2500}(account))
• Cost is estimated using statistical information from the database catalog, e.g. the number of tuples in each relation, the size of tuples, etc.
• Total cost = CPU cost + I/O cost + communication cost

Operations and Costs
• Operations: σ, π, ∪, ∩, −, ×, ⋈
• Costs:
  • N_R: number of records in R
  • L_R: size of a record in R
  • F_R: blocking factor (number of records per page)
  • B_R: number of pages needed to store relation R
  • V(A,R): number of distinct values of attribute A in R
  • SC(A,R): selection cardinality of A in R
    • A is a key: SC(A,R) = 1
    • A is a nonkey: SC(A,R) = N_R / V(A,R)
  • HT_i: number of levels in index i
• Fractions and logarithms are rounded up.

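A minimal sketch of how these statistics relate to one another, under two assumptions that the slide does not state: records are unspanned, so F_R = ⌊page size / L_R⌋, and B_R = ⌈N_R / F_R⌉. The function names and the example figures are hypothetical.

```python
import math

def blocking_factor(page_size, record_size):
    """F_R: whole records that fit in one page (assumes unspanned records)."""
    return page_size // record_size

def pages_needed(num_records, f_r):
    """B_R: pages needed to store the relation, rounded up."""
    return math.ceil(num_records / f_r)

def selection_cardinality(num_records, distinct_values, is_key=False):
    """SC(A,R): expected number of records matching an equality on A."""
    if is_key:
        return 1
    return math.ceil(num_records / distinct_values)

# Hypothetical relation: 10,000 records of 100 bytes on 4,096-byte pages.
f_r = blocking_factor(4096, 100)          # 40 records per page
b_r = pages_needed(10_000, f_r)           # 250 pages
sc = selection_cardinality(10_000, 50)    # 200 records per distinct value
```
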
Steps in Query Processing
3) Evaluation
• The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.

Query Processing vs. Optimization
• Query processing
  • How to measure query costs
  • Algorithms for evaluating relational algebra operations
  • How to combine algorithms for individual operations in order to evaluate a complete expression
• Query optimization
  • How to optimize queries, that is, how to find an evaluation plan with the lowest estimated cost

Query tree
• A query tree is a tree data structure that corresponds to a relational algebra expression.
• It represents the input relations of the query as leaf nodes of the tree, and the relational algebra operations as internal nodes.
• An execution of the query tree consists of executing an internal node operation whenever its operands are available, and then replacing that internal node by the relation that results from executing the operation.

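The bottom-up execution described above can be sketched as a small recursive evaluator. This is an illustrative sketch only: the `Node` class, its fields, and the restriction to unary select and project nodes are assumptions made to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Node:
    """A query-tree node: a leaf holds a relation, an internal node an operation."""
    op: str                                   # "relation", "select" or "project"
    child: Optional["Node"] = None
    rows: Optional[List[dict]] = None         # used by leaf nodes only
    predicate: Optional[Callable] = None      # used by select nodes only
    attrs: Optional[List[str]] = None         # used by project nodes only

def execute(node):
    """Evaluate bottom-up: an operation runs once its operand is available."""
    if node.op == "relation":
        return node.rows
    operand = execute(node.child)             # the child is replaced by its result
    if node.op == "select":
        return [r for r in operand if node.predicate(r)]
    if node.op == "project":
        # Duplicates are not removed here, to keep the sketch brief.
        return [{a: r[a] for a in node.attrs} for r in operand]
    raise ValueError(f"unknown operation {node.op!r}")

account = Node("relation", rows=[{"name": "x", "balance": 900},
                                 {"name": "y", "balance": 3100}])
tree = Node("project", attrs=["balance"],
            child=Node("select", predicate=lambda r: r["balance"] > 2500,
                       child=account))
result = execute(tree)
```
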
Relational Algebra: Project
• π_{<attr list>}(R)
• <attr list> is a list of attributes (columns) from R only
• Ex: π_{title, year, length}(Movie), a "horizontal restriction" (the result has fewer columns)
[Figure: a relation with n columns A1 … An is projected onto k columns A1 … Ak, with n ≥ k.]
• PROJECT can produce many tuples with the same value
  • Relational algebra semantics says remove duplicates
  • SQL does not; this is one difference between formal and actual query languages

Relational Algebra: Select
• σ_{<predicate>}(R)
• <predicate> is a conditional expression of the type that we are familiar with from conventional programming languages
  • <attribute> <op> <attribute>
  • <attribute> <op> <constant>
  • each attribute is in R
  • op ∈ {=, ≠, <, >, ≤, …, AND, OR}
• Ex: σ_{length≥100}(Movie), a "vertical restriction" (the result has fewer rows)

Pictorially
Movie
title           year   length   filmType
Star Wars       1977   124      color
Mighty Ducks    1991   104      color
Wayne's World   1992   95       color

Result set of σ_{length≥100}(Movie): the Star Wars and Mighty Ducks tuples.
[Figure: the selection keeps all n columns A1 … An and reduces the i input tuples to j result tuples, with i ≥ j.]
• The fraction of tuples selected (j / i) is referred to as the selectivity of the condition.

Cartesian Product
• R × S
• The set of all pairs that can be formed by choosing the first element of the pair to be any element of R and the second to be any element of S.
• The resulting schema may be ambiguous
  • Use R.A or S.A to disambiguate an attribute that occurs in both schemas

R
A   B
1   2
3   4

S
B   C    D
2   5    6
4   7    8
9   10   11

R × S
A   R.B   S.B   C    D
1   2     2     5    6
1   2     4     7    8
1   2     9     10   11
3   4     2     5    6
3   4     4     7    8
3   4     9     10   11

Join Operations
• R ⋈ S (R join S)
• Match only those tuples from R and S that agree in whatever attributes are common to the schemas of R and S
  • If tuples r from R and s from S are successfully paired, the result is called a joined tuple
• This join operation is the same one we used in an earlier section to recombine relations that had been projected onto two subsets of their attributes (e.g., as a result of a BCNF decomposition)

Example
R
A   B
1   2
3   4

S
B   C    D
2   5    6
4   7    8
9   10   11

R ⋈ S
A   B   C   D
1   2   5   6
3   4   7   8

• The resulting schema has the attributes from R only, the attributes in both R and S (i.e., the joining attribute(s)), and the attributes from S only.
• Tuples that fail to pair with any tuple of the other relation are called dangling tuples. Here (9, 10, 11) in S is a dangling tuple.

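The two operations can be reproduced on the example relations with a few lines of code. This is a sketch under simple assumptions: relations are lists of dictionaries, and the function names `cartesian_product` and `natural_join` are chosen for this example.

```python
R = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
S = [{"B": 2, "C": 5, "D": 6}, {"B": 4, "C": 7, "D": 8}, {"B": 9, "C": 10, "D": 11}]

def cartesian_product(r, s, r_name="R", s_name="S"):
    """Pair every tuple of r with every tuple of s, qualifying shared names."""
    shared = set(r[0]) & set(s[0]) if r and s else set()
    result = []
    for x in r:
        for y in s:
            row = {(f"{r_name}.{k}" if k in shared else k): v for k, v in x.items()}
            row.update({(f"{s_name}.{k}" if k in shared else k): v for k, v in y.items()})
            result.append(row)
    return result

def natural_join(r, s):
    """Keep only the pairs that agree on all common attributes."""
    shared = set(r[0]) & set(s[0]) if r and s else set()
    return [{**x, **y} for x in r for y in s
            if all(x[k] == y[k] for k in shared)]

product = cartesian_product(R, S)   # 2 * 3 = 6 tuples
joined = natural_join(R, S)         # the S tuple with B = 9 is dangling
```
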
Query Processing (Overview)
We will focus on SPJ, or Select-Project-Join, queries:

Select <attribute list>
From <relation list>
Where <condition list>

Example filter query over R(A,B,C):

Select B
From R
Where R.A = 'c' ∧ R.C > 10

SQL Primer (contd.)
We will focus on SPJ queries:

Select <attribute list>
From <relation list>
Where <condition list>

Example join query over R(A,B,C) and S(C,D,E):

Select B, D
From R, S
Where R.A = 'c' ∧ S.E = 2 ∧ R.C = S.C

Relational Algebra - can be used to describe plans
π_{B,D}[σ_{R.A='c' ∧ S.E=2 ∧ R.C=S.C}(R × S)]

π_{B,D}
   |
σ_{R.A='c' ∧ S.E=2 ∧ R.C=S.C}
   |
   ×
  / \
 R   S

General syntax of a query parser
• Translating SQL into relational algebra

Possible SQL query:

SELECT A1, …, An
FROM R1, …, Rk
WHERE P

Possible relational algebra query:

π_{A1,…,An}(σ_P(R1 × … × Rk))

Tree Representation of Relational Algebra
π_{A1,…,An}(σ_P(R1 × … × Rk))

    π_{A1,…,An}
         |
        σ_P
         |
         ×
        / \
       …   Rk
      /
     ×
    / \
   ×   R3
  / \
R1   R2

Query parser example
SELECT balance
FROM account
WHERE balance < 2500

Possible relational algebra query:

π_{balance}(σ_{balance<2500}(account))

Tree representation of relational algebra:

π_{balance}
     |
σ_{balance<2500}
     |
  account

Making an Evaluation Plan
• Query evaluation plan (or simply plan): a tree of relational algebra operators (essentially σ-π-join as the basic block, with the remaining operators carried out on the result), with a choice of algorithm for each operator.
• An evaluation plan defines exactly what algorithm is used for each operation, and how the execution of the operations is coordinated.
• A query plan presents a specific order of operations for executing a query.

Query Evaluation Plan
• To fully specify how to evaluate a query, each operation in the query tree is annotated with instructions that specify the algorithm or the index to be used to evaluate that operation.
• Query optimization: among all equivalent evaluation plans, choose the one with the lowest cost.
• Cost is estimated using statistical information from the database catalog, e.g. the number of tuples in each relation, the size of tuples, etc.

Query Evaluation
• How to evaluate an individual relational operation?
  • Selection: find a subset of rows in a table
  • Join: connect tuples from two tables
  • Other operations: union, projection, …
• How to estimate the cost of an individual operation?
• How does the available buffer affect the cost?
• How to evaluate a relational algebra expression?

Methods of query optimization
• There are two methods of query optimization.
• Cost-based optimization (physical)
  • This is based on the cost of the query.
  • The query can use different paths based on indexes, constraints, sorting methods, etc.
  • This method mainly uses statistics such as:
    • record size,
    • number of records,
    • number of records per block,
    • number of blocks,
    • table size,
    • whether the whole table fits in a block,
    • organization of tables,
    • uniqueness of column values,
    • size of columns, etc.

Methods of query optimization
• Heuristic optimization (logical)
  • This method is also known as rule-based optimization.
  • It is based on the equivalence rules for relational expressions; hence the number of query combinations to consider is reduced, and so is the cost of the query.
  • This method creates a relational tree for the given query based on the equivalence rules.
  • By providing an alternative way of writing and evaluating the query, these equivalence rules give a better path for evaluating the query.
  • A rule need not improve the query in all cases, so the result needs to be examined after applying the rules.

Example
SELECT Lname
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE Pname = 'Aquarius' AND Pnumber = Pno AND Essn = Ssn AND Bdate > '1957-12-31';

• Steps in converting a query tree during heuristic optimization:
A. Initial (canonical) query tree for SQL query Q.
B. First, move SELECT operations down the query tree.
C. Second, perform the more restrictive SELECT operations first.
D. Third, replace CARTESIAN PRODUCT and SELECT combinations with JOIN operations.
E. Finally, move PROJECT operations down the query tree.
• This is called heuristic optimization.

Query Tree
[Figure: query trees for the example query Q at steps A to E.]

Cont…
• As the preceding example demonstrates, a query tree can be transformed step by step into an equivalent query tree that is more efficient to execute.
• To do this, the query optimizer must know which transformation rules preserve this equivalence.
• We discuss some of these transformation rules next.

General Transformation Rules for Relational Algebra Operations
• There are many rules for transforming relational algebra operations into equivalent ones.
• We will state some transformation rules that are useful in query optimization, without proving them:

Cont…
1. Conjunctive selection operations can be deconstructed into a sequence of individual selections.
   σ_{θ1 ∧ θ2}(E) = σ_{θ1}(σ_{θ2}(E))
2. Selection operations are commutative.
   σ_{θ1}(σ_{θ2}(E)) = σ_{θ2}(σ_{θ1}(E))
3. Only the last in a sequence of projection operations is needed; the others can be omitted.
   π_{L1}(π_{L2}(…(π_{Ln}(E))…)) = π_{L1}(E)
4. Selections can be combined with Cartesian products and theta joins.
   a. σ_θ(E1 × E2) = E1 ⋈_θ E2
   b. σ_{θ1}(E1 ⋈_{θ2} E2) = E1 ⋈_{θ1 ∧ θ2} E2

Cont…
5. Theta-join operations (and natural joins) are commutative.
   E1 ⋈_θ E2 = E2 ⋈_θ E1
6. (a) Natural join operations are associative:
   (E1 ⋈ E2) ⋈ E3 = E1 ⋈ (E2 ⋈ E3)
   (b) Theta joins are associative in the following manner:
   (E1 ⋈_{θ1} E2) ⋈_{θ2 ∧ θ3} E3 = E1 ⋈_{θ1 ∧ θ3} (E2 ⋈_{θ2} E3)
   where θ2 involves attributes from only E2 and E3.

Cont…
7. The selection operation distributes over the theta join operation under the following two conditions:
   (a) When all the attributes in θ0 involve only the attributes of one of the expressions (E1) being joined:
       σ_{θ0}(E1 ⋈_θ E2) = (σ_{θ0}(E1)) ⋈_θ E2
   (b) When θ1 involves only the attributes of E1 and θ2 involves only the attributes of E2:
       σ_{θ1 ∧ θ2}(E1 ⋈_θ E2) = (σ_{θ1}(E1)) ⋈_θ (σ_{θ2}(E2))

Cont…
8. The projection operation distributes over the theta join operation as follows:
   (a) If θ involves only attributes from L1 ∪ L2:
       π_{L1 ∪ L2}(E1 ⋈_θ E2) = (π_{L1}(E1)) ⋈_θ (π_{L2}(E2))
   (b) Consider a join E1 ⋈_θ E2.
       • Let L1 and L2 be sets of attributes from E1 and E2, respectively.
       • Let L3 be attributes of E1 that are involved in join condition θ, but are not in L1 ∪ L2, and
       • let L4 be attributes of E2 that are involved in join condition θ, but are not in L1 ∪ L2.
       π_{L1 ∪ L2}(E1 ⋈_θ E2) = π_{L1 ∪ L2}((π_{L1 ∪ L3}(E1)) ⋈_θ (π_{L2 ∪ L4}(E2)))

Cont…
9. The set operations union and intersection are commutative.
   E1 ∪ E2 = E2 ∪ E1
   E1 ∩ E2 = E2 ∩ E1
   • (Set difference is not commutative.)
10. Set union and intersection are associative.
   (E1 ∪ E2) ∪ E3 = E1 ∪ (E2 ∪ E3)
   (E1 ∩ E2) ∩ E3 = E1 ∩ (E2 ∩ E3)
11. The selection operation distributes over ∪, ∩ and −.
   σ_θ(E1 − E2) = σ_θ(E1) − σ_θ(E2)
   and similarly for ∪ and ∩ in place of −.
   Also: σ_θ(E1 − E2) = σ_θ(E1) − E2
   and similarly for ∩ in place of −, but not for ∪.
12. The projection operation distributes over union.
   π_L(E1 ∪ E2) = (π_L(E1)) ∪ (π_L(E2))

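A few of these rules can be spot-checked on concrete data. The sketch below models relations as Python sets of tuples; the sample relations `E1` and `E2`, the predicate `theta`, and the helper `sigma` are made up for illustration. Passing on one example does not prove a rule, it only shows what the rule claims.

```python
# Relations as sets of (a, b) tuples; the data is made up for illustration.
E1 = {(1, "x"), (2, "y"), (3, "z")}
E2 = {(2, "y"), (3, "w"), (4, "v")}

def sigma(pred, rel):
    """Selection over a set of tuples."""
    return {t for t in rel if pred(t)}

def theta(t):
    """Example predicate: first attribute is at least 2."""
    return t[0] >= 2

# Rule 11: selection distributes over set difference, union and intersection.
rule_11_diff = sigma(theta, E1 - E2) == sigma(theta, E1) - sigma(theta, E2)
rule_11_union = sigma(theta, E1 | E2) == sigma(theta, E1) | sigma(theta, E2)
rule_11_inter = sigma(theta, E1 & E2) == sigma(theta, E1) & sigma(theta, E2)
# Shortcut form: the selection only needs to be applied to E1.
rule_11_short = sigma(theta, E1 - E2) == sigma(theta, E1) - E2

# Rule 9: union is commutative, set difference is not.
rule_9_union = (E1 | E2) == (E2 | E1)
difference_commutes = (E1 - E2) == (E2 - E1)
```
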
Algebraic Laws
• Commutative and associative laws
  R ∪ S = S ∪ R        R ∪ (S ∪ T) = (R ∪ S) ∪ T
  R ∩ S = S ∩ R        R ∩ (S ∩ T) = (R ∩ S) ∩ T
• Laws involving selection:
  • σ_{C AND C'}(R) = σ_C(σ_{C'}(R)) = σ_C(R) ∩ σ_{C'}(R)
  • σ_{C OR C'}(R) = σ_C(R) ∪ σ_{C'}(R)
  • σ_C(R ⋈ S) = σ_C(R) ⋈ S, when C involves only attributes of R
  • σ_C(R − S) = σ_C(R) − S
  • σ_C(R ∪ S) = σ_C(R) ∪ σ_C(S)
  • σ_C(R ∩ S) = σ_C(R) ∩ S

Initial Logical Plan
Select B,D
From R,S
Where R.A = 'c' ∧ R.C = S.C

Relational algebra: π_{B,D}[σ_{R.A='c' ∧ R.C=S.C}(R × S)]

π_{B,D}
   |
σ_{R.A='c' ∧ R.C=S.C}
   |
   ×
  / \
 R   S

Apply Rewrite Rule (1)
Before: π_{B,D}[σ_{R.A='c' ∧ R.C=S.C}(R × S)]
After:  π_{B,D}[σ_{R.C=S.C}[σ_{R.A='c'}(R × S)]]

π_{B,D}
   |
σ_{R.C=S.C}
   |
σ_{R.A='c'}
   |
   ×
  / \
 R   S

Apply Rewrite Rule (2)
Before: π_{B,D}[σ_{R.C=S.C}[σ_{R.A='c'}(R × S)]]
After:  π_{B,D}[σ_{R.C=S.C}([σ_{R.A='c'}(R)] × S)]

      π_{B,D}
         |
    σ_{R.C=S.C}
         |
         ×
        / \
σ_{R.A='c'}  S
     |
     R

Apply Rewrite Rule (3)
Before: π_{B,D}[σ_{R.C=S.C}([σ_{R.A='c'}(R)] × S)]
After:  π_{B,D}[[σ_{R.A='c'}(R)] ⋈ S]   (natural join)

      π_{B,D}
         |
         ⋈
        / \
σ_{R.A='c'}  S
     |
     R

• How do we execute this query?

Select B,D
From R,S
Where R.A = 'c' ∧ S.E = 2 ∧ R.C = S.C

One idea:
- Do Cartesian product
- Select tuples
- Do projection

R
A   B   C
a   1   10
b   1   20
c   2   10
d   2   35
e   3   45

S
C    D   E
10   x   2
20   y   2
30   z   2
40   x   1
50   y   3

Select B,D
From R,S
Where R.A = 'c' ∧ S.E = 2 ∧ R.C = S.C

Answer
B   D
2   x

An Example (cont.)
• Plan 1
  • Cross product of R & S
  • Select tuples using WHERE conditions
  • Project on B & D
• Algebra expression: π_{B,D}(σ_{R.A='c' ∧ S.E=2 ∧ R.C=S.C}(R × S))

π_{B,D}
   |
σ_{R.A='c' ∧ S.E=2 ∧ R.C=S.C}
   |
   ×
  / \
 R   S

Select B,D
From R,S
Where R.A = 'c' ∧ S.E = 2 ∧ R.C = S.C

R × S
R.A   R.B   R.C   S.C   S.D   S.E
a     1     10    10    x     2
a     1     10    20    y     2
…
c     2     10    10    x     2     ← Found! Got one...
…

An Example (cont.)
• Plan 2
  • Select R tuples with R.A='c'
  • Select S tuples with S.E=2
  • Natural join
  • Project B & D
• Algebra expression: π_{B,D}(σ_{R.A='c'}(R) ⋈ σ_{S.E=2}(S))

        π_{B,D}
           |
           ⋈
          / \
σ_{R.A='c'}   σ_{S.E=2}
     |            |
     R            S

Relational Algebra Primer
Select: σ_{R.A='c' ∧ R.C=10}
Project: π_{B,D}
Cartesian product: R × S
Natural join: R ⋈ S

Another idea:
Plan II: π_{B,D}(σ_{R.A='c'}(R(A,B,C)) ⋈ σ_{S.E=2}(S(C,D,E)))

Select B,D
From R,S
Where R.A = 'c' ∧ S.E = 2 ∧ R.C = S.C

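To see why the second plan is attractive, both plans can be run on the example relations R(A,B,C) and S(C,D,E). The sketch below is an illustration only: relations are lists of tuples, the join is a plain nested loop, and the "pairs examined" counts are an assumption of this sketch rather than a cost model from the slides.

```python
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]  # (A, B, C)
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]  # (C, D, E)

def plan_1(r, s):
    """Cross product, then select, then project."""
    product = [(ra, rb, rc, sc, sd, se) for (ra, rb, rc) in r for (sc, sd, se) in s]
    selected = [t for t in product if t[0] == "c" and t[5] == 2 and t[2] == t[3]]
    answer = [(t[1], t[4]) for t in selected]
    return answer, len(product)

def plan_2(r, s):
    """Select on each relation first, then join on C, then project."""
    r_sel = [t for t in r if t[0] == "c"]
    s_sel = [t for t in s if t[2] == 2]
    joined = [(ra, rb, rc, sd, se) for (ra, rb, rc) in r_sel
              for (sc, sd, se) in s_sel if rc == sc]
    answer = [(t[1], t[3]) for t in joined]
    return answer, len(r_sel) * len(s_sel)

answer_1, pairs_1 = plan_1(R, S)   # 25 pairs examined
answer_2, pairs_2 = plan_2(R, S)   # 3 pairs examined
```

Both plans return the same answer, but selecting first leaves far fewer tuple pairs for the join to consider.
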
Measures of Query Cost
• Cost is generally measured as the total elapsed time for answering a query
  • Many factors contribute to time cost: disk accesses, CPU, or even network communication
• Typically disk access is the predominant cost, and it is also relatively easy to estimate. It is measured by taking into account:
  • number of seeks × average seek cost
  • number of blocks read × average block-read cost
  • number of blocks written × average block-write cost
    • The cost to write a block is greater than the cost to read a block

Measures of Query Cost (Cont.)
• For simplicity we just use the number of block transfers from disk and the number of seeks as the cost measures
  • t_T: time to transfer one block
  • t_S: time for one seek
  • Cost for b block transfers plus S seeks: b × t_T + S × t_S
• We do not include the cost of writing output to disk in the cost formulae
• We ignore CPU costs for simplicity, as they tend to be much lower
  • Real systems do take CPU cost into account, although it is usually less significant

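The cost measure b × t_T + S × t_S is simple enough to write down directly. In the sketch below the function name `io_cost` and the device timings are assumptions chosen for illustration; times are in microseconds so that the arithmetic stays in integers.

```python
def io_cost(block_transfers, seeks, t_transfer, t_seek):
    """Estimated elapsed time: b * t_T + S * t_S."""
    return block_transfers * t_transfer + seeks * t_seek

# Hypothetical device timings in microseconds (assumed values, not from the slides).
T_TRANSFER_US = 100
T_SEEK_US = 4000

# Reading 100 consecutive blocks after a single seek.
sequential_us = io_cost(block_transfers=100, seeks=1,
                        t_transfer=T_TRANSFER_US, t_seek=T_SEEK_US)
# Reading the same 100 blocks scattered over the disk: one seek per block.
random_us = io_cost(block_transfers=100, seeks=100,
                    t_transfer=T_TRANSFER_US, t_seek=T_SEEK_US)
```

With these assumed timings the seeks dominate, which is why the number of seeks is tracked separately from the number of block transfers.
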
Algorithms for Selection Operation
• File scan: search algorithms that locate and retrieve records that fulfill a selection condition.
• Algorithm A1 (linear search). Scan each file block and test all records to see whether they satisfy the selection condition.
  • Cost estimate = b_r block transfers + 1 seek
    • b_r denotes the number of blocks containing records from relation r
  • If the selection is on a key attribute, the scan can stop as soon as the record is found: average cost = (b_r / 2) block transfers + 1 seek
  • Linear search can always be applied, regardless of the selection condition or the ordering of records in the file

Algorithms for Selection (Cont.)
• A2 (binary search). Applicable only if the selection is an equality comparison on the attribute on which the file is ordered.
  • Assumes that the blocks of a relation are stored contiguously
  • Cost estimate (number of disk blocks to be scanned):
    • cost of locating the first tuple by a binary search on the blocks: ⌈log2(b_r)⌉ × (t_T + t_S)

Selections Using Indices
• Index scan: search algorithms that use an index
  • The selection condition must be on the search key of the index.
• A3 (primary index, equality on attribute). Retrieve a single record that satisfies the corresponding equality condition
  • Cost = (h_i + 1) × (t_T + t_S), where h_i denotes the height of the index (tree)

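The three estimates A1, A2 and A3 can be compared side by side. This is a sketch with assumed inputs: the block count, index height, and timings are hypothetical, and the function names are chosen for this example.

```python
import math

def linear_search_cost(b_r, t_t, t_s, key_equality=False):
    """A1: one seek plus b_r transfers (about b_r / 2 if the scan can stop at a key match)."""
    transfers = b_r / 2 if key_equality else b_r
    return transfers * t_t + t_s

def binary_search_cost(b_r, t_t, t_s):
    """A2: ceil(log2(b_r)) block accesses, each a seek plus a transfer."""
    return math.ceil(math.log2(b_r)) * (t_t + t_s)

def primary_index_cost(h_i, t_t, t_s):
    """A3: traverse h_i index levels, then fetch one data block."""
    return (h_i + 1) * (t_t + t_s)

# Hypothetical figures: 1,024 blocks, index height 3, times in microseconds.
B_R, H_I, T_T, T_S = 1024, 3, 100, 4000
a1 = linear_search_cost(B_R, T_T, T_S)      # 1024 * 100 + 4000
a2 = binary_search_cost(B_R, T_T, T_S)      # 10 * (100 + 4000)
a3 = primary_index_cost(H_I, T_T, T_S)      # 4 * (100 + 4000)
```
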
Database Index
• Data is stored in the form of records. Every record has a key field, which allows it to be identified uniquely.
• Search key: an attribute or set of attributes used to look up records in a file.
• An index file consists of records (called index entries) of the form (search-key, pointer).
• Index files are typically much smaller than the original file.
• Two basic kinds of indices:
  • Ordered indices: search keys are stored in sorted order.
  • Hash indices: search keys are distributed uniformly across "buckets" using a "hash function".

Index Evaluation Metrics
• Access types supported efficiently:
  • Equality searches: records with a specified value in an attribute.
  • Range searches: records with an attribute value falling within a specified range.
• Access time: time to find and access a record in the file
• Insertion time: time to insert a new record
• Deletion time: time to delete a record
• Space overhead: how much extra space is needed for the index itself.

Ordered Indices
• In an ordered index, index entries are stored sorted on the search key value
  • E.g. the author catalog in a library
• Primary index: in a sequentially ordered file, the index whose search key specifies the sequential order of the actual file.
  • Also called a clustering index
• Secondary index: an index whose search key specifies an order different from the sequential order of the file. Also called a non-clustering index.

Dense Index Files
• Dense index: an index record appears for every search-key value in the file. Lookups are faster, but more space is required to store the index itself.
  • E.g. an index on ID

Dense Index Files (Cont.)
Dense index on dept_name, with the instructor file sorted on dept_name.

The index does not hold a pointer to every record, only one pointer for each search-key value.

Sparse Index Files
• Sparse index: contains index records for only some search-key values
• To locate a record with search-key value K:
  • Find the index record with the largest search-key value ≤ K
  • Search the file sequentially starting at the record to which the index record points
  • That is, the index takes you to the nearest preceding record, and you follow the file from there.

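The lookup procedure can be sketched with a binary search over the index followed by a sequential scan. The block contents, the one-entry-per-block layout, and the `lookup` function are assumptions made for this illustration.

```python
from bisect import bisect_right

# Data file sorted on the search key, split into blocks (made-up records).
blocks = [
    [(10, "r10"), (20, "r20"), (30, "r30")],
    [(40, "r40"), (50, "r50"), (60, "r60")],
    [(70, "r70"), (80, "r80"), (90, "r90")],
]
# Sparse index: one entry per block, holding the least search key in that block.
index_keys = [block[0][0] for block in blocks]   # [10, 40, 70]

def lookup(key):
    """Find the index entry with the largest key <= key, then scan sequentially."""
    pos = bisect_right(index_keys, key) - 1
    if pos < 0:
        return None                      # key is smaller than every indexed key
    for block in blocks[pos:]:
        for k, record in block:
            if k == key:
                return record
            if k > key:
                return None              # passed the place where key would be
    return None
```

For example, `lookup(50)` starts at the block whose index entry is 40 and finds the record two positions into that block.
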
Sparse Index Files (Cont.)
• Compared to dense indices:
  • Less space and less maintenance overhead for insertions and deletions.
  • Generally slower than a dense index for locating records.
• Good tradeoff: a sparse index with an index entry for every block in the file, corresponding to the least search-key value in the block.

Problems with simple indexes
• If the index does not fit in memory:
  • Searching the index is slow (binary search)
  • Ex: 100,000 entries
    • If we create a dense index, the index will be very large
    • If we create a sparse index, we may still have 50,000 index entries
• Solution: create multiple levels of sparse index

Multilevel Index
• If the primary index does not fit in memory, access becomes expensive.
• Solution: treat the primary index kept on disk as a sequential file and construct a sparse index on it.
  • outer index: a sparse index of the primary index
  • inner index: the primary index file
• If even the outer index is too large to fit in main memory, yet another level of index can be created, and so on.
• Indices at all levels must be updated on insertion or deletion from the file.

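How many levels are needed depends on how many index entries fit in one index block. The sketch below counts levels until the topmost index fits in a single block; the assumptions (one index entry per block of the level below, a fixed number of entries per index block) and the function name `index_levels` are ours, not from the slides.

```python
import math

def index_levels(num_data_blocks, entries_per_index_block):
    """Count sparse-index levels until the top level fits in one block."""
    levels = 0
    blocks = num_data_blocks
    while True:
        # Each level holds one entry per block of the level below it.
        blocks = math.ceil(blocks / entries_per_index_block)
        levels += 1
        if blocks <= 1:
            return levels

# Hypothetical file: 1,000,000 data blocks, 100 index entries per index block.
levels = index_levels(1_000_000, 100)
```
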
Multilevel Index (Cont.)
[Figure: multilevel index.]

B+-Tree Index Files
• All the data is stored in the leaf nodes.
• Every leaf is at the same level.
• All the leaves are linked to each other by pointers.
• There is a threshold M, the maximum number of elements in a node.

Thank you!
3 pages
Assignment Activity Unit 1: CS 1111-01 - AY2025-T2
No ratings yet
Assignment Activity Unit 1: CS 1111-01 - AY2025-T2
2 pages
JATIN's Resume
No ratings yet
JATIN's Resume
2 pages
Chapter One1
No ratings yet
Chapter One1
21 pages
Dr. Ambedkar Government Arts College (Autonomous) : B.Sc. Computer Science
No ratings yet
Dr. Ambedkar Government Arts College (Autonomous) : B.Sc. Computer Science
11 pages
QUERY Processing and Relational Algebra
No ratings yet
QUERY Processing and Relational Algebra
27 pages
Answer Asm 652
No ratings yet
Answer Asm 652
5 pages
