Data Management for Big Data

2019-2020 (spring semester)

Dario Della Monica

Chapter 16: Query Optimization

These slides are a modified version of the slides provided with the book
Database System Concepts, 6th Ed. (chapter numbering, however, refers to the 7th Ed.)
©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use

The original version of the slides is available at: https://www.db-book.com/

Chapter 16: Query Optimization
 Introduction
 Generating Equivalent Expressions
 Equivalence rules
 How to generate (all) equivalent expressions
 Estimating Statistics of Expression Results
 The Catalog
 Size estimation
 Selection
 Join
 Other operations (projection, aggregation, set operations, outer join)
 Estimation of number of distinct values
 Choice of Evaluation Plans
 Dynamic Programming for Choosing Evaluation Plans

Database System Concepts - 7th Edition 16.2 ©Silberschatz, Korth and Sudarshan
Introduction
 Query optimization: finding the “best” query execution plan (QEP) among the
many possible ones
 User is not expected to write queries efficiently (DBMS optimizer takes care of that)
 Alternative ways to execute a given query – 2 levels
 Equivalent relational algebra expressions
 Different implementation choices for each relational algebra operation
 Algorithms, indices, coordination between successive operations, …

Example query: the names of all instructors in the Music department, together with the titles of all courses they teach

INSTR(i_id, name, dept_name, ...)
COURSE(c_id, title, ...)
TEACHES(i_id, c_id, ...)

SELECT I.name, C.title
FROM INSTR I, COURSE C, TEACHES T
WHERE I.i_id = T.i_id
AND T.c_id = C.c_id
AND dept_name = 'Music'

Two equivalent relational algebra expressions for this query:
Π_{name,title}(σ_{dept_name='Music'}(INSTR ⋈ (TEACHES ⋈ COURSE)))
Π_{name,title}((σ_{dept_name='Music'}(INSTR)) ⋈ (TEACHES ⋈ COURSE))

Introduction (Cont.)
 A query evaluation plan (QEP) defines exactly what algorithm is used for each
operation, and how the execution of the operations is coordinated

 Find out how to view query execution plans on your favorite database
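
As one possible way to follow the suggestion above, the sketch below asks SQLite for the plan of the example query through Python's standard sqlite3 module. The column types and primary keys in the CREATE TABLE statements are assumptions made for illustration only; the slides give just the attribute names. Other systems expose plans through their own commands, so this is only one instance.

```python
import sqlite3

# In-memory database with the three example relations.
# Column types and keys are assumptions, not part of the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE INSTR (i_id INTEGER PRIMARY KEY, name TEXT, dept_name TEXT);
    CREATE TABLE COURSE (c_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE TEACHES (i_id INTEGER, c_id INTEGER);
""")

query = """
    SELECT I.name, C.title
    FROM INSTR I, COURSE C, TEACHES T
    WHERE I.i_id = T.i_id
      AND T.c_id = C.c_id
      AND I.dept_name = 'Music'
"""

# EXPLAIN QUERY PLAN returns one row per plan step;
# the last column holds a human-readable description of the step.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for row in plan_rows:
    print(row[-1])
```

The printed steps show which relation is scanned and which ones are reached through an index, that is, the access-method choices that a QEP fixes.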

Introduction (Cont.)
 Cost difference between query evaluation plans can be enormous
 E.g. seconds vs. days in some cases
 It is worth spending time on finding the “best” QEP
 Steps in cost-based query optimization
1. Generate logically equivalent expressions using equivalence rules
2. Annotate the resulting expressions in all possible ways to obtain alternative QEPs
3. Evaluate/estimate the cost (execution time) of each QEP
4. Choose the cheapest QEP based on estimated cost
 Estimation of QEP cost based on:
 Statistical information about relations (stored in the Catalog)
 number of tuples, number of distinct values for an attribute
 Statistics estimation for intermediate results
 to compute cost of complex expressions
 Cost formulae for algorithms, computed using statistics
Generating Equivalent Expressions
 Equivalence rules
 How to generate (all) equivalent expressions


Transformation of Relational Expressions

 Two relational algebra expressions are said to be equivalent if the two expressions generate the same set of tuples on every legal database instance
 Note: order of tuples is irrelevant (and also order of attributes)
 We don’t care if they generate different results on databases that
violate integrity constraints (e.g., uniqueness of keys)
 In SQL, inputs and outputs are multisets of tuples
 Two expressions in the multiset version of the relational algebra are
said to be equivalent if the two expressions generate the same multiset
of tuples on every legal database instance
 We focus on relational algebra and treat relations as sets
 An equivalence rule states that expressions of two forms are equivalent
 One can replace an expression of first form by one of the second form,
or vice versa

Equivalence Rules
1. Conjunctive selection operations can be deconstructed into a sequence of individual selections.
   σ_{θ1∧θ2}(E) = σ_{θ1}(σ_{θ2}(E))
2. Selection operations are commutative.
   σ_{θ1}(σ_{θ2}(E)) = σ_{θ2}(σ_{θ1}(E))
3. Only the last in a sequence of projection operations is needed; the others can be omitted.
   Π_{L1}(Π_{L2}(…(Π_{Ln}(E))…)) = Π_{L1}(E)
   where L1 ⊆ L2 ⊆ … ⊆ Ln
4. Selections can be combined with Cartesian products and theta joins.
   a. σ_θ(E1 × E2) = E1 ⋈_θ E2
   b. σ_{θ1}(E1 ⋈_{θ2} E2) = E1 ⋈_{θ1∧θ2} E2

Equivalence Rules (Cont.)
5. Theta-join (and thus natural join) operations are commutative.
   E1 ⋈_θ E2 = E2 ⋈_θ E1
   (but the order is important for efficiency)
6. (a) Natural join operations are associative:
   (E1 ⋈ E2) ⋈ E3 = E1 ⋈ (E2 ⋈ E3)
   (again, the order is important for efficiency)
   (b) Theta joins are associative in the following manner:
   (E1 ⋈_{θ1} E2) ⋈_{θ2∧θ3} E3 = E1 ⋈_{θ1∧θ3} (E2 ⋈_{θ2} E3)
   where θ1 involves attributes from only E1 and E2, and θ2 involves attributes from only E2 and E3

Equivalence Rules (Cont.)

7. (a) Selection distributes over theta join in the following manner:
   σ_{θ1}(E1 ⋈_θ E2) = (σ_{θ1}(E1)) ⋈_θ E2
   where θ1 involves attributes from only E1
   (b) Complex selection distributes over theta join in the following manner:
   σ_{θ1∧θ2}(E1 ⋈_θ E2) = (σ_{θ1}(E1)) ⋈_θ (σ_{θ2}(E2))
   where θ1 involves attributes from only E1 and θ2 involves attributes from only E2

More equivalences in Ch. 16.2 of the book ⋆

⋆ Silberschatz, Korth, and Sudarshan, Database System Concepts, 7th ed.
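
To make rule 7(a) concrete, the following minimal sketch evaluates both sides of the rule on a tiny instance. Relations are modelled as lists of dictionaries; the helper names (select, theta_join, as_set), the sample tuples, and the renamed attribute t_i_id are all hypothetical and chosen only for the illustration. Agreement on one instance does not prove the rule; it merely shows what the rule states.

```python
def select(pred, rel):
    """sigma_pred(rel): keep the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]


def theta_join(r, s, theta):
    """r join_theta s: pairs of tuples that satisfy theta.
    Attribute names of r and s are assumed to be disjoint."""
    return [{**tr, **ts} for tr in r for ts in s if theta(tr, ts)]


def as_set(rel):
    """Order-insensitive view of a relation, for comparing results as sets."""
    return {frozenset(t.items()) for t in rel}


# Hypothetical sample data
INSTR = [
    {"i_id": 1, "name": "Mozart", "dept_name": "Music"},
    {"i_id": 2, "name": "Einstein", "dept_name": "Physics"},
]
TEACHES = [{"t_i_id": 1, "c_id": 10}, {"t_i_id": 2, "c_id": 20}]


def theta(i, t):
    """Join condition: involves attributes of both inputs."""
    return i["i_id"] == t["t_i_id"]


def theta1(t):
    """Selection condition: involves attributes of INSTR only."""
    return t["dept_name"] == "Music"


lhs = select(theta1, theta_join(INSTR, TEACHES, theta))   # selection after the join
rhs = theta_join(select(theta1, INSTR), TEACHES, theta)   # selection pushed below the join
print(as_set(lhs) == as_set(rhs))
```

The right-hand side joins a smaller input, which is why optimizers usually prefer the pushed-down form.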

Pictorial Depiction of Equivalence Rules

Exercise
 Disprove the equivalence

(R ⟕ S) ⟕ T = R ⟕ (S ⟕ T)

Definition (left outer join): the result of a left outer join T = R ⟕ S is a superset of the result of the join T’ = R ⋈ S, in that all tuples in T’ appear in T. In addition, T preserves those tuples of R that are lost in the join, by creating tuples in T that are padded with null values

STUD
stud_id  name     surname
1        gino     bianchi
2        filippo  neri
3        mario    rossi

TAKES
stud_id  course  grade
1        Math    30
2        DB      22
2        Logic   30

STUD ⟕ TAKES
stud_id  name     surname  course  grade
1        gino     bianchi  Math    30
2        filippo  neri     DB      22
2        filippo  neri     Logic   30
3        mario    rossi    null    null

TAKES ⟕ STUD = ???

Solution
 Disprove the equivalence (R ⟕ S) ⟕ T = R ⟕ (S ⟕ T)

R            S            T
A  AR        A  AS        A  AT
1  1         2  1         1  1

R ⟕ S                    S ⟕ T
A  AR  AS                A  AS  AT
1  1   null              2  1   null

(R ⟕ S) ⟕ T              R ⟕ (S ⟕ T)
A  AR  AS    AT          A  AR  AS    AT
1  1   null  1           1  1   null  null
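
The counterexample can be replayed mechanically. The sketch below is a hedged illustration, not a reference implementation: it assumes a relation is represented as a pair (attribute names, list of row dictionaries), uses None for null, and implements a natural left outer join in a hypothetical helper named left_outer_join.

```python
def left_outer_join(r, s):
    """Natural left outer join.
    A relation is a pair (attribute tuple, list of dict rows); None stands for null."""
    r_attrs, r_rows = r
    s_attrs, s_rows = s
    common = [a for a in r_attrs if a in s_attrs]      # join attributes
    extra = [a for a in s_attrs if a not in r_attrs]   # attributes contributed by s
    out = []
    for tr in r_rows:
        # A null never matches anything, so it is excluded from the comparison.
        matches = [ts for ts in s_rows
                   if all(tr[a] is not None and tr[a] == ts[a] for a in common)]
        if matches:
            out.extend({**tr, **{a: ts[a] for a in extra}} for ts in matches)
        else:
            # Preserve the unmatched tuple of r, padded with nulls.
            out.append({**tr, **{a: None for a in extra}})
    return (tuple(r_attrs) + tuple(extra), out)


R = (("A", "AR"), [{"A": 1, "AR": 1}])
S = (("A", "AS"), [{"A": 2, "AS": 1}])
T = (("A", "AT"), [{"A": 1, "AT": 1}])

left = left_outer_join(left_outer_join(R, S), T)    # (R ⟕ S) ⟕ T
right = left_outer_join(R, left_outer_join(S, T))   # R ⟕ (S ⟕ T)
print(left[1])
print(right[1])
```

The two results differ in the AT column, which is exactly the difference shown in the last pair of tables.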

Equivalence derivability and minimality

 Some equivalences can be derived from others

 example: rule 2 can be obtained from rule 1 (exploiting commutativity of conjunction);
rule 7b can be obtained from rules 1 and 7a

 Optimizers use minimal sets of equivalence rules

Enumeration of Equivalent Expressions
 Query optimizers use equivalence rules to systematically generate
expressions equivalent to the given one
 Can generate all equivalent expressions as follows:
 Repeat (starting from the set containing only the given expression)
 apply all applicable equivalence rules on every sub-expression of
every equivalent expression found so far
 add newly generated expressions to the set of equivalent
expressions
Until no new equivalent expressions are generated
 The above approach is very expensive in space and time
 Space: efficient expression-representation techniques
 1 copy is stored for shared sub-expressions
 Time: partial generation
 Dynamic programming
 Greedy techniques (select best choices at each step)
 Heuristics, e.g., single-relation operations
(selections, projections) are pushed inside (performed earlier)
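
The repeat-until loop above can be sketched for join expressions only. In this illustration, which is an assumption-laden toy rather than how a real optimizer stores expressions, a join tree is a nested Python tuple (left, right) with relation names as leaves, and the only rules applied are join commutativity and associativity (rules 5 and 6a). The function names rewrites and all_equivalent are hypothetical.

```python
def rewrites(e):
    """Yield every expression obtained by applying one rule at one position of e."""
    if isinstance(e, str):          # a base relation: nothing to rewrite
        return
    left, right = e
    yield (right, left)                         # commutativity
    if isinstance(left, tuple):
        a, b = left
        yield (a, (b, right))                   # associativity, left to right
    if isinstance(right, tuple):
        b, c = right
        yield ((left, b), c)                    # associativity, right to left
    for new_left in rewrites(left):             # apply rules inside sub-expressions
        yield (new_left, right)
    for new_right in rewrites(right):
        yield (left, new_right)


def all_equivalent(expr):
    """Closure of {expr} under the rules: stop when no new expression appears."""
    seen = {expr}
    frontier = [expr]
    while frontier:
        e = frontier.pop()
        for e2 in rewrites(e):
            if e2 not in seen:
                seen.add(e2)
                frontier.append(e2)
    return seen


three = all_equivalent((("r1", "r2"), "r3"))
four = all_equivalent(((("r1", "r2"), "r3"), "r4"))
print(len(three), len(four))
```

Even in this toy setting the set grows quickly with the number of relations, which is the space and time problem that shared sub-expressions and partial generation are meant to address.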

Estimating Statistics of Expression Results
 The Catalog
 Size estimation
 Selection
 Join
 Other operations (projection, aggregation, set operations, outer join)
 Estimation of number of distinct values


Cost Estimation
 Cost of each operator computed as described in Chapter 15 ⋆
 Need statistics of input relations
 E.g. number of tuples, sizes of tuples
 Statistics are collected in the Catalog
 Inputs can be results of sub-expressions
 Need to estimate statistics of expression results
 Estimation of size of intermediate results
 # of tuples in input to successive operations
 Estimation of number of distinct values in intermediate results
 selectivity rate of successive selection operations
 Statistics are not totally accurate
 Information in the catalog might not always be up-to-date (delay)
 A precise estimate for intermediate results might be impossible to compute

⋆ Silberschatz, Korth, and Sudarshan, Database System Concepts, 7th ed.

Statistical Information for Cost Estimation
 Statistics information is maintained in the Catalog
 The catalog is itself stored in the database as relation(s)
 It contains:
 nr: number of tuples in a relation r
 br: number of blocks containing tuples of r
 lr: size of a tuple of r (in bytes)
 fr: blocking factor of r – i.e., the number of tuples of r that fit into one block
 V(A, r): number of distinct values that appear in r for set of attributes A
 V(A, r) = |Π_A(r)| – if A is a key, then V(A,r) = nr
 min(A,r): smallest value appearing in relation r for set of attributes A
 max(A,r): largest value appearing in relation r for set of attributes A
 statistics about indices (height of B+-trees, number of blocks for leaves, …)
 We assume tuples of r are stored together physically in a file; then: br = ⌈ nr / fr ⌉
 Information not always up-to-date
 Catalog is not updated at every DB change (updates are done during periods of light system load)

Histograms
 Histogram on attribute age of relation person

 For each range

 Number of records (tuples) with value in the range
 Also, number of distinct values in the range
 Without histogram information, uniform distribution is assumed
 Little space occupation
 Histograms for many attributes on many relations can be stored
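
A small sketch may help to see why a histogram improves on the uniform-distribution assumption. The Bucket class, the bucket boundaries, and all counts below are hypothetical values for an imagined person.age attribute; only the idea of storing, per range, the number of tuples and the number of distinct values comes from the slide.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    low: int        # inclusive lower bound of the range
    high: int       # inclusive upper bound of the range
    count: int      # number of tuples with value in [low, high]
    distinct: int   # number of distinct values in [low, high]


# Hypothetical histogram on attribute age of relation person
AGE_HISTOGRAM = [
    Bucket(0, 19, 400, 20),
    Bucket(20, 39, 1000, 20),
    Bucket(40, 59, 500, 20),
    Bucket(60, 79, 100, 20),
]


def estimate_equal(hist, v):
    """Estimated # of tuples with age = v, assuming uniformity inside the bucket only."""
    for b in hist:
        if b.low <= v <= b.high:
            return b.count / b.distinct
    return 0.0


def estimate_equal_uniform(hist):
    """Estimate without the histogram: nr / V(A, r) over the whole relation."""
    n_r = sum(b.count for b in hist)
    v_a = sum(b.distinct for b in hist)
    return n_r / v_a


print(estimate_equal(AGE_HISTOGRAM, 25), estimate_equal_uniform(AGE_HISTOGRAM))
```

With these made-up numbers the histogram-based estimate for a value in the densest range is twice the global uniform estimate.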

Selection Size Estimation
 # of records that will satisfy the selection predicate (aka selection condition)
 σ_{A=v}(r) (we are assuming that v actually is present in A)
 nr / V(A,r) (no histogram, uniform distribution)
 1 if A is key
 σ_{A≤v}(r) (the case σ_{A≥v}(r) is symmetric)
 0 if v < min(A,r)
 nr if v >= max(A,r)
 nr * (v − min(A,r)) / (max(A,r) − min(A,r)) otherwise (no histogram, uniform distribution)
 In absence of statistical information, or when v is unknown at the time of cost estimation (e.g., v is computed at run-time by the application using the DB), we assume
 nr / 2

 If histograms are available, we can do more precise estimates

 use values for restricted ranges instead of nr , V(A,r), min(A, r), max(A,r)
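
The estimates of this slide translate directly into code. The sketch below is only an illustration under the same uniform-distribution assumption; the function and parameter names (size_eq, size_le, n_r, v_a, a_min, a_max) are hypothetical stand-ins for nr, V(A,r), min(A,r) and max(A,r).

```python
def size_eq(n_r, v_a, is_key=False):
    """Estimated size of sigma_{A=v}(r): nr / V(A, r), or 1 if A is a key."""
    return 1 if is_key else n_r / v_a


def size_le(n_r, v, a_min, a_max):
    """Estimated size of sigma_{A<=v}(r) under a uniform distribution of A."""
    if v < a_min:
        return 0
    if v >= a_max:
        return n_r
    return n_r * (v - a_min) / (a_max - a_min)


def size_unknown(n_r):
    """Fallback when v is not known at optimization time: nr / 2."""
    return n_r / 2


# Hypothetical statistics: 10000 tuples, 50 distinct values of A
print(size_eq(10_000, 50))
# Hypothetical statistics: 1000 tuples, A ranging over [20, 60]
print(size_le(1_000, 30, 20, 60))
```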

Complex Selection Size Estimation
 Conjunction E = σ_{θ1 ∧ θ2 ∧ … ∧ θn}(r)
 we compute si = size of the selection σ_{θi}(r) (i = 1,…, n)
 selectivity rate (SR) of σ_{θi}(r): SR(σ_{θi}(r)) = si / nr (i = 1,…, n)
 SR(E) = Π_i SR(σ_{θi}(r)) = s1 / nr * … * sn / nr (Π_i is multiplication with i = 1,…,n)
 # of records for E = nr * SR(E) = nr * (s1 * s2 * … * sn) / (nr)^n
 Disjunction E = σ_{θ1 ∨ θ2 ∨ … ∨ θn}(r) = σ_{¬(¬θ1 ∧ ¬θ2 ∧ … ∧ ¬θn)}(r)
 SR(E) = 1 − SR(σ_{¬θ1 ∧ ¬θ2 ∧ … ∧ ¬θn}(r))
 SR(σ_{¬θ1 ∧ ¬θ2 ∧ … ∧ ¬θn}(r)) = (1 − s1 / nr) * … * (1 − sn / nr)
 # of records for E = nr * SR(E) = nr * [ 1 − (1 − s1 / nr) * (1 − s2 / nr) * … * (1 − sn / nr) ]
 Negation E = σ_{¬θ}(r)
 # of records for E = nr − # of records for σ_θ(r)
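
The three formulas can be checked numerically with a short sketch. It assumes, as the product of selectivity rates implicitly does, that the conditions are independent; the function names and the sample sizes are hypothetical.

```python
from math import prod


def conj_size(n_r, sizes):
    """Conjunction: nr * product of the selectivity rates si / nr."""
    return n_r * prod(s / n_r for s in sizes)


def disj_size(n_r, sizes):
    """Disjunction: nr * (1 - product of (1 - si / nr))."""
    return n_r * (1 - prod(1 - s / n_r for s in sizes))


def neg_size(n_r, s):
    """Negation: nr minus the size of the positive selection."""
    return n_r - s


# Hypothetical relation of 1000 tuples; two conditions selecting 500 and 250 tuples
n_r, sizes = 1_000, [500, 250]
print(conj_size(n_r, sizes), disj_size(n_r, sizes), neg_size(n_r, 250))
```

Note that the conjunction and disjunction estimates add up to s1 + s2 here, as inclusion-exclusion requires for two conditions.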

Join Size Estimation
 # of records that will be included in the result
 (cartesian product) r × s: # of records = nr * ns
 (natural join on attribute A) r ⋈ s:
 for each tuple tr of r there are on average ns / V(A,s) matching tuples of s
 thus, # of records = nr * ns / V(A,s)
 by switching the roles of r and s we get # of records = nr * ns / V(A,r)
 the lower of the two estimates is the more accurate one: # of records = nr * ns / max{ V(A,r), V(A,s) }
 histograms can be used for more accurate estimates
 histograms must be on the join attributes, for both relations, and with the same ranges
 use values for restricted ranges instead of nr , ns , V(A,r), V(A,s), and then sum the estimates for each range
 if A is key for r, then # of records <= ns (and vice versa)
 in addition, if A is not null in s, then # of records = ns (and vice versa)
 (theta join) r ⋈θ s
 r ⋈θ s = σ_θ(r × s): use formulas for cartesian product and selection
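
As a quick numeric illustration of the natural-join estimate, the sketch below computes both one-sided estimates and their minimum. The function name join_size and the statistics passed to it are hypothetical.

```python
def join_size(n_r, n_s, v_a_r, v_a_s):
    """Estimated size of r natural-join s on attribute A (no histogram)."""
    estimate_via_s = n_r * n_s / v_a_s   # each tuple of r matches ns / V(A, s) tuples of s
    estimate_via_r = n_r * n_s / v_a_r   # roles of r and s switched
    # Taking the lower estimate is the same as dividing by max{V(A, r), V(A, s)}.
    return min(estimate_via_s, estimate_via_r)


# Hypothetical catalog values: nr = 5000, ns = 10000, V(A, r) = 2500, V(A, s) = 5000
print(join_size(5_000, 10_000, 2_500, 5_000))
```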

Size Estimation for Other Operations
 projection Π_A(r) (no duplicates): # of records = V(A,r)
 aggregation _G γ_F(r): # of records = V(G,r)
 set operations
 between selections on the same relation: use formulas for selection
 e.g.: σ_{θ1}(r) ∪ σ_{θ2}(r) = σ_{θ1 ∨ θ2}(r)
 r ∪ s: # of records = nr + ns
 r ∩ s: # of records = min { nr , ns }
 r – s: # of records = nr
 outer join
 left outer join: # of records = # of records for inner join + nr
 right outer join: # of records = # of records for inner join + ns
 full outer join: # of records = # of records for inner join + nr + ns

Estimation for Number of Distinct Values
 # distinct values in the result for expression E and attribute (or set of attributes) A: V(A,E)
 Selection E = σ_θ(r)
 V(A,E) is a specific value for some conditions – e.g., A=3 or 3 < A <= 6
 condition A < v (or A > v, A >= v, … ): V(A,E) = V(A,r) * selectivity rate of the selection
 otherwise: V(A,E) = min { nE , V(A,r) }
 Join E = r ⋈ s
 A only contains attributes from r: V(A,E) = min { nE , V(A,r) }
 A only contains attributes from s: V(A,E) = min { nE , V(A,s) }
 A contains attributes A1 from r and attributes A2 from s:
V(A,E) = min { nE , V(A1, r) * V(A2 – A1, s) , V(A2, s) * V(A1 – A2, r) }

Choice of Evaluation Plans
 Dynamic Programming for Choosing Evaluation Plans


Choice of Evaluation Plans
 Must consider the interaction of evaluation techniques when choosing
evaluation plans
 choosing the cheapest algorithm for each operation independently
may not yield best overall algorithm. E.g.
 merge-join may be costlier than hash-join, but may provide a
sorted output which reduces the cost for an outer level
aggregation
 nested-loop join may provide opportunity for pipelining
 Practical query optimizers incorporate elements of the following two
broad approaches:
1. Search all the plans and choose the best plan in a cost-based
fashion
2. Use heuristics to choose a plan

Cost-Based Optimization
 A big part of a cost-based optimizer (based on equivalence rules) is
choosing the “best” order for join operations
 Consider finding the best join-order for r1 ⋈ r2 ⋈ . . . ⋈ rn.
 There are (2(n – 1))!/(n – 1)! different join orders for above expression.
With n = 7, the number is 665280, with n = 10, the number is greater
than 17.6 billion!
 No need to generate all the join orders. Exploiting some monotonicity
(optimal substructure property), the least-cost join order for any subset
of {r1, r2, . . ., rn} is computed only once.
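
The figures quoted above follow directly from the formula (2(n – 1))!/(n – 1)!. A minimal sketch, with the hypothetical function name num_join_orders:

```python
from math import factorial


def num_join_orders(n):
    """Number of different join orders for n relations: (2(n-1))! / (n-1)!"""
    return factorial(2 * (n - 1)) // factorial(n - 1)


for n in (5, 7, 10):
    print(n, num_join_orders(n))
```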

Cost-Based Optimization: An example
 Consider finding the best join-order for r1 ⋈ r2 ⋈ r3 ⋈ r4 ⋈ r5
 Number of possible different join orderings: (2(n − 1))! / (n − 1)! = 8! / 4! = 1680
 The least-cost join order for any subset of { r1, r2, r3, r4, r5 } is computed only once
 Assume we want to compute N123/45 : number of possible different join orderings where r1, r2, r3 are grouped together, e.g.,
(r1 ⋈ r2 ⋈ r3) ⋈ r4 ⋈ r5,   (r2 ⋈ r3 ⋈ r1) ⋈ (r5 ⋈ r4),   r4 ⋈ (r5 ⋈ (r1 ⋈ (r2 ⋈ r3))),   …
 The naïve approach
 N123/45 = N123 * N45
 N123 = 4! / 2! = 12 (N123 : # ways of arranging r1, r2, and r3)
 N45 = N123 = 12 (N45 : # ways of arranging r4 and r5 w.r.t. the block of r1, r2, and r3)
 N123/45 = 12 * 12 = 144
 Exploiting optimal substructure property:
 compute only once the best ordering for r1 ⋈ r2 ⋈ r3 : 12 possibilities (N123)
 compute the best ordering for R123 ⋈ r4 ⋈ r5 : 12 possibilities (N45)
 Therefore, N123/45 = 12 + 12 = 24
Dynamic Programming in Optimization

 To find best join tree (equivalently, best join order) for a set S of n relations:
 Consider all possible plans of the form:
S’ ⋈ (S \ S’ )
for every non-empty proper subset S’ of S
 Recursively compute (and store) costs of best join orders for subsets S’ and S \ S’. Choose the cheapest of the 2^n – 2 alternatives
 Base case for recursion: find best algorithm for scanning relation
 When a plan for a subset is computed, store it and reuse it when it is required again, instead of re-computing it
 Dynamic programming

Join Order Optimization Algorithm
procedure findbestplan(S)
    if (bestplan[S].cost ≠ ∞)
        return bestplan[S]
    // else bestplan[S] has not been computed earlier, compute it now
    if (S contains only 1 relation)
        set bestplan[S].plan and bestplan[S].cost based on the best way
        of accessing S /* Using selections on S and indices on S */
    else for each non-empty subset S1 of S such that S1 ≠ S
        P1 = findbestplan(S1)
        P2 = findbestplan(S - S1)
        A = best algorithm for joining results of P1 and P2
        cost = P1.cost + P2.cost + cost of A
        if cost < bestplan[S].cost
            bestplan[S].cost = cost
            bestplan[S].plan = “execute P1.plan; execute P2.plan;
                                join results of P1 and P2 using A”
    return bestplan[S]

* This is the algorithm shown in the 6th edition of the textbook.

It is slightly different from the algorithm we presented during our class, especially the way
the base case is handled.
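
For readers who want to run the procedure, here is a minimal Python sketch of the same memoized recursion. Everything about the cost model is an assumption made for the illustration: the cardinalities in SIZES, the join selectivities in SEL, the helper est_size, the choice of size(S1) * size(S2) as the "cost of A", and a scan cost of zero in the base case. The memo table bestplan is played by functools.lru_cache.

```python
from functools import lru_cache
from itertools import combinations
from math import prod

# Hypothetical statistics, for illustration only
SIZES = {"r1": 1000, "r2": 10, "r3": 100}           # cardinalities
SEL = {frozenset({"r1", "r2"}): 0.125,              # join selectivities;
       frozenset({"r2", "r3"}): 0.25}               # missing pairs mean cross product


def est_size(rels):
    """Estimated result size of joining all relations in rels."""
    size = prod(SIZES[r] for r in rels)
    for pair in combinations(sorted(rels), 2):
        size *= SEL.get(frozenset(pair), 1.0)
    return size


@lru_cache(maxsize=None)                            # plays the role of bestplan[S]
def find_best_plan(rels):
    """rels: frozenset of relation names. Returns (cost, plan)."""
    if len(rels) == 1:
        (name,) = rels
        return 0.0, name                            # assumed: scan cost ignored
    best = (float("inf"), None)
    members = sorted(rels)
    for k in range(1, len(members)):                # every non-empty proper subset S1
        for left in combinations(members, k):
            s1 = frozenset(left)
            s2 = rels - s1
            c1, p1 = find_best_plan(s1)
            c2, p2 = find_best_plan(s2)
            # assumed cost of the join algorithm: product of the input sizes
            cost = c1 + c2 + est_size(s1) * est_size(s2)
            if cost < best[0]:
                best = (cost, (p1, p2))
    return best


best_cost, best_plan = find_best_plan(frozenset(SIZES))
print(best_cost, best_plan)
```

Under these made-up numbers the cheapest plan joins r1 with r2 first and adds r3 afterwards; changing SIZES or SEL changes the outcome, which is the point of a cost-based search.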

Cost of Optimization
 With dynamic programming time complexity of optimization is O(3^n).
 With n = 10, this number is about 59000 instead of 17.6 billion!
 Space complexity is O(2^n)
 Better time performance when considering only left-deep join trees: O(n 2^n)
Space complexity remains at O(2^n) (heuristic approach)
 Cost-based optimization is expensive, but worthwhile for queries on large datasets (typical queries have small n, generally < 10)
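
One way to see where 3^n comes from: the dynamic programming procedure looks at pairs (S, S1) with S1 ⊆ S, and each relation is either outside S, inside S1, or inside S − S1. The sketch below counts these pairs by the size of S; the function name dp_subset_pairs is hypothetical, and the count includes the trivial pairs where S1 is empty or equal to S, so it is an upper bound on the work rather than an exact operation count.

```python
from math import comb


def dp_subset_pairs(n):
    """Number of pairs (S, S1) with S1 a subset of S and S a subset of n relations."""
    # comb(n, k) ways to choose S of size k, then 2**k subsets S1 of S
    return sum(comb(n, k) * 2**k for k in range(n + 1))


print(dp_subset_pairs(10), 3**10)
```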

Cost Based Optimization with Equivalence Rules
 Physical equivalence rules equate logical operations (e.g., join) to physical ones (i.e., implementations – e.g., nested-loop join, merge join)
 Relational algebra expressions are converted into QEPs with implementation details
 An efficient optimizer based on equivalence rules depends on
 A space-efficient representation of expressions which avoids making multiple copies of sub-expressions
 Efficient techniques for detecting duplicate derivations of expressions
 Dynamic programming or memoization techniques, which store the “best” plan for a sub-expression the first time it is computed, and reuse it on repeated optimization calls on the same sub-expression
 Cost-based pruning techniques that avoid generating all plans (greedy, heuristics)

Heuristic Optimization
 Cost-based optimization is expensive, even with dynamic programming
 Systems may use heuristics to reduce the number of choices that must be considered
 Heuristic optimization transforms the query-tree by using a set of rules that typically (but not in all cases) improve execution performance:
 Perform selection early (reduces the number of tuples)
 Perform projection early (reduces the number of attributes)
 Perform most restrictive selection and join operations (i.e. with smallest result size) before other similar operations
 Only consider left-deep join orders (particularly suited for pipelining as only one input has to be pipelined, the other is a relation)

Structure of Query Optimizers
 Some systems use only heuristics, others combine heuristics with partial cost-based optimization.
 Many optimizers consider only left-deep join orders.
 Plus heuristics to push selections and projections down the query tree
 Reduces optimization complexity and generates plans amenable to pipelined evaluation.
 Heuristic optimization used in some versions of Oracle:
 Repeatedly pick “best” relation to join next
 it obtains and compares n plans (each starting with one relation); in each plan, it picks the best next relation for the join
