Chapter - 2 Query Processing

This document covers query processing and optimization in database systems, detailing the steps involved in transforming high-level queries into efficient execution strategies. It discusses various optimization techniques, including heuristic and cost-based methods, as well as the importance of query decomposition, semantic analysis, and restructuring. Additionally, it emphasizes the significance of minimizing evaluation time and provides examples of query optimization strategies.

Uploaded by

tedatujube


Advanced Database Systems (CoSc2042)

Chapter Two

QUERY PROCESSING & OPTIMIZATION

Query Processing and Optimization: Outline
• Query processing
  ◦ Operator Evaluation Strategies
    ▪ Selection
    ▪ Join
• Query Optimization
  ◦ Heuristic query optimization
  ◦ Cost-based query optimization
• Query Tuning
Overview of Query Processing
• Query processing: the activities involved in parsing, validating, optimizing, and executing a query.
• Aims
  ◦ To transform a query written in a high-level language, typically SQL, into a correct and efficient execution strategy expressed in a low-level language (implementing the relational algebra), and
  ◦ To execute the strategy to retrieve the required data.
Steps of Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation

• The DBMS has algorithms to implement relational algebra expressions.
• SQL is a high-level, declarative language: it specifies what is wanted, not how it is obtained.
Query optimization:
The activity of choosing an efficient execution strategy for processing a query.
• Task: Find an efficient physical query plan (also known as an execution plan) for an SQL query.
• Goal: Minimize the evaluation time for the query, i.e., compute the query result as fast as possible.
• Cost factors: Disk accesses, i.e., read/write operations (I/O, page transfers); CPU time is typically ignored.
• Optimization: Find the most efficient evaluation plan for a query, because there is usually more than one way to evaluate it.
Examples:
Example - 2
• Find all Managers who work at a London branch.
SELECT * FROM Staff s, Branch b WHERE s.branchNo = b.branchNo
AND (s.position = 'Manager' AND b.city = 'London');
The equivalent relational algebra queries corresponding to this SQL statement are:
[Figure: The Different Strategies, relational algebra expressions (1) to (3)]
Cost Comparison
• Costs (in disk accesses) are:
(1) (1000 + 50) + 2*(1000 * 50) = 101 050
(2) 2*1000 + (1000 + 50) = 3 050
(3) 1000 + 2*50 + 5 + (50 + 5) = 1 160
• The third option significantly reduces the size of the relations being joined together.
• Cartesian product and join operations are much more expensive than selection.
We will see shortly that one of the fundamental strategies in query processing is to perform the unary operations, Selection and Projection, as early as possible, thereby reducing the operands of any subsequent binary operations.
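The three cost figures above can be reproduced with a short sketch. The statistics below (1000 Staff tuples, 50 Branch tuples, 50 Managers, 5 London branches, one disk access per tuple) are assumptions inferred from the arithmetic; the slides do not state them in the surviving text. The reading of each strategy (product then select, join then select, select then join) is likewise an assumption, and the function names are illustrative only.

```python
# Assumed statistics, inferred from the arithmetic in the cost comparison.
N_STAFF = 1000      # tuples in Staff
N_BRANCH = 50       # tuples in Branch
N_MANAGERS = 50     # Staff tuples with position = 'Manager'
N_LONDON = 5        # Branch tuples with city = 'London'


def cost_product_then_select() -> int:
    """(1) Cartesian product of Staff and Branch, then a single selection."""
    read_inputs = N_STAFF + N_BRANCH
    product_size = N_STAFF * N_BRANCH      # written once, then read back once
    return read_inputs + 2 * product_size


def cost_join_then_select() -> int:
    """(2) Join on branchNo, then a selection on the join result."""
    join_size = N_STAFF                    # one Branch per Staff tuple; written, then read back
    read_inputs = N_STAFF + N_BRANCH
    return 2 * join_size + read_inputs


def cost_select_then_join() -> int:
    """(3) Select Managers and London branches first, then join the small results."""
    select_staff = N_STAFF + N_MANAGERS    # read Staff, write the Managers
    select_branch = N_BRANCH + N_LONDON    # read Branch, write the London branches
    join_inputs = N_MANAGERS + N_LONDON    # read both reduced relations
    return select_staff + select_branch + join_inputs


print(cost_product_then_select())  # 101050
print(cost_join_then_select())     # 3050
print(cost_select_then_join())     # 1160
```

Changing N_MANAGERS or N_LONDON shows how strongly the third strategy depends on the selections being restrictive.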
Phases of query processing

• Query Decomposition
  ◦ Transform the high-level query into a relational algebra (RA) query.
  ◦ Check that the query is syntactically and semantically correct.
• Typical stages are:
  ◦ Analysis,
  ◦ Normalization,
  ◦ Semantic analysis,
  ◦ Simplification,
  ◦ Query restructuring.

Analysis
• Analyze the query lexically and syntactically using compiler techniques.
• Verify that the relations and attributes exist.
• Verify that the operations are appropriate for the object type.
Analysis (cont.)
• Finally, the query is transformed into a query tree constructed as follows:
  ◦ A leaf node is created for each base relation.
  ◦ A non-leaf node is created for each intermediate relation produced by an RA operation.
  ◦ The root of the tree represents the query result.
  ◦ The sequence of operations is directed from the leaves to the root.
Normalization
• Converts the query into a normalized form for easier manipulation.
• The predicate can be converted into one of two forms:
  ◦ Conjunctive normal form:
    (position = 'Manager' ∨ salary > 20000) ∧ (branchNo = 'B003')
  ◦ Disjunctive normal form:
    (position = 'Manager' ∧ branchNo = 'B003') ∨ (salary > 20000 ∧ branchNo = 'B003')
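That the two normal forms above describe the same predicate can be checked by brute force. The sketch below treats the three atomic conditions as Boolean inputs and compares both forms over all eight truth assignments; the function names are illustrative, not part of the slides.

```python
from itertools import product


def cnf(is_manager: bool, high_salary: bool, in_b003: bool) -> bool:
    """(position = 'Manager' OR salary > 20000) AND (branchNo = 'B003')."""
    return (is_manager or high_salary) and in_b003


def dnf(is_manager: bool, high_salary: bool, in_b003: bool) -> bool:
    """(position = 'Manager' AND branchNo = 'B003') OR (salary > 20000 AND branchNo = 'B003')."""
    return (is_manager and in_b003) or (high_salary and in_b003)


def forms_agree() -> bool:
    """Compare both forms on all 2**3 truth assignments of the atomic conditions."""
    return all(cnf(*row) == dnf(*row) for row in product([False, True], repeat=3))


print(forms_agree())  # True
```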
Semantic Analysis
• Rejects normalized queries that are incorrectly formulated or contradictory.
• A query is incorrectly formulated if some of its components do not contribute to the generation of the result.
• A query is contradictory if its predicate cannot be satisfied by any tuple.
• Algorithms to determine correctness exist only for queries that do not contain disjunction and negation.
Semantic Analysis (cont.)
• Graphs used to detect incorrectly formulated queries:
  ➠ connection graph (query graph)
  ➠ join graph
Relation connection graph
• The relation connection graph is not fully connected, so the query is not correctly formulated.
• The join condition (v.propertyNo = p.propertyNo) has been omitted.
Example 2
SELECT Ename, Resp FROM Emp, Works, Project
WHERE Emp.Eno = Works.Eno AND Works.Pno = Project.Pno
AND Pname = 'CAD/CAM' AND Dur > 36
AND Title = 'Programmer'

If the query graph is connected, the query is semantically correct.
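The connectivity test for Example 2 can be sketched as a graph traversal. This is a simplified version that uses only relation nodes and join edges (a full query graph may also carry a result node and selection predicates); the function name is_connected is an illustrative choice.

```python
from collections import defaultdict


def is_connected(relations: list[str], joins: list[tuple[str, str]]) -> bool:
    """Return True if every relation is reachable from the first one via join edges."""
    neighbours = defaultdict(set)
    for left, right in joins:
        neighbours[left].add(right)
        neighbours[right].add(left)

    seen = {relations[0]}
    stack = [relations[0]]
    while stack:                              # depth-first traversal of the graph
        current = stack.pop()
        for nxt in neighbours[current] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen == set(relations)


relations = ["Emp", "Works", "Project"]
# Join edges from Emp.Eno = Works.Eno and Works.Pno = Project.Pno
joins = [("Emp", "Works"), ("Works", "Project")]

print(is_connected(relations, joins))       # True
print(is_connected(relations, joins[:1]))   # False: Project is not joined to anything
```

Dropping the second join condition leaves Project isolated, which is the same kind of error as the omitted join condition in the relation connection graph slide.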
Simplification
• Detects redundant qualifications,
• Eliminates common sub-expressions,
• Transforms the query into a semantically equivalent but more easily and efficiently computed form.
• Applies well-known transformation rules of Boolean algebra.
Example
• SELECT TITLE FROM Emp E WHERE (NOT (TITLE = 'Programmer') AND TITLE = 'Programmer') OR (TITLE = 'Electrical Eng.' AND NOT (TITLE = 'Electrical Eng.')) OR ENAME = 'J.Doe';
is equivalent to
• SELECT TITLE FROM Emp E WHERE ENAME = 'J.Doe';
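The simplification works because a condition of the form (p AND NOT p) is always false, so only the ENAME test remains. A minimal sketch that evaluates both predicates on a few rows is shown below; the sample rows are hypothetical, and SQL NULL handling is ignored.

```python
def original_predicate(title: str, ename: str) -> bool:
    """The WHERE clause as written in the first query."""
    return (
        (not (title == "Programmer") and title == "Programmer")
        or (title == "Electrical Eng." and not (title == "Electrical Eng."))
        or ename == "J.Doe"
    )


def simplified_predicate(title: str, ename: str) -> bool:
    """The WHERE clause after simplification."""
    return ename == "J.Doe"


# Hypothetical (TITLE, ENAME) rows, not data from the slides.
rows = [
    ("Programmer", "J.Doe"),
    ("Programmer", "A.Lee"),
    ("Electrical Eng.", "J.Doe"),
    ("Electrical Eng.", "M.Smith"),
    ("Analyst", "B.Ray"),
]

same_result = all(
    original_predicate(title, ename) == simplified_predicate(title, ename)
    for title, ename in rows
)
print(same_result)  # True
```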
Restructuring
• Convert SQL to relational algebra
• Make use of query trees
• Example: SELECT Ename FROM Emp, Works, Project WHERE Emp.Eno = Works.Eno AND Works.Pno = Project.Pno AND Ename <> 'J. Doe' AND Pname = 'CAD/CAM' AND (Dur = 12 OR Dur = 24)
• Query tree:
  ◦ A tree data structure that corresponds to a relational algebra expression.
  ◦ It represents the input relations of the query as leaf nodes of the tree, and the relational algebra operations as internal nodes.
• Query graph:
  ◦ A graph data structure that corresponds to a relational calculus expression.
  ◦ It does not indicate the order in which operations are performed.
  ◦ There is only a single graph corresponding to each query.
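A query tree can be represented with a very small data structure. The sketch below builds one possible (unoptimized) tree for the Restructuring example; the Node class and the bracketed operator labels are illustrative assumptions, not a notation used by the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a query tree: a base relation (leaf) or an RA operation."""
    label: str
    children: list["Node"] = field(default_factory=list)

    def leaves(self) -> list[str]:
        """Base relations of the tree, read left to right."""
        if not self.children:
            return [self.label]
        return [name for child in self.children for name in child.leaves()]

    def expression(self) -> str:
        """Render the tree as a nested relational algebra expression."""
        if not self.children:
            return self.label
        inner = ", ".join(child.expression() for child in self.children)
        return f"{self.label}({inner})"


# Root = query result, internal nodes = operations, leaves = base relations.
tree = Node("PROJECT[Ename]", [
    Node("SELECT[Ename <> 'J. Doe' AND Pname = 'CAD/CAM' AND (Dur = 12 OR Dur = 24)]", [
        Node("JOIN[Works.Pno = Project.Pno]", [
            Node("JOIN[Emp.Eno = Works.Eno]", [Node("Emp"), Node("Works")]),
            Node("Project"),
        ]),
    ]),
])

print(tree.leaves())  # ['Emp', 'Works', 'Project']
print(tree.expression())
```

Many different trees of this kind correspond to the same query, which is what the transformation rules exploit.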
Transformation Rules for RA Operations
1. Conjunctive Selection operations can cascade into individual Selection operations (and vice versa).
σp∧q∧r(R) = σp(σq(σr(R)))
• Sometimes referred to as cascade of Selection.
2. Commutativity of Selection.
σp(σq(R)) = σq(σp(R))
3. In a sequence of Projection operations, only the last in the sequence is required.
∏Col_list1 (∏Col_list2 (… (∏Col_listN (T)) …)) = ∏Col_list1 (T)
Example:
∏STD_ID, STD_NAME (∏STD_ID, STD_NAME, AGE, ADDRESS (∏STD_ID, STD_NAME, AGE, ADDRESS, CLASS_ID, SKILLS (STUDENT))) = ∏STD_ID, STD_NAME (STUDENT)
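Transformation Rules for RA Operations (cont.)
4. Commutativity of Selection and Projection.
If predicate p involves only attributes in the projection list, the Selection and Projection operations commute:
∏A1, …, Am (σp(R)) = σp(∏A1, …, Am (R)), where p involves only A1, …, Am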
Transformation Rules for RA Operations (cont.)
5. Commutativity of Theta join (and Cartesian product).
R ⋈p S = S ⋈p R
R × S = S × R
The rule also applies to Equijoin and Natural join.
6. Commutativity of Selection and Theta join (or Cartesian product).
• If the selection predicate involves only attributes of one of the join relations, the Selection and Join (or Cartesian product) operations commute:
σp(R ⋈r S) = (σp(R)) ⋈r S
σp(R × S) = (σp(R)) × S, where p involves only attributes of R
• If the selection predicate is a conjunctive predicate of the form (p ∧ q), where p involves only attributes of R and q involves only attributes of S, the Selection and Theta join operations commute as:
σp∧q(R ⋈r S) = (σp(R)) ⋈r (σq(S))
7. Commutativity of Projection and Theta join (or Cartesian product).
8. Commutativity of Union and Intersection (but not Set difference).
R ∪ S = S ∪ R
R ∩ S = S ∩ R
9. Commutativity of Selection and set operations (Union, Intersection, and Set difference).
σp(R ∪ S) = σp(S) ∪ σp(R)
σp(R ∩ S) = σp(S) ∩ σp(R)
σp(R - S) = σp(R) - σp(S)
10. Commutativity of Projection and Union.
∏L(R ∪ S) = ∏L(S) ∪ ∏L(R)
11. Associativity of Union and Intersection (but not Set difference).
(R ∪ S) ∪ T = S ∪ (R ∪ T), (R ∩ S) ∩ T = S ∩ (R ∩ T)
12. Associativity of Theta join (and Cartesian product).
• Cartesian product and Natural join are always associative.
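Rule 9 can be checked on small relations modelled as Python sets. The relations R and S and the predicate below are hypothetical; the sketch also shows that the operand order matters for Set difference, since difference itself is not commutative.

```python
# Hypothetical relations as sets of (staffNo, salary) tuples; not data from the slides.
R = {("S1", 12000), ("S2", 25000), ("S3", 30000)}
S = {("S2", 25000), ("S4", 9000), ("S5", 40000)}


def select(relation: set, predicate) -> set:
    """Selection: keep only the tuples that satisfy the predicate."""
    return {row for row in relation if predicate(row)}


def high_salary(row) -> bool:
    """Predicate p: salary > 20000."""
    return row[1] > 20000


union_ok = select(R | S, high_salary) == select(S, high_salary) | select(R, high_salary)
intersection_ok = select(R & S, high_salary) == select(S, high_salary) & select(R, high_salary)
difference_ok = select(R - S, high_salary) == select(R, high_salary) - select(S, high_salary)

# Swapping the operands of the difference yields a different relation.
swapped_differs = select(R - S, high_salary) != select(S, high_salary) - select(R, high_salary)

print(union_ok, intersection_ok, difference_ok, swapped_differs)  # True True True True
```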
2. Query Optimization
Optimization: not necessarily "optimal", but reasonably efficient.
Techniques:
• Heuristic rules
  ◦ Query tree (relational algebra) optimization
  ◦ Query graph optimization
• Cost-based (physical) optimization
  ◦ Cost estimation (comparing the costs of different plans)
a. Heuristic-based Processing Strategies
► Perform Selection operations as early as possible.
► Keep predicates on the same relation together.
► Combine a Cartesian product with a subsequent Selection whose predicate represents a join condition into a Join operation.
► Use associativity of binary operations to rearrange leaf nodes so that the leaf nodes with the most restrictive Selection operations are executed first.
► Perform Projection as early as possible.
► Keep projection attributes on the same relation together.
► Compute common expressions once.
► If a common expression appears more than once, and the result is not too large, store the result and reuse it.
Examples
• What are the names of customers living on Elm Street who have checked out "Terminator"?
• SQL query:
SELECT Name FROM Customer CU, CheckedOut CH, Film F
WHERE Title = 'Terminator' AND F.FilmID = CH.FilmID AND
CU.CustomerID = CH.CustomerID AND CU.Street = 'Elm'
[Figures: Apply Selections Early; Apply More Restrictive Selections Early; Form Joins; Apply Projections Early]
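The effect of applying selections early can be illustrated on the "Terminator" query with a few in-memory rows. Everything below is a sketch under stated assumptions: the table contents are hypothetical, joins are simulated with a Cartesian product plus a filter, and the size of the intermediate result stands in for the cost.

```python
from itertools import product

# Hypothetical rows; the slides give no data for these tables.
customers = [(1, "Ann", "Elm"), (2, "Bob", "Oak"), (3, "Cy", "Elm")]   # (CustomerID, Name, Street)
checked_out = [(1, 10), (2, 10), (3, 11)]                                # (CustomerID, FilmID)
films = [(10, "Terminator"), (11, "Alien")]                              # (FilmID, Title)


def naive_plan():
    """Cartesian product of all three tables, then one large selection."""
    intermediate = list(product(customers, checked_out, films))
    names = [cu[1] for cu, ch, f in intermediate
             if f[1] == "Terminator" and f[0] == ch[1]
             and cu[0] == ch[0] and cu[2] == "Elm"]
    return names, len(intermediate)


def selections_first_plan():
    """Apply the single-table selections first, then combine the reduced inputs."""
    elm_customers = [cu for cu in customers if cu[2] == "Elm"]
    terminator = [f for f in films if f[1] == "Terminator"]
    intermediate = list(product(elm_customers, checked_out, terminator))
    names = [cu[1] for cu, ch, f in intermediate
             if f[0] == ch[1] and cu[0] == ch[0]]
    return names, len(intermediate)


print(naive_plan())             # (['Ann'], 18)
print(selections_first_plan())  # (['Ann'], 6)
```

Both plans return the same answer, but the second one builds a third of the intermediate tuples even on this tiny input.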
Cost-Based Optimization
• Statistics on the inputs to each operator are needed.
  ◦ Statistics on leaf relations are stored in the system catalog.
  ◦ Statistics on intermediate relations must be estimated; the most important are the relations' cardinalities.
• Cost can be CPU time, I/O time, communication time, main memory usage, or a combination.
• The candidate query tree with the least total cost is selected for execution.
Measures of Query Cost
• There are many possible ways to estimate cost, e.g., based on disk accesses, CPU time, or communication overhead.
• Disk access is the cost of block transfers from/to disks.
  ◦ Simplifying assumption: each block transfer has the same cost.
• The cost of an algorithm (e.g., for join or selection) depends on the database buffer size.
  ◦ More memory for the DB buffer reduces disk accesses.

Selectivity and Cost Estimates in Query Optimization
• Catalog Information Used in Cost Functions
  ◦ Information about the size of a file
    ▪ number of records (tuples) (r),
    ▪ record size (R),
    ▪ number of blocks (b),
    ▪ blocking factor (bfr)
  ◦ Information about indexes and indexing attributes of a file
    ▪ Number of levels (x) of each multilevel index
    ▪ Number of first-level index blocks (bI1)
    ▪ Number of distinct values (d) of an attribute
    ▪ Selectivity (sl) of an attribute
    ▪ Selection cardinality (s) of an attribute (s = sl * r)
Database Statistics
• For each base relation R
  ◦ nTuples(R): the number of tuples (records) in relation R (that is, its cardinality).
  ◦ bFactor(R): the blocking factor of R (that is, the number of tuples of R that fit into one block).
  ◦ nBlocks(R): the number of blocks required to store R. If the tuples of R are stored physically together, then:
    nBlocks(R) = [nTuples(R)/bFactor(R)]
    We use [x] to indicate that the result of the calculation is rounded up to the smallest integer that is greater than or equal to x.
• For each attribute A of base relation R
  ◦ nDistinctA(R): the number of distinct values that appear for attribute A in relation R.
  ◦ minA(R), maxA(R): the minimum and maximum possible values for attribute A in relation R.
  ◦ SCA(R): the selection cardinality of attribute A in relation R. This is the average number of tuples that satisfy an equality condition on attribute A.
Database Statistics (cont.)
• For each multilevel index I on attribute set A
  ◦ nLevelsA(I): the number of levels in I.
  ◦ nLfBlocksA(I): the number of leaf blocks in I.
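Database Statistics (cont.)
• The cost of the Selection operation (S = σp(R)) is estimated according to the search strategy used:
[Table: Selection strategies and their estimated costs]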
Example
• For the purposes of this example, we make the following assumptions about the Staff relation:
  ◦ There is a hash index with no overflow on the primary key attribute staffNo.
  ◦ There is a clustering index on the foreign key attribute branchNo.
  ◦ There is a B+-tree index on the salary attribute.
• The Staff relation has the following statistics stored in the system catalog:
[Table: catalog statistics for the Staff relation]
Q1
• The estimated cost of a linear search on the key attribute staffNo is 50 blocks, and the cost of a linear search on a non-key attribute is 100 blocks.
• Now we consider the following Selection operations, and use the above strategies to improve on these two costs:
  ◦ S1: σstaffNo='SG5'(Staff)
  ◦ S2: σposition='Manager'(Staff)
  ◦ S3: σbranchNo='B003'(Staff)
  ◦ S4: σsalary>20000(Staff)
• Solution:
S1: This Selection operation contains an equality condition on the primary key. Therefore, as the attribute staffNo is hashed, we can use Strategy 3 defined above to estimate the cost as 1 block. The estimated cardinality of the result relation is SCstaffNo(Staff) = 1.
S2: The attribute in the predicate is a non-key, non-indexed attribute, so we cannot improve on the linear search method, giving an estimated cost of 100 blocks. The estimated cardinality of the result relation is SCposition(Staff) = 300.
S3: The attribute in the predicate is a foreign key with a clustering index, so we can use Strategy 6 to estimate the cost as 2 + [6/30] = 3 blocks. The estimated cardinality of the result relation is SCbranchNo(Staff) = 6.
S4: The predicate here involves a range search on the salary attribute, which has a B+-tree index, so we can use Strategy 7 to estimate the cost as 2 + [50/2] + [3000/2] = 1527 blocks. However, this is significantly worse than the linear search strategy, so in this case we would use the linear search method. The estimated cardinality of the result relation is
SCsalary(Staff) = [3000*(50000-20000)/(50000-10000)] = 2250
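The arithmetic in the S3 and S4 solutions can be replayed in a few lines. The constants below are read off the worked numbers (3000 tuples, blocking factor 30, two index levels, 50 leaf blocks, salaries between 10000 and 50000); the original statistics table is not available here, so treat them as assumptions. The expressions simply mirror the formulas used in the solution text.

```python
import math

# Values inferred from the worked solution, not from a surviving statistics table.
N_TUPLES = 3000
B_FACTOR = 30
N_LEVELS = 2                      # assumed equal for the branchNo and salary indexes
N_LEAF_BLOCKS_SALARY = 50
MIN_SALARY, MAX_SALARY = 10000, 50000
SC_BRANCH_NO = 6                  # selection cardinality of branchNo

# nBlocks(Staff) = [nTuples / bFactor]; also the cost of a linear search on a non-key attribute.
n_blocks = math.ceil(N_TUPLES / B_FACTOR)

# S3: equality on a clustering index.
cost_s3 = N_LEVELS + math.ceil(SC_BRANCH_NO / B_FACTOR)

# S4: range search through the B+-tree index on salary.
cost_s4 = N_LEVELS + math.ceil(N_LEAF_BLOCKS_SALARY / 2) + math.ceil(N_TUPLES / 2)
card_s4 = math.ceil(N_TUPLES * (MAX_SALARY - 20000) / (MAX_SALARY - MIN_SALARY))

# The optimizer would keep whichever access path is cheaper.
best_s4 = min(cost_s4, n_blocks)

print(n_blocks, cost_s3, cost_s4, card_s4, best_s4)  # 100 3 1527 2250 100
```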
Selection Operation
σA=a(R), where a is a constant value and A is an attribute of R
File Scan: search algorithms that locate and retrieve records that satisfy a selection condition
S1 - Linear search
cost(S1) = bR
S2 - Binary search, i.e., the file is ordered on attribute A (primary index)
Cost of Operations
• Cost = I/O cost + CPU cost
  ◦ I/O cost: # pages (reads & writes) or # operations (multiple pages)
  ◦ CPU cost: # comparisons or # tuples processed
  ◦ I/O cost dominates (for large databases)
• Cost depends on
  ◦ Types of query conditions
  ◦ Availability of fast access paths
• DBMSs keep statistics for cost estimation

Notations
• Used to describe the cost of operations.
• Relations: R, S
• nR: # tuples in R, nS: # tuples in S
• bR: # pages in R
• dist(R.A): # distinct values in R.A
• min(R.A): smallest value in R.A
• max(R.A): largest value in R.A
• HI: # index pages accessed (B+ tree height?)
Simple Selection
• Simple selection: σA op a(R)
  ◦ A is a single attribute, a is a constant, and op is one of =, ≠, <, ≤, >, ≥.
  ◦ We do not discuss ≠ further because it requires a sequential scan of the table.
• How many tuples will be selected?
  ◦ Selectivity Factor (SFA op a(R)): the fraction of tuples of R satisfying "A op a"
  ◦ 0 ≤ SFA op a(R) ≤ 1
• # tuples selected: NS = nR × SFA op a(R)
Options of Simple Selection
• Sequential (linear) Scan
  ◦ General condition: cost = bR
  ◦ Equality on key: average cost = bR / 2
• Binary Search
  ◦ Records are stored in sorted order
  ◦ Equality on key: cost = log2(bR)
  ◦ Equality on non-key (duplicates allowed):
    cost = log2(bR) + NS/bfR - 1
         = sorted search time + pages holding the selected tuples - the first page (already read)
Example: Cost of Selection
• Relation: R(A, B, C)
• nR = 10000 tuples
• bfR = 20 tuples/page
• dist(A) = 50, dist(B) = 500
• B+ tree clustering index on A with order 25 (p = 25)
• B+ tree secondary index on B with order 25
• Query:
  ◦ select * from R where A = a1 and B = b1
• Relational Algebra: σA=a1 ∧ B=b1(R)
Example: Cost of Selection (cont.)
• Option 1: Sequential Scan
  ◦ Have to go through the entire relation
  ◦ Cost = bR = 10000/20 = 500
• Option 2: Binary Search using A = a1
  ◦ It is sorted on A (why?)
  ◦ NS = 10000/50 = 200, assuming a uniform distribution
  ◦ Cost = log2(bR) + NS/bfR - 1 = log2(500) + 200/20 - 1 ≈ 18
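The two options can be compared numerically using the statistics from the example. The sketch below rounds the binary-search term up to whole page reads, which is an assumption about how the slide arrives at 18; variable names are illustrative.

```python
import math

n_r = 10_000     # tuples in R
bf_r = 20        # tuples per page
dist_a = 50      # distinct values of A

b_r = math.ceil(n_r / bf_r)     # pages in R
ns = n_r // dist_a              # tuples with A = a1, assuming a uniform distribution

# Option 1: sequential scan reads every page.
cost_sequential = b_r

# Option 2: binary search to the first match, then read the pages holding the matches.
cost_binary = math.ceil(math.log2(b_r)) + math.ceil(ns / bf_r) - 1

print(b_r, ns, cost_sequential, cost_binary)  # 500 200 500 18
```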
Cost of Join
• Cost = # I/O reading R & S + # I/O writing result
• Additional notation:
  ◦ M: # buffer pages available to join operation
  ◦ LB: # leaf blocks in B+ tree index
• Limitation of cost estimation
  ◦ Ignoring CPU costs
  ◦ Ignoring timing
  ◦ Ignoring double buffering requirements
Estimate Size of Join Result
• How many tuples in join result?
  ◦ Cross product (special case of join)
    NJ = nR × nS
  ◦ R.A is a foreign key referencing S.B
    NJ = nR (assume no null value)
  ◦ S.B is a foreign key referencing R.A
    NJ = nS (assume no null value)
  ◦ Both R.A & S.B are non-key
    NJ = min(nR × nS / dist(R.A), nR × nS / dist(S.B))
Estimate Size of Join Result (cont.)
• How wide is a tuple in join result?
  ◦ Natural join: W = W(R) + W(S) - W(S ∩ R)
  ◦ Theta join: W = W(R) + W(S)
• What is blocking factor of join result?
  ◦ bfJoin = block size / W
• How many blocks does join result have?
  ◦ bJoin = NJ / bfJoin
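The join-size formulas can be combined into a small estimator. The input numbers in the usage lines are hypothetical, and rounding the blocking factor down and the block count up is an assumption the slides do not spell out.

```python
import math


def join_cardinality(n_r: int, n_s: int, dist_ra: int, dist_sb: int) -> int:
    """NJ when both join attributes are non-key: min(nR*nS/dist(R.A), nR*nS/dist(S.B))."""
    return min(n_r * n_s // dist_ra, n_r * n_s // dist_sb)


def join_blocks(nj: int, block_size: int, width_r: int, width_s: int,
                shared_width: int = 0) -> int:
    """bJoin = NJ / bfJoin, with W = W(R) + W(S) - W(shared attributes)."""
    tuple_width = width_r + width_s - shared_width
    bf_join = block_size // tuple_width       # whole tuples per block
    return math.ceil(nj / bf_join)


# Hypothetical statistics, chosen only to exercise the formulas.
nj = join_cardinality(n_r=10_000, n_s=2_000, dist_ra=50, dist_sb=100)
blocks = join_blocks(nj, block_size=4096, width_r=100, width_s=60, shared_width=8)

print(nj, blocks)  # 200000 7693
```

Setting shared_width to 0 gives the Theta join case, where no attribute is dropped from the result.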
Query Execution Plans
• An execution plan for a relational algebra query consists of a combination of the relational algebra query tree and information about the access methods to be used for each relation, as well as the methods to be used in computing the relational operators stored in the tree.
• Materialized evaluation: the result of an operation is stored as a temporary relation.
• Pipelined evaluation: as the result of an operator is produced, it is forwarded to the next operator in sequence.
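The difference between materialized and pipelined evaluation maps naturally onto Python lists versus generators. The sketch below is an analogy, not how a DBMS is implemented; the staff rows and helper names are hypothetical. The scanned list records which rows the scan has touched so far.

```python
staff = [
    {"name": "Ann", "salary": 30000},
    {"name": "Bob", "salary": 15000},
    {"name": "Cy", "salary": 42000},
]
scanned = []   # names of rows the scan operator has read so far


def scan(rows):
    """Leaf operator: hand out rows one at a time and record each read."""
    for row in rows:
        scanned.append(row["name"])
        yield row


def select_materialized(rows, predicate):
    """Materialized: build the whole temporary relation before returning it."""
    return [row for row in rows if predicate(row)]


def select_pipelined(rows, predicate):
    """Pipelined: forward each qualifying row as soon as it is produced."""
    for row in rows:
        if predicate(row):
            yield row


def project_pipelined(rows, column):
    """Pipelined projection on a single column."""
    for row in rows:
        yield row[column]


def high_salary(row):
    return row["salary"] > 20000


materialized_names = [row["name"] for row in select_materialized(staff, high_salary)]

pipeline = project_pipelined(select_pipelined(scan(staff), high_salary), "name")
first = next(pipeline)   # only the first row has been scanned at this point

print(materialized_names)  # ['Ann', 'Cy']
print(first, scanned)      # Ann ['Ann']
```

The pipelined plan delivers its first result after reading one row, whereas the materialized selection has to finish before anything is passed on.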
Query Tuning
• Monitoring or revising the query to increase throughput and to lower response time for time-critical applications.
• Having to tune queries is a fact of life.
• Query tuning has a localized effect and is thus relatively attractive.
• It is a time-consuming and specialized task.
• It makes the queries harder to understand.
• However, it is often a necessity.
• This is not likely to change any time soon.
Assignment one
• Using a heuristic algorithm, optimize the following SQL query.
SELECT LNAME FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE PNAME = 'AQUARIUS' AND
PNUMBER = PNO AND ESSN = SSN
AND BDATE > '1957-12-31';
