Dbms Seminar

The document outlines the steps involved in query processing and optimization in distributed databases, including scanning, parsing, validating, and evaluating queries. It discusses various algorithms for selecting and joining operations, emphasizing the importance of query optimization techniques such as heuristic rules and cost estimation for efficient execution. The document also details the cost components associated with query execution, including access, storage, computation, memory usage, and communication costs.

Uploaded by

Gayathri Ramasamy


UNIT - 5

QUERY PROCESSING & DISTRIBUTED DATABASES

VALARMATHI M
II – CSE D

OVERVIEW
Query Processing Steps:
1. Scanning, parsing, validating and translation
2. Optimization
3. Evaluation
OPTIMIZATION
Query Optimization: Among all equivalent evaluation plans, choose the one with the lowest cost.
• Cost is estimated using statistical information from the database catalog, e.g. the number of tuples in each relation, the size of tuples, etc.

• E.g., σ salary<75000 (Π salary (instructor)) is equivalent to Π salary (σ salary<75000 (instructor))

• Example statistics: consider the total number of rows as 1000 and the number of rows satisfying the given condition as 400.
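
The equivalence and the row counts above can be checked with a minimal sketch. The `instructor` rows, the `id` column and the helper names `select` and `project` are made up for illustration; only the 1000/400 split and the `salary < 75000` condition come from the slide.

```python
# Hypothetical instructor relation: 1000 rows, 400 of them with salary < 75000.
instructor = [
    {"id": i, "salary": 60000 + i if i < 400 else 80000 + i}
    for i in range(1000)
]

def select(rows, predicate):
    """Relational selection: keep the rows that satisfy the predicate."""
    return [r for r in rows if predicate(r)]

def project(rows, columns):
    """Relational projection (duplicates are kept here for simplicity)."""
    return [{c: r[c] for c in columns} for r in rows]

def below_75k(row):
    return row["salary"] < 75000

# Plan A: project first, then select. The selection reads all 1000 rows.
plan_a = select(project(instructor, ["salary"]), below_75k)

# Plan B: select first, then project. The projection reads only 400 rows.
plan_b = project(select(instructor, below_75k), ["salary"])

# Fraction of rows that satisfy the condition (the selectivity).
selectivity = len(plan_b) / len(instructor)
```

Both plans return the same rows; they differ only in how many rows each operator has to handle, which is what the optimizer's cost estimate tries to capture.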
ALGORITHMS FOR SELECT OPERATION
For simple queries:

• Linear Search: Scan each file block and test all records to see whether they satisfy the selection condition. Linear search can be applied regardless of the selection condition, the ordering of records in the file, or the availability of indices.
• Binary Search: Possible only when the file is sorted on the attribute being searched, so that the search can compare against that attribute's values.
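
As a rough illustration of the two search strategies, the sketch below runs both over an in-memory list of records. The record layout (`id`, `name`) and the function names are assumptions made for this example; a real DBMS works on disk blocks rather than Python lists.

```python
def linear_search(records, key, value):
    """Test every record; works for any ordering and any condition."""
    return [r for r in records if r[key] == value]

def binary_search(sorted_records, key, value):
    """Equality search that assumes the records are sorted on `key`."""
    lo, hi = 0, len(sorted_records) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        current = sorted_records[mid][key]
        if current == value:
            return sorted_records[mid]
        if current < value:
            lo = mid + 1   # the match, if any, lies in the upper half
        else:
            hi = mid - 1   # the match, if any, lies in the lower half
    return None

# Hypothetical file of 100 records, already sorted on "id".
records = [{"id": n, "name": f"rec{n}"} for n in range(1, 101)]

found_linear = linear_search(records, "id", 42)
found_binary = binary_search(records, "id", 42)
missing = binary_search(records, "id", 1000)
```

Linear search inspects up to all 100 records, while binary search needs at most 7 probes here, but only because the list is sorted on the search attribute.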

Index: An index in a DBMS contains key values from the indexed columns and pointers (references) to the records that hold them.

SELECT using index values:

i) Using primary key:

• An equality comparison on a key attribute returns a single record.
• A comparison condition (<, >, <=, >=) on a key attribute can return multiple records.

ii) Using secondary key:

• An equality comparison is made on a non-key attribute.
• If the field has only unique values in the relation, a single record is returned.
• If duplicates exist, multiple records are returned.
Eg: Order (primary key: O_ID)

O_ID  P_NAME  C_ID
1     AAA     1001
2     BBB     1002
3     CCC     1001

Linear, Binary: σ (O_ID = 2) (Order)

Primary – single: σ (O_ID = 3) (Order)

Primary – multiple: σ (O_ID >= 2) (Order)

Secondary – single: σ (C_ID = 1002) (Order)

Secondary – multiple: σ (C_ID = 1001) (Order) (or) σ (C_ID > 1000) (Order)
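
The index-based selections on the Order table can be sketched as follows. This is an illustration only: the dictionaries stand in for a primary index on O_ID and a secondary index on C_ID, and the function names are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict

orders = [
    {"O_ID": 1, "P_NAME": "AAA", "C_ID": 1001},
    {"O_ID": 2, "P_NAME": "BBB", "C_ID": 1002},
    {"O_ID": 3, "P_NAME": "CCC", "C_ID": 1001},
]

# Primary index: unique key -> record.
primary_index = {row["O_ID"]: row for row in orders}
sorted_keys = sorted(primary_index)

# Secondary index: non-key value -> list of primary keys (duplicates allowed).
secondary_index = defaultdict(list)
for row in orders:
    secondary_index[row["C_ID"]].append(row["O_ID"])

def primary_equal(o_id):
    """Equality on the key attribute: at most one record."""
    row = primary_index.get(o_id)
    return [row] if row else []

def primary_at_least(o_id):
    """Range condition (>=) on the key attribute: possibly many records."""
    start = bisect_left(sorted_keys, o_id)
    return [primary_index[k] for k in sorted_keys[start:]]

def secondary_equal(c_id):
    """Equality on a non-key attribute: one or many records."""
    return [primary_index[k] for k in secondary_index.get(c_id, [])]

single = primary_equal(3)
multiple = primary_at_least(2)
sec_single = secondary_equal(1002)
sec_multiple = secondary_equal(1001)
```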
For complex queries:

• Conjunctive Select using Individual Index – The database scans separate indexes for each condition and finds the intersection of matching records.

• Conjunctive Select using Composite Index – The composite index directly filters records based on multiple conditions together, improving efficiency.

• Conjunctive Select using Record Pointers – The indexes retrieve record pointers for each condition, and only records matching all conditions are selected.

• Disjunctive Selection by Union of Identifiers – The result includes records that satisfy at least one of the given conditions.
Eg:

O_ID  P_ID  C_ID
1     101   1001
2     102   1002
3     101   1003
4     103   1001
5     102   1003

Primary key – O_ID
Composite key – (P_ID, C_ID)
Query: σ (P_ID = 101 AND C_ID = 1003) (Order)
1. Conjunctive selection using individual indexes: the two conditions are evaluated separately and the results are finally intersected.
2. Conjunctive selection using composite index: the composite keys are (101,1001), (102,1002), (101,1003) … (102,1003).
3. Conjunctive selection using record pointers:
For P_ID = 101, the record pointers (O_ID) are 1, 3
For C_ID = 1003, the record pointers (O_ID) are 3, 5
Intersecting both, we get:

O_ID  P_ID  C_ID
3     101   1003

4. Disjunctive selection by union of identifiers:

Query: σ (P_ID = 101 OR C_ID = 1002) (Order)

O_ID  P_ID  C_ID
1     101   1001
2     102   1002
3     101   1003
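
The intersection and union of record pointers in this example can be reproduced with a short sketch. The index structures below are plain dictionaries of sets, an assumption made for illustration; the variable names are hypothetical.

```python
from collections import defaultdict

# O_ID -> (P_ID, C_ID), as in the example table.
orders = {
    1: (101, 1001),
    2: (102, 1002),
    3: (101, 1003),
    4: (103, 1001),
    5: (102, 1003),
}

# One index per attribute, plus a composite index on (P_ID, C_ID).
p_index = defaultdict(set)
c_index = defaultdict(set)
composite_index = defaultdict(set)
for o_id, (p_id, c_id) in orders.items():
    p_index[p_id].add(o_id)
    c_index[c_id].add(o_id)
    composite_index[(p_id, c_id)].add(o_id)

# Conjunctive selection: intersect the record pointers of both conditions.
conj_by_pointers = p_index[101] & c_index[1003]      # {1, 3} & {3, 5}

# Conjunctive selection with the composite index: a single lookup.
conj_by_composite = composite_index[(101, 1003)]

# Disjunctive selection: union of the record pointers.
disj_by_union = p_index[101] | c_index[1002]         # {1, 3} | {2}
```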
ALGORITHMS FOR JOIN OPERATION

Employee (E):

Emp_ID  Name     Dept_ID
1       Alice    101
2       Bob      102
3       Charlie  101
4       David    103

Department (D):

Dept_ID  Dept_Name
101      HR
102      IT
103      Sales
1. Nested Loop Join (Brute Force):

• It compares each row in the Employee (E) table with every row in the Department (D) table.
Example:
• Alice (Dept_ID = 101) is compared with all Dept_IDs in D. It matches with HR.
• Bob (Dept_ID = 102) is compared with all Dept_IDs in D. It matches with IT.
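
A minimal sketch of the nested loop join on the Employee and Department tables, with rows held as Python tuples (an assumption for illustration). The comparison counter is added only to show the brute-force cost.

```python
employees = [(1, "Alice", 101), (2, "Bob", 102), (3, "Charlie", 101), (4, "David", 103)]
departments = [(101, "HR"), (102, "IT"), (103, "Sales")]

def nested_loop_join(outer, inner):
    """Compare every outer row with every inner row."""
    result = []
    comparisons = 0
    for emp_id, name, emp_dept in outer:
        for dept_id, dept_name in inner:
            comparisons += 1
            if emp_dept == dept_id:
                result.append((emp_id, name, emp_dept, dept_name))
    return result, comparisons

joined, comparisons = nested_loop_join(employees, departments)
```

With 4 employees and 3 departments the join performs 4 × 3 = 12 comparisons, regardless of how many rows actually match.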
2. Single Loop Join (Index Nested Loop Join):
• Uses an index on the join attribute Dept_ID for fast lookup.
• Instead of checking every row, it directly finds matches using indexing.

Index for Dept_ID:

Dept_ID  Pointer to Employee Row
101      (1, Alice, 101), (3, Charlie, 101)
102      (2, Bob, 102)
103      (4, David, 103)
3. Sort-Merge Join (SMJ):

• Used when both tables are sorted on the join key.
• Efficient for large datasets because, once sorted, each table is scanned only once instead of being rescanned for every row.
• Works in two steps:
o Sorting Phase – Both tables are sorted based on the join key.
o Merge Phase – Tables are scanned once in order, matching values efficiently.
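
The two phases can be sketched as below. This is an illustrative in-memory version that assumes rows are tuples and that the caller supplies key-extraction functions; it also handles duplicate keys, as with the two employees in department 101.

```python
def sort_merge_join(left, right, left_key, right_key):
    # Sorting phase: order both inputs on the join key.
    left = sorted(left, key=left_key)
    right = sorted(right, key=right_key)

    # Merge phase: advance two cursors through the sorted inputs.
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left_key(left[i]), right_key(right[j])
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right_key(right[j_end]) == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left_key(left[i]) == lk:
                for r in right[j:j_end]:
                    result.append(left[i] + r)
                i += 1
            j = j_end
    return result

employees = [(1, "Alice", 101), (2, "Bob", 102), (3, "Charlie", 101), (4, "David", 103)]
departments = [(101, "HR"), (102, "IT"), (103, "Sales")]

joined = sort_merge_join(
    employees, departments,
    left_key=lambda e: e[2],   # Employee.Dept_ID
    right_key=lambda d: d[0],  # Department.Dept_ID
)
```

Each output tuple is the employee row followed by the matching department row, so Dept_ID appears twice in the result.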
4. Hash Join:
• Works by hashing one table (the smaller one) and probing the other.
• Used when no sorting or index exists on the join column.
• Works in two steps:
o Build Phase – Create a hash table for the smaller table (e.g., Department).
o Probe Phase – Scan the larger table (Employee) and match using the hash table.

Hash table (Fn = Dept_ID % 10):

Hash value  Dept_Name
3           Sales
2           IT
1           HR

Resultant table:

Emp_ID  Name     Dept_ID  Dept_Name
1       Alice    101      HR
3       Charlie  101      HR
2       Bob      102      IT
4       David    103      Sales
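
The build and probe phases can be sketched with the slide's hash function Dept_ID % 10. The bucket layout and function names are assumptions for illustration; the equality check inside a bucket is there because different Dept_IDs could hash to the same value.

```python
from collections import defaultdict

employees = [(1, "Alice", 101), (2, "Bob", 102), (3, "Charlie", 101), (4, "David", 103)]
departments = [(101, "HR"), (102, "IT"), (103, "Sales")]

def h(dept_id):
    """Hash function from the slide: Dept_ID % 10."""
    return dept_id % 10

def hash_join(build_rows, probe_rows):
    # Build phase: hash the smaller table (Department) into buckets.
    buckets = defaultdict(list)
    for dept_id, dept_name in build_rows:
        buckets[h(dept_id)].append((dept_id, dept_name))

    # Probe phase: scan the larger table (Employee) once.
    result = []
    for emp_id, name, emp_dept in probe_rows:
        for dept_id, dept_name in buckets.get(h(emp_dept), []):
            if dept_id == emp_dept:  # guard against hash collisions
                result.append((emp_id, name, emp_dept, dept_name))
    return buckets, result

buckets, joined = hash_join(departments, employees)
```

The output follows the probe order of Employee, so the rows appear in a different order than in the resultant table above, but the set of rows is the same.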
Final Conclusion:

• All joins produce the same result; only the processing method differs.
• Nested Loop: Simple but slow.
• Index Nested Loop: Faster if an index is available.
• Sort-Merge: Efficient if data is already sorted.
• Hash Join: Best for large datasets with enough memory.
QUERY OPTIMIZATION USING HEURISTICS
Introduction:
Heuristic query optimization applies predefined rules (heuristics) to rearrange and simplify a query before execution, improving efficiency. These rules help reduce query cost by minimizing data retrieval and computation overhead.

Rules:
1) Draw the initial query tree.
2) Move SELECT operations down the tree.
3) Apply the more restrictive SELECT operations first.
4) Replace CARTESIAN PRODUCT and SELECT operations with JOIN operations.
5) Move PROJECT operations down the tree.

Eg:
Employee (Fname, Lname, ssn, Bdate, Address, Dno);
Works_for (Essn, Pno, hours);
Project (Pname, Pnum, Plocation, Dnum)
Step – 1: Initial (canonical) query tree for SQL query Q.
Step – 2: Moving SELECT operations down the query tree.
Step – 3: Applying the more restrictive SELECT operation first.
Step – 4: Replacing CARTESIAN PRODUCT and SELECT with JOIN operations.
Step – 5: Moving PROJECT operations down the query tree.
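
The effect of these rules can be illustrated with a small sketch. The query Q is not reproduced in the text, so the query below (last names of employees working on the project named "Beta") and all sample rows are hypothetical; only the relation and attribute names follow the schema above.

```python
from itertools import product

employee = [{"ssn": s, "Lname": n} for s, n in
            [(1, "Smith"), (2, "Wong"), (3, "Zelaya"), (4, "Wallace")]]
works_for = [{"Essn": e, "Pno": p} for e, p in
             [(1, 10), (2, 10), (2, 20), (3, 30), (4, 20)]]
project = [{"Pnum": 10, "Pname": "Alpha"},
           {"Pnum": 20, "Pname": "Beta"},
           {"Pnum": 30, "Pname": "Gamma"}]

# Canonical tree: CARTESIAN PRODUCT of all relations, then SELECT, then PROJECT.
cartesian = list(product(employee, works_for, project))
naive = [e["Lname"] for e, w, p in cartesian
         if p["Pname"] == "Beta" and p["Pnum"] == w["Pno"] and w["Essn"] == e["ssn"]]

# Heuristic plan: apply the restrictive SELECT first, then JOIN step by step,
# and carry only the attribute needed for the next step (PROJECT pushed down).
beta_pnums = {p["Pnum"] for p in project if p["Pname"] == "Beta"}
essns = [w["Essn"] for w in works_for if w["Pno"] in beta_pnums]
optimized = [e["Lname"] for essn in essns for e in employee if e["ssn"] == essn]

naive_intermediate_rows = len(cartesian)   # rows built before any filtering
optimized_intermediate_rows = len(essns)   # rows surviving the early SELECT
```

Both plans return the same names, but the canonical plan materializes 4 × 5 × 3 = 60 combined rows, while the rearranged plan carries only 2 rows into the final join.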
COST ESTIMATION
• The main aim of query optimization is to choose the most efficient way of implementing the relational algebra operations at the lowest possible cost.
• The query optimizer should not depend solely on heuristic rules; it should also estimate the cost of executing the different strategies and find the strategy with the minimum cost estimate.
• The cost functions are only estimates and not exact values.
• The cost depends on the cardinality of the inputs.
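
As a rough sketch of how cost estimates follow from input cardinality, the functions below use common textbook-style block-access approximations. The formulas, the blocking factor of 10 rows per block and the block counts for the join are assumptions chosen for illustration, not values from the slides.

```python
import math

def blocks(num_rows, rows_per_block):
    """Number of disk blocks needed to hold the relation."""
    return math.ceil(num_rows / rows_per_block)

def linear_search_cost(b):
    """Worst case: every block of the file is read."""
    return b

def binary_search_cost(b):
    """Roughly log2(b) block reads on a file sorted on the search attribute."""
    return math.ceil(math.log2(b))

def nested_loop_join_cost(b_outer, b_inner):
    """Read the outer file once and the inner file once per outer block."""
    return b_outer + b_outer * b_inner

# 1000 rows (as in the earlier example), assumed 10 rows per block.
b_rel = blocks(1000, 10)

linear = linear_search_cost(b_rel)
binary = binary_search_cost(b_rel)

# Hypothetical join inputs: a 100-block table and a 5-block table.
cost_large_outer = nested_loop_join_cost(100, 5)
cost_small_outer = nested_loop_join_cost(5, 100)
```

Under these assumptions the estimates change with the size of the inputs and with which table is chosen as the outer one, which is why the optimizer compares strategies by estimated cost rather than relying on heuristics alone.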
Cost Components of Query Execution
• The cost of executing a query includes the following components:
o Access cost to secondary storage.
o Storage cost.
o Computation cost.
o Memory usage cost.
o Communication cost.
i) Access Cost to Secondary Storage
• Disk I/O Cost → Reading/writing tables and indexes from disk.
• Index Lookup Cost → Searching for indexed records.
• Sequential vs. Random Access Cost → Sequential scans are cheaper than random accesses.

ii) Storage Cost
• Data Storage Cost → Space occupied by tables and indexes.
• Index Storage Cost → Extra space required for maintaining indexes.
• Temporary Storage Cost → Space needed for intermediate query results.

iii) Computation Cost
• CPU Cost → Processing operations like filtering (WHERE), sorting (ORDER BY), joining, and aggregation.
• Function Evaluation Cost → Cost of executing functions (e.g., AVG(), SUM()).
• Join Operation Cost → Nested loop, hash join, or sort-merge join computation costs.

iv) Memory Usage Cost
• Buffer Pool Cost → Memory required to store frequently accessed pages.
• Sorting and Hashing Cost → Memory used in operations like sorting (ORDER BY) and hashing (HASH JOIN).
• Intermediate Result Storage Cost → Memory used for temporary query results.

v) Communication Cost (For Distributed Databases)
• Data Transfer Cost → Cost of sending query results between servers.
• Query Coordination Cost → Overhead of coordinating execution across multiple nodes.
• Network Latency Cost → Time delay in data transmission over the network.
