ADB - Unit - III (Chapter-2) - Query Processing and Decomposition

Uploaded by

Tapaswini Desaboina

Advanced Databases

UNIT: III (Chapter-2)

Query Processing and Decomposition

Reference: Chapters 6 & 7, Principles of Distributed Database Systems, M. Tamer Özsu and Patrick Valduriez, 3rd Edition, Springer

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/1

Outline
• Objectives of Query Processing
• Characterization of query processors
• Layers of query processing
• Query decomposition
• Localization of distributed data


Query Processing in a DDBMS
high-level user query
↓
query processor
↓
low-level data manipulation commands for D-DBMS

Query Processing Components
• Query language that is used

➡ SQL

• Query execution methodology

➡ The steps that one goes through in executing high-level (declarative) user
queries.

• Query optimization

➡ How do we determine the “best” execution plan?

• We assume a homogeneous D-DBMS


Selecting Alternatives

EMP(ENO, ENAME, TITLE)

ASG(ENO, PNO, RESP, DUR)

SELECT ENAME
FROM EMP,ASG
WHERE EMP.ENO = ASG.ENO
AND RESP = "Manager"


Strategy 1
Π_ENAME(σ_{RESP="Manager" ∧ EMP.ENO=ASG.ENO}(EMP × ASG))

Strategy 2
Π_ENAME(EMP ⋈_ENO σ_{RESP="Manager"}(ASG))

Strategy 2 avoids the Cartesian product, so it may be “better”.
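The two strategies can be compared on a toy instance. The sketch below is illustrative only: the rows in EMP and ASG are invented, and the function names are hypothetical. It shows that both strategies return the same names while Strategy 1 materializes a much larger intermediate result.

```python
from itertools import product

# Invented sample rows for EMP(ENO, ENAME, TITLE) and ASG(ENO, PNO, RESP, DUR).
EMP = [("E1", "J. Doe", "Elect. Eng."),
       ("E2", "M. Smith", "Analyst"),
       ("E3", "A. Lee", "Programmer")]
ASG = [("E1", "P1", "Manager", 12),
       ("E2", "P1", "Analyst", 24),
       ("E3", "P2", "Manager", 36)]

def strategy_1(emp, asg):
    """Cartesian product first, then selection and projection."""
    pairs = list(product(emp, asg))  # EMP x ASG
    names = {e[1] for e, a in pairs if a[2] == "Manager" and e[0] == a[0]}
    return names, len(pairs)

def strategy_2(emp, asg):
    """Select the managers first, then join on ENO and project ENAME."""
    managers = [a for a in asg if a[2] == "Manager"]
    manager_enos = {a[0] for a in managers}
    names = {e[1] for e in emp if e[0] in manager_enos}
    return names, len(managers)

names_1, size_1 = strategy_1(EMP, ASG)
names_2, size_2 = strategy_2(EMP, ASG)
print(names_1 == names_2, size_1, size_2)
```

On this instance the intermediate result has 9 tuples for Strategy 1 and 2 tuples for Strategy 2.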

What is the Problem?
Fragments and sites
➡ Site 1: ASG1 = σ_{ENO≤"E3"}(ASG)
➡ Site 2: ASG2 = σ_{ENO>"E3"}(ASG)
➡ Site 3: EMP1 = σ_{ENO≤"E3"}(EMP)
➡ Site 4: EMP2 = σ_{ENO>"E3"}(EMP)
➡ Site 5: Result

Strategy-A
➡ Site 1: ASG1' = σ_{RESP="Manager"}(ASG1); Site 2: ASG2' = σ_{RESP="Manager"}(ASG2)
➡ ASG1' is sent to Site 3 and ASG2' to Site 4
➡ Site 3: EMP1' = EMP1 ⋈_ENO ASG1'; Site 4: EMP2' = EMP2 ⋈_ENO ASG2'
➡ EMP1' and EMP2' are sent to Site 5
➡ Site 5: result = EMP1' ∪ EMP2'

Strategy-B
➡ ASG1, ASG2, EMP1 and EMP2 are sent from Sites 1 to 4 to Site 5
➡ Site 5: result = (EMP1 ∪ EMP2) ⋈_ENO σ_{RESP="Manager"}(ASG1 ∪ ASG2)

Cost of Alternatives
• Assume
➡ size(EMP) = 400, size(ASG) = 1000, 20 Managers in ASG
➡ tuple access cost = 1 unit; tuple transfer cost = 10 units
• Strategy-A
➡ produce ASG': (10+10) × tuple access cost = 20
➡ transfer ASG' to the sites of EMP: (10+10) × tuple transfer cost = 200
➡ produce EMP': (10+10) × tuple access cost × 2 = 40
➡ transfer EMP' to result site: (10+10) × tuple transfer cost = 200
Total Cost = 460
• Strategy-B
➡ transfer EMP to site 5: 400 × tuple transfer cost = 4,000
➡ transfer ASG to site 5: 1000 × tuple transfer cost = 10,000
➡ produce ASG': 1000 × tuple access cost = 1,000
➡ join EMP and ASG': 400 × 20 × tuple access cost = 8,000
Total Cost = 23,000

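The arithmetic on this slide can be replayed in a few lines. This is a sketch under the slide's own assumptions (unit costs, relation sizes, and the 20 manager tuples split evenly over the two ASG fragments); the function and constant names are hypothetical.

```python
TUPLE_ACCESS = 1     # cost units per tuple accessed (from the slide)
TUPLE_TRANSFER = 10  # cost units per tuple transferred (from the slide)

EMP_SIZE = 400
ASG_SIZE = 1000
MANAGERS = 20
# Slide assumption: the managers are split evenly, 10 per ASG fragment.
MANAGERS_PER_FRAGMENT = MANAGERS // 2

def cost_strategy_a():
    selected = 2 * MANAGERS_PER_FRAGMENT        # (10 + 10) tuples in ASG'
    produce_asg = selected * TUPLE_ACCESS       # 20
    transfer_asg = selected * TUPLE_TRANSFER    # 200
    produce_emp = selected * TUPLE_ACCESS * 2   # 40
    transfer_emp = selected * TUPLE_TRANSFER    # 200
    return produce_asg + transfer_asg + produce_emp + transfer_emp

def cost_strategy_b():
    transfer_emp = EMP_SIZE * TUPLE_TRANSFER    # 4,000
    transfer_asg = ASG_SIZE * TUPLE_TRANSFER    # 10,000
    produce_asg = ASG_SIZE * TUPLE_ACCESS       # 1,000
    join = EMP_SIZE * MANAGERS * TUPLE_ACCESS   # 8,000
    return transfer_emp + transfer_asg + produce_asg + join

print(cost_strategy_a(), cost_strategy_b())  # 460 23000
```

Under these assumptions Strategy-B costs 50 times as much as Strategy-A, almost entirely because whole relations are shipped to site 5.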
Objectives of Query Processing
• Minimize a cost function
I/O cost + CPU cost + Communication cost
These might have different weights in different distributed environments
• Wide area networks
➡ Communication cost may dominate or vary widely
✦ bandwidth
✦ speed
✦ high protocol overhead
• Local area networks
➡ communication cost not that dominant
➡ total cost function should be considered
• Can also maximize throughput

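The weighted cost function can be sketched as follows. The weights and the resource figures of the two plans are invented for illustration, not measurements; they only show how the same two plans can rank differently once communication is weighted heavily, as in a wide area network.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Weights:
    io: float
    cpu: float
    comm: float

def total_cost(io_units, cpu_units, comm_units, weights):
    """Weighted sum: I/O cost + CPU cost + communication cost."""
    return (weights.io * io_units
            + weights.cpu * cpu_units
            + weights.comm * comm_units)

# Hypothetical weights: communication is far more expensive on the WAN.
WAN = Weights(io=1.0, cpu=1.0, comm=100.0)
LAN = Weights(io=1.0, cpu=1.0, comm=2.0)

# Hypothetical plans as (io_units, cpu_units, comm_units).
PLAN_X = (100, 100, 50)  # little local work, ships more data
PLAN_Y = (500, 500, 5)   # more local work, ships less data

def cheaper_plan(weights):
    cost_x = total_cost(*PLAN_X, weights)
    cost_y = total_cost(*PLAN_Y, weights)
    return "X" if cost_x < cost_y else "Y"

print(cheaper_plan(WAN), cheaper_plan(LAN))
```

With these made-up numbers the plan that ships less data wins under the WAN weights, while the plan with less local work wins under the LAN weights.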

Characteristics of Query Processors

• Languages
• Types of Optimizers
• Optimization Timing
• Statistics
• Decision Sites
• Network Topology
• Replicated Fragments
• Use of Semijoins

Characteristics of Query Processors
Complexity of Relational Operations
• Assume
➡ relations of cardinality n
➡ sequential scan

Operation                                            Complexity
Select, Project (without duplicate elimination)      O(n)
Project (with duplicate elimination), Group          O(n log n)
Join, Semi-join, Division, Set Operators             O(n log n)
Cartesian Product                                    O(n²)

Characteristics of Query Processors
Types of Optimization
• Exhaustive search – all possible execution strategies are considered
➡ Cost-based

➡ Optimal

➡ Combinatorial complexity in the number of relations

• Heuristics
➡ Not optimal

➡ Regroup common sub-expressions

➡ Perform selection, projection first

➡ Replace a join by a series of semijoins

➡ Reorder operations to reduce intermediate relation size

➡ Optimize individual operations

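The combinatorial growth of exhaustive search can be illustrated with a small enumeration of left-deep join orders. This is a sketch, not the cost model of any particular system: the cardinalities, join selectivities, and the cost measure (sum of intermediate result sizes) are assumptions chosen for the example.

```python
from itertools import permutations
from math import factorial

# Hypothetical statistics.
CARD = {"EMP": 400, "ASG": 1000, "PROJ": 50}
# Join selectivities; a pair with no entry has no join predicate,
# so combining it is a Cartesian product (selectivity 1).
SEL = {frozenset({"EMP", "ASG"}): 1 / 400,
       frozenset({"ASG", "PROJ"}): 1 / 50}

def plan_cost(order):
    """Sum of intermediate result sizes for a left-deep join order."""
    joined = [order[0]]
    size = float(CARD[order[0]])
    cost = 0.0
    for rel in order[1:]:
        sel = 1.0
        for prev in joined:
            sel *= SEL.get(frozenset({prev, rel}), 1.0)
        size = size * CARD[rel] * sel
        cost += size
        joined.append(rel)
    return cost

def exhaustive_search(relations):
    """Consider every ordering and keep the cheapest one."""
    return min(permutations(relations), key=plan_cost)

best = exhaustive_search(list(CARD))
print(factorial(len(CARD)), best, round(plan_cost(best)))
```

Three relations give 3! = 6 orderings; ten relations would already give 3,628,800, which is why heuristics are used to prune the search.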

Characteristics of Query Processors
Optimization Granularity
• Single query at a time

➡ Cannot use common intermediate results

• Multiple queries at a time

➡ Efficient if many similar queries

➡ Decision space is much larger


Characteristics of Query Processors
Optimization Timing
• Static
➡ Compilation ⇒ optimize prior to the execution
➡ Difficult to estimate the size of the intermediate results ⇒ error propagation
➡ Can amortize over many executions
• Dynamic
➡ Run time optimization
➡ Exact information on the intermediate relation sizes
➡ Have to reoptimize for multiple executions
➡ Distributed INGRES
• Hybrid
➡ Compile using a static algorithm
➡ If the error in estimated sizes > threshold, reoptimize at run time
➡ Mermaid


Characteristics of Query Processors
Statistics
• Relation
➡ Cardinality
➡ Size of a tuple
➡ Fraction of tuples participating in a join with another relation
• Attribute
➡ Cardinality of domain
➡ Actual number of distinct values
• Common assumptions
➡ Independence between different attribute values
➡ Uniform distribution of attribute values within their domain

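The two common assumptions translate directly into a simple size estimate. The sketch below is illustrative: the function names and the statistics (1000 tuples, 5 distinct RESP values, 10 distinct DUR values) are hypothetical.

```python
def equality_selectivity(distinct_values):
    """Uniformity assumption: every distinct value is equally likely."""
    return 1.0 / distinct_values

def conjunction_selectivity(selectivities):
    """Independence assumption: selectivities on different attributes multiply."""
    result = 1.0
    for s in selectivities:
        result *= s
    return result

def estimated_cardinality(cardinality, selectivities):
    return cardinality * conjunction_selectivity(selectivities)

# Hypothetical statistics for a relation with two equality predicates.
estimate = estimated_cardinality(
    1000, [equality_selectivity(5), equality_selectivity(10)])
print(round(estimate))
```

When attribute values are skewed or correlated, these assumptions can be far off, which is one source of the estimation errors that static optimization has to tolerate.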

Characteristics of Query Processors
Decision Sites
• Centralized
➡ Single site determines the “best” schedule
➡ Simple
➡ Need knowledge about the entire distributed database

• Distributed
➡ Cooperation among sites to determine the schedule
➡ Need only local information
➡ Cost of cooperation

• Hybrid
➡ One site determines the global schedule
➡ Each site optimizes the local subqueries


Characteristics of Query Processors
Network Topology
• Wide area networks (WAN) – point-to-point
➡ Characteristics
✦ Low bandwidth
✦ Low speed
✦ High protocol overhead
➡ Communication cost will dominate; ignore all other cost factors
➡ Global schedule to minimize communication cost
➡ Local schedules according to centralized query optimization

• Local area networks (LAN)

➡ Communication cost not that dominant
➡ Total cost function should be considered
➡ Broadcasting can be exploited (joins)
➡ Special algorithms exist for star networks


Layering Scheme for Distributed Query Processing

Calculus Query on Distributed Relations
↓
Query Decomposition (control site; uses the GLOBAL SCHEMA)
↓
Algebraic Query on Distributed Relations
↓
Data Localization (control site; uses the FRAGMENT SCHEMA)
↓
Fragment Query
↓
Global Optimization (control site; uses STATS ON FRAGMENTS)
↓
Optimized Fragment Query with Communication Operations
↓
Local Optimization (local sites; uses the LOCAL SCHEMAS)
↓
Optimized Local Queries

Query Decomposition & Localization of Distributed Data

Query Decomposition
Input : Calculus query on global relations
• Normalization
➡ manipulate query quantifiers and qualification
• Semantic Analysis
➡ detect and reject “incorrect” queries
➡ possible for only a subset of relational calculus
• Simplification
➡ eliminate redundant predicates
• Restructuring
➡ calculus query → algebraic query
➡ more than one translation is possible
➡ use transformation rules


Normalization
• Lexical and syntactic analysis
➡ check validity (similar to compilers)

➡ check for attributes and relations

➡ type checking on the qualification

• Put into normal form

➡ Conjunctive normal form

(p11 ∨ p12 ∨ … ∨ p1n) ∧ … ∧ (pm1 ∨ pm2 ∨ … ∨ pmn)

➡ Disjunctive normal form

(p11 ∧ p12 ∧ … ∧ p1n) ∨ … ∨ (pm1 ∧ pm2 ∧ … ∧ pmn)

➡ ORs mapped into union

➡ ANDs mapped into join or selection

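That the two normal forms express the same qualification can be checked by brute force over a truth table. The sketch below uses a small qualification of the form p ∧ (d12 ∨ d24); the variable names are hypothetical stand-ins for simple predicates such as PNAME = "CAD/CAM", DUR = 12 and DUR = 24.

```python
from itertools import product

def conjunctive_form(p, d12, d24):
    # p AND (d12 OR d24): a conjunction of disjunctions
    return p and (d12 or d24)

def disjunctive_form(p, d12, d24):
    # (p AND d12) OR (p AND d24): a disjunction of conjunctions
    return (p and d12) or (p and d24)

equivalent = all(
    conjunctive_form(*values) == disjunctive_form(*values)
    for values in product([False, True], repeat=3)
)
print(equivalent)
```

In the disjunctive form each conjunct can be evaluated as its own selection and the results combined with a union, which is the sense in which ORs map into union.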
Semantic Analysis
• Refute (detect and reject) incorrect queries
• Type incorrect
➡ If any of its attribute or relation names are not defined in the global schema
➡ If operations are applied to attributes of the wrong type
• Semantically incorrect
➡ Components do not contribute in any way to the generation of the result
➡ Only a subset of relational calculus queries can be tested for correctness
➡ Those that do not contain disjunction and negation
➡ To detect
✦ connection graph (query graph)
✦ join graph


Semantic Analysis – Example
SELECT ENAME,RESP
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND ASG.PNO = PROJ.PNO
AND PNAME = "CAD/CAM"
AND DUR ≥ 36
AND TITLE = "Programmer"

Query graph
Nodes: EMP, ASG, PROJ, RESULT
Edges:
EMP - ASG: EMP.ENO=ASG.ENO
ASG - PROJ: ASG.PNO=PROJ.PNO
EMP - RESULT: ENAME
ASG - RESULT: RESP
Selections: DUR≥36 on ASG, TITLE=“Programmer” on EMP, PNAME=“CAD/CAM” on PROJ

Join graph
Nodes: EMP, ASG, PROJ
Edges:
EMP - ASG: EMP.ENO=ASG.ENO
ASG - PROJ: ASG.PNO=PROJ.PNO

Semantic Analysis
If the query graph is not connected, the query may be wrong (a join predicate is missing) or it may require a Cartesian product
SELECT ENAME,RESP
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND PNAME = "CAD/CAM"
AND DUR > 36
AND TITLE = "Programmer"

Query graph
EMP, ASG and RESULT are still linked (EMP.ENO=ASG.ENO, ENAME, RESP), but PROJ carries only the selection PNAME=“CAD/CAM” and has no edge to any other node, so the graph is not connected.
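The connectivity test can be sketched as a plain graph traversal. The node and edge lists below are transcribed from the two example queries (selection predicates are self-loops and do not affect connectivity, so they are omitted); the function name is hypothetical.

```python
def is_connected(nodes, edges):
    """Depth-first search: True if every node is reachable from the first one."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current] - seen:
            seen.add(neighbour)
            stack.append(neighbour)
    return seen == nodes

NODES = ["EMP", "ASG", "PROJ", "RESULT"]
RESULT_EDGES = [("EMP", "RESULT"), ("ASG", "RESULT")]  # ENAME, RESP

# First query: both join predicates are present.
correct_edges = RESULT_EDGES + [("EMP", "ASG"), ("ASG", "PROJ")]
# Second query: ASG.PNO = PROJ.PNO is missing.
faulty_edges = RESULT_EDGES + [("EMP", "ASG")]

print(is_connected(NODES, correct_edges), is_connected(NODES, faulty_edges))
```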

Simplification
• Why simplify?

➡ Remember the example

• How? Use transformation rules

➡ Elimination of redundancy
✦ idempotency rules
p1 ∧ ¬(p1) ⇔ false
p1 ∧ (p1 ∨ p2) ⇔ p1
p1 ∨ false ⇔ p1
…
➡ Application of transitivity
➡ Use of integrity rules


Simplification – Example
SELECT TITLE
FROM EMP
WHERE EMP.ENAME = "J. Doe"
OR (NOT(EMP.TITLE = "Programmer")
AND (EMP.TITLE = "Programmer"
OR EMP.TITLE = "Elect. Eng.")
AND NOT(EMP.TITLE = "Elect. Eng."))

⇓
SELECT TITLE
FROM EMP
WHERE EMP.ENAME = "J. Doe"

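The simplification can be verified with a truth table. In the sketch below the three boolean parameters are hypothetical stand-ins for the simple predicates ENAME = "J. Doe", TITLE = "Programmer" and TITLE = "Elect. Eng."; the check treats them as independent propositions, which is stronger than needed.

```python
from itertools import product

def original(is_doe, is_programmer, is_elect_eng):
    # ENAME = "J. Doe" OR (NOT prog AND (prog OR ee) AND NOT ee)
    return is_doe or (
        not is_programmer
        and (is_programmer or is_elect_eng)
        and not is_elect_eng
    )

def simplified(is_doe, is_programmer, is_elect_eng):
    # ENAME = "J. Doe"
    return is_doe

equivalent = all(
    original(*values) == simplified(*values)
    for values in product([False, True], repeat=3)
)
print(equivalent)
```

The parenthesized part is always false: once both NOT terms hold, neither disjunct of (prog OR ee) can hold, so the whole qualification reduces to p1 ∨ false ⇔ p1.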

Restructuring
• Convert relational calculus to relational algebra
• Make use of query trees
• Example
Find the names of employees other than J. Doe who worked on the CAD/CAM project for either 1 or 2 years.
SELECT ENAME
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND ASG.PNO = PROJ.PNO
AND ENAME ≠ "J. Doe"
AND PNAME = "CAD/CAM"
AND (DUR = 12 OR DUR = 24)

Query tree (root first, children indented)
Π_ENAME (Project)
  σ_{DUR=12 ∨ DUR=24} (Select)
    σ_{PNAME=“CAD/CAM”} (Select)
      σ_{ENAME≠“J. Doe”} (Select)
        ⋈_PNO (Join)
          PROJ
          ⋈_ENO (Join)
            ASG
            EMP

Example
Recall the previous example:
Find the names of employees other than J. Doe who worked on the CAD/CAM project for either one or two years.
SELECT ENAME
FROM PROJ, ASG, EMP
WHERE ASG.ENO=EMP.ENO
AND ASG.PNO=PROJ.PNO
AND ENAME ≠ "J. Doe"
AND PROJ.PNAME="CAD/CAM"
AND (DUR=12 OR DUR=24)

Query tree
Π_ENAME(σ_{DUR=12 ∨ DUR=24}(σ_{PNAME=“CAD/CAM”}(σ_{ENAME≠“J. Doe”}(PROJ ⋈_PNO (ASG ⋈_ENO EMP)))))

Equivalent Query
Π_ENAME(σ_{PNAME=“CAD/CAM” ∧ (DUR=12 ∨ DUR=24) ∧ ENAME≠“J. Doe”}((EMP × PROJ) ⋈_{PNO,ENO} ASG))

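The equivalence of the two trees can be checked on a toy instance. The rows below are invented, the relations are cut down to the attributes the query uses, and the function names are hypothetical; the sketch assumes the second tree combines EMP and PROJ with a Cartesian product before joining ASG on both PNO and ENO.

```python
# Invented sample rows.
EMP = [("E1", "J. Doe"), ("E2", "M. Smith"), ("E3", "A. Lee")]  # (ENO, ENAME)
ASG = [("E1", "P1", 12), ("E2", "P1", 24),
       ("E3", "P1", 36), ("E2", "P2", 12)]                      # (ENO, PNO, DUR)
PROJ = [("P1", "CAD/CAM"), ("P2", "Maintenance")]               # (PNO, PNAME)

def tree_joins_first():
    """PROJ join_PNO (ASG join_ENO EMP), three selections, project ENAME."""
    asg_emp = [(a, e) for a in ASG for e in EMP if a[0] == e[0]]
    joined = [(p, a, e) for p in PROJ for a, e in asg_emp if p[0] == a[1]]
    return {e[1] for p, a, e in joined
            if e[1] != "J. Doe" and p[1] == "CAD/CAM" and a[2] in (12, 24)}

def tree_product_first():
    """(EMP x PROJ) join_{PNO,ENO} ASG, one combined selection, project ENAME."""
    emp_proj = [(e, p) for e in EMP for p in PROJ]
    joined = [(e, p, a) for e, p in emp_proj for a in ASG
              if a[0] == e[0] and a[1] == p[0]]
    return {e[1] for e, p, a in joined
            if p[1] == "CAD/CAM" and a[2] in (12, 24) and e[1] != "J. Doe"}

print(tree_joins_first() == tree_product_first())
```

Both trees return the same names, but the second one builds every EMP and PROJ combination first, which is why the choice among equivalent trees matters for cost.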
Data Localization
Input: Algebraic query on distributed relations
• Determine which fragments are involved
• Localization program
➡ substitute for each global relation its materialization program
➡ optimize

Example
Assume
➡ EMP is fragmented into EMP1, EMP2, EMP3 as follows:
✦ EMP1 = σ_{ENO≤“E3”}(EMP)
✦ EMP2 = σ_{“E3”<ENO≤“E6”}(EMP)
✦ EMP3 = σ_{ENO>“E6”}(EMP)
➡ ASG is fragmented into ASG1 and ASG2 as follows:
✦ ASG1 = σ_{ENO≤“E3”}(ASG)
✦ ASG2 = σ_{ENO>“E3”}(ASG)

Replace EMP by (EMP1 ∪ EMP2 ∪ EMP3) and ASG by (ASG1 ∪ ASG2) in any query:
Π_ENAME(σ_{DUR=12 ∨ DUR=24}(σ_{PNAME=“CAD/CAM”}(σ_{ENAME≠“J. Doe”}(PROJ ⋈_PNO ((EMP1 ∪ EMP2 ∪ EMP3) ⋈_ENO (ASG1 ∪ ASG2))))))

Provides Parallelism

(EMP1 ⋈_ENO ASG1) ∪ (EMP2 ⋈_ENO ASG2) ∪ (EMP3 ⋈_ENO ASG1) ∪ (EMP3 ⋈_ENO ASG2)

Eliminates Unnecessary Work

(EMP1 ⋈_ENO ASG1) ∪ (EMP2 ⋈_ENO ASG2) ∪ (EMP3 ⋈_ENO ASG2)

Reduction for PHF
• Reduction with selection
➡ Relation R and F_R = {R1, R2, …, Rw} where Rj = σ_{pj}(R)

σ_{pi}(Rj) = ∅ if ∀x in R: ¬(pi(x) ∧ pj(x))

➡ Example
SELECT *
FROM EMP
WHERE ENO="E5"

σ_{ENO=“E5”}(EMP1 ∪ EMP2 ∪ EMP3) reduces to σ_{ENO=“E5”}(EMP2)
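The selection reduction can be sketched by representing each fragment predicate as a range on ENO and discarding fragments whose range cannot contain the selected value. Several things here are assumptions: the half-open range encoding, the function names, treating EMP3 as ENO > "E6" so that the fragments are disjoint, and relying on plain string comparison, which only orders these employee numbers correctly because they have a single digit.

```python
# Fragment predicates as ranges (low, high], meaning low < ENO <= high.
# None means the range is unbounded on that side.
EMP_FRAGMENTS = {
    "EMP1": (None, "E3"),
    "EMP2": ("E3", "E6"),
    "EMP3": ("E6", None),
}

def may_contain(fragment_range, eno):
    """False when the fragment predicate contradicts ENO = eno."""
    low, high = fragment_range
    return (low is None or low < eno) and (high is None or eno <= high)

def reduce_selection(fragments, eno):
    """Keep only the fragments that can hold tuples with the given ENO."""
    return [name for name, rng in fragments.items() if may_contain(rng, eno)]

relevant = reduce_selection(EMP_FRAGMENTS, "E5")
print(relevant)
```

For ENO = "E5" only EMP2 survives, so the selections on EMP1 and EMP3 are known to be empty without accessing those fragments.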

• Reduction with join
➡ Possible if fragmentation is done on join attribute

➡ Distribute join over union

(R1 ∪ R2) ⋈ S ⇔ (R1 ⋈ S) ∪ (R2 ⋈ S)

➡ Given Ri = σ_{pi}(R) and Rj = σ_{pj}(R)

Ri ⋈ Rj = ∅ if ∀x in Ri, ∀y in Rj: ¬(pi(x) ∧ pj(y))

Reduction for PHF
• Assume EMP is fragmented as before and
➡ ASG1: σ_{ENO≤"E3"}(ASG)
➡ ASG2: σ_{ENO>"E3"}(ASG)
• Consider the query
SELECT *
FROM EMP,ASG
WHERE EMP.ENO=ASG.ENO

Generic query: (EMP1 ∪ EMP2 ∪ EMP3) ⋈_ENO (ASG1 ∪ ASG2)
• Distribute join over unions
• Apply the reduction rule

Reduced query: (EMP1 ⋈_ENO ASG1) ∪ (EMP2 ⋈_ENO ASG2) ∪ (EMP3 ⋈_ENO ASG2)
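The join reduction can be sketched by distributing the join over both unions and dropping every fragment pair whose ENO ranges cannot overlap. The range encoding, the function names, and EMP3 taken as ENO > "E6" are assumptions made for the illustration; the overlap test also assumes that any non-empty range may contain values.

```python
# Fragment predicates as ranges (low, high], meaning low < ENO <= high.
EMP_FRAGMENTS = {"EMP1": (None, "E3"), "EMP2": ("E3", "E6"), "EMP3": ("E6", None)}
ASG_FRAGMENTS = {"ASG1": (None, "E3"), "ASG2": ("E3", None)}

def ranges_overlap(r1, r2):
    """Two (low, high] ranges overlap when the larger low is below the smaller high."""
    lows = [low for low in (r1[0], r2[0]) if low is not None]
    highs = [high for high in (r1[1], r2[1]) if high is not None]
    if not lows or not highs:
        return True  # unbounded on one side in both ranges
    return max(lows) < min(highs)

def reduced_joins(left, right):
    """Distribute the join over the unions, then drop provably empty joins."""
    return [(l_name, r_name)
            for l_name, l_range in left.items()
            for r_name, r_range in right.items()
            if ranges_overlap(l_range, r_range)]

pairs = reduced_joins(EMP_FRAGMENTS, ASG_FRAGMENTS)
print(pairs)
```

Of the six fragment joins produced by distribution, three are dropped because their predicates contradict each other on the join attribute.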

Reduction for VF
• Find useless (not empty) intermediate relations

Relation R defined over attributes A = {A1, ..., An} vertically fragmented as Ri = Π_{A'}(R) where A' ⊆ A:
Π_{D,K}(Ri) is useless if the set of projection attributes D is not in A'
Example: EMP1 = Π_{ENO,ENAME}(EMP); EMP2 = Π_{ENO,TITLE}(EMP)

SELECT ENAME
FROM EMP

Π_ENAME(EMP1 ⋈_ENO EMP2) reduces to Π_ENAME(EMP1)

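The vertical-fragmentation rule can be sketched as a set computation. The sketch reads the rule as "a fragment is useless when it supplies none of the projected non-key attributes"; that reading, the function name, and the fallback for key-only projections are assumptions.

```python
# Attributes held by each vertical fragment; ENO is the key kept in both.
FRAGMENTS = {
    "EMP1": {"ENO", "ENAME"},
    "EMP2": {"ENO", "TITLE"},
}
KEY = {"ENO"}

def useful_fragments(fragments, projected, key):
    """Return the fragments that contribute a projected non-key attribute."""
    needed = set(projected) - set(key)
    if not needed:
        # Only key attributes are requested: any single fragment suffices.
        return [next(iter(fragments))]
    return [name for name, attrs in fragments.items() if attrs & needed]

print(useful_fragments(FRAGMENTS, {"ENAME"}, KEY))
```

For SELECT ENAME FROM EMP only EMP1 is kept, so the join with EMP2 disappears from the localized query.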
Reduction for DHF
• Rule:
➡ Distribute joins over unions
➡ Apply the join reduction for horizontal fragmentation
• Example
ASG1: ASG ⋉_ENO EMP1
ASG2: ASG ⋉_ENO EMP2
EMP1: σ_{TITLE=“Programmer”}(EMP)
EMP2: σ_{TITLE≠“Programmer”}(EMP)
• Query
SELECT *
FROM EMP, ASG
WHERE ASG.ENO = EMP.ENO
AND EMP.TITLE = "Mech. Eng."

Reduction for DHF
Generic query
(ASG1 ∪ ASG2) ⋈_ENO σ_{TITLE=“Mech. Eng.”}(EMP1 ∪ EMP2)

Selections first
(ASG1 ∪ ASG2) ⋈_ENO σ_{TITLE=“Mech. Eng.”}(EMP2)

Joins over unions
(ASG1 ⋈_ENO σ_{TITLE=“Mech. Eng.”}(EMP2)) ∪ (ASG2 ⋈_ENO σ_{TITLE=“Mech. Eng.”}(EMP2))

Elimination of the empty intermediate relations (left sub-tree)
ASG2 ⋈_ENO σ_{TITLE=“Mech. Eng.”}(EMP2)

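The derived-fragmentation reduction can be checked on a toy instance. The rows are invented, the relations are cut down to the attributes involved, and the helper names are hypothetical; the sketch assumes EMP2 holds the employees whose title is not "Programmer".

```python
# Invented sample rows.
EMP = [("E1", "Programmer"), ("E2", "Mech. Eng."),
       ("E3", "Programmer"), ("E4", "Mech. Eng.")]            # (ENO, TITLE)
ASG = [("E1", "P1"), ("E2", "P1"), ("E3", "P2"), ("E4", "P2")]  # (ENO, PNO)

# Primary horizontal fragments of EMP.
EMP1 = [e for e in EMP if e[1] == "Programmer"]
EMP2 = [e for e in EMP if e[1] != "Programmer"]

def semijoin(asg, emp):
    """ASG semijoin_ENO EMP: assignments whose employee is in emp."""
    enos = {e[0] for e in emp}
    return [a for a in asg if a[0] in enos]

def join(asg, emp):
    """ASG join_ENO EMP as a list of (assignment, employee) pairs."""
    return [(a, e) for a in asg for e in emp if a[0] == e[0]]

# Derived horizontal fragments of ASG.
ASG1 = semijoin(ASG, EMP1)
ASG2 = semijoin(ASG, EMP2)

mech_emp2 = [e for e in EMP2 if e[1] == "Mech. Eng."]
left = join(ASG1, mech_emp2)    # ASG1 only references programmers
right = join(ASG2, mech_emp2)
full = join(ASG, [e for e in EMP if e[1] == "Mech. Eng."])

print(left == [], sorted(right) == sorted(full))
```

The left sub-tree is empty because ASG1 contains only assignments of EMP1 employees, so the reduced query needs just ASG2 and EMP2.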
Reduction for Hybrid Fragmentation
• Combine the rules already specified:
➡ Remove empty relations generated by contradicting selections on horizontal
fragments;

➡ Remove useless relations generated by projections on vertical fragments;

➡ Distribute joins over unions in order to isolate and remove useless joins.

Reduction for Hybrid Fragmentation
Example
Consider the following hybrid fragmentation:
EMP1 = σ_{ENO≤"E4"}(Π_{ENO,ENAME}(EMP))
EMP2 = σ_{ENO>"E4"}(Π_{ENO,ENAME}(EMP))
EMP3 = Π_{ENO,TITLE}(EMP)

and the query
SELECT ENAME
FROM EMP
WHERE ENO="E5"

Generic query: Π_ENAME(σ_{ENO=“E5”}((EMP1 ∪ EMP2) ⋈_ENO EMP3))
Reduced query: Π_ENAME(σ_{ENO=“E5”}(EMP2))
