
6 - Query Processing Updated

1. The document discusses distributed query processing in a horizontally and vertically fragmented database across multiple sites.
2. Query processing strategies are explored, including shipping relations, semi-joins, and performing joins locally at each site to minimize network communication costs.
3. An example query joining the Employee and Department tables stored across two sites is used to compare the costs of different query execution strategies, such as shipping all relations versus performing semi-joins.

Uploaded by

fatima

CIS 1902435 DDBMS

Query Processing

Chapter 7
Dr. Bassam Hammo
Distributed Query Processing
For centralized DBMSs, the primary performance concern is to minimize the number of disk accesses.

In a distributed system, we also need to minimize the communication cost among sites.

Distributed query processing involves:
– accessing the data stored locally, and
– accessing the data at other sites (our focus)
Querying Horizontally Fragmented Relations
Query: SELECT Balance FROM Branch WHERE Branch-Name = 'Amman'
Only Site 1 will be contacted.

Site 1: Branch1 = σ Branch-Name = 'Amman' (Branch)

Branch-Name   Account-Number   Balance
Amman         A-100            500
Amman         A-220            336
Amman         A-155            75

Site 2: Branch2 = σ Branch-Name = 'Zarqa' (Branch)

Branch-Name   Account-Number   Balance
Zarqa         A-175            210
Zarqa         A-420            1000
Zarqa         A-185            1127
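The routing decision above can be sketched in a few lines of Python. This is only an illustration: the `FRAGMENTS` catalog and the function names are hypothetical, and the sketch assumes each fragment is defined by a single Branch-Name value, as in the two fragments shown.

```python
# Hypothetical fragment catalog: each site stores the rows for one Branch-Name.
FRAGMENTS = {
    "Site 1": {
        "branch_name": "Amman",
        "rows": [("Amman", "A-100", 500), ("Amman", "A-220", 336), ("Amman", "A-155", 75)],
    },
    "Site 2": {
        "branch_name": "Zarqa",
        "rows": [("Zarqa", "A-175", 210), ("Zarqa", "A-420", 1000), ("Zarqa", "A-185", 1127)],
    },
}


def sites_to_contact(branch_name):
    """Return only the sites whose fragment predicate matches the query predicate."""
    return [site for site, frag in FRAGMENTS.items() if frag["branch_name"] == branch_name]


def select_balance(branch_name):
    """Evaluate SELECT Balance ... WHERE Branch-Name = branch_name on the relevant sites only."""
    balances = []
    for site in sites_to_contact(branch_name):
        for name, _account, balance in FRAGMENTS[site]["rows"]:
            if name == branch_name:
                balances.append(balance)
    return balances


print(sites_to_contact("Amman"))
print(select_balance("Amman"))
```

Because the query predicate matches the definition of Branch1 exactly, Site 2 is never touched and no data crosses the network for its rows.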


Querying Vertically Fragmented Relations
Assume the complete Account schema is (Account-ID, Customer-Name, Balance).
Query: SELECT Balance FROM Account
Only Site 2 will be contacted.

Site 1                          Site 2
Account-ID   Customer-Name      Account-ID   Balance
1            Ahmad              1            10000
2            Mohammad           2            50000
3            Bassam             3            15000
…            …                  …            …
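For vertical fragments, the choice of site depends on which columns the query asks for. The sketch below assumes a hypothetical catalog `VERTICAL_FRAGMENTS` that maps each site to the columns it stores; the function name and the simple "first covering site, otherwise collect sites in order" policy are illustrative choices, not something the slides prescribe.

```python
# Hypothetical catalog: which columns of Account each site stores.
VERTICAL_FRAGMENTS = {
    "Site 1": {"Account-ID", "Customer-Name"},
    "Site 2": {"Account-ID", "Balance"},
}


def sites_for_columns(requested):
    """Return the sites needed to answer a query over the requested columns."""
    requested = set(requested)
    # Prefer a single site that stores every requested column.
    for site, columns in VERTICAL_FRAGMENTS.items():
        if requested <= columns:
            return [site]
    # Otherwise collect sites until all requested columns are covered.
    chosen, missing = [], set(requested)
    for site, columns in VERTICAL_FRAGMENTS.items():
        if missing & columns:
            chosen.append(site)
            missing -= columns
    if missing:
        raise KeyError(f"no fragment stores: {sorted(missing)}")
    return chosen


print(sites_for_columns(["Balance"]))
print(sites_for_columns(["Customer-Name", "Balance"]))
```

A query that needs both Customer-Name and Balance has to contact both sites and rejoin the fragments on Account-ID, which is why that column is repeated in each fragment.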
Join Processing

Assume we have 3 relations R1, R2, R3, stored at sites S1, S2, S3, respectively.

Query: SELECT * FROM R1, R2, R3 WHERE <join conditions>

Assume the query is issued at site S1.

The system needs to display the results at site S1.
Simple Strategies
Query: SELECT * FROM R1, R2, R3 WHERE <join conditions>

Strategy 1: Bring copies of all three relations together at S1. R1 is already stored at S1, so only R2 and R3 are shipped.

S2 → S1: R2
S3 → S1: R3

Cost = network (size of R2 + size of R3) + cost of join
Strategy 2: Ship R3 to S2
– Compute temp1 = R2 ⋈ R3 at S2
– Ship temp1 from S2 to S1
– Compute temp2 = temp1 ⋈ R1 at S1

S3 → S2: R3
S2 → S1: temp1

Cost = network (size of R3 + size of temp1) + cost of 2 joins
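The network terms of the two cost formulas can be compared with a small sketch. The relation sizes below are made-up numbers chosen only for illustration (the slides give no sizes for R1, R2, R3), and the local join cost is left out.

```python
def strategy1_network_cost(size_r2, size_r3):
    """Strategy 1: ship R2 and R3 to S1."""
    return size_r2 + size_r3


def strategy2_network_cost(size_r3, size_temp1):
    """Strategy 2: ship R3 to S2, then ship temp1 = R2 join R3 to S1."""
    return size_r3 + size_temp1


# Hypothetical sizes in bytes; not taken from the slides.
sizes = {"R2": 40_000, "R3": 10_000, "temp1": 5_000}

cost1 = strategy1_network_cost(sizes["R2"], sizes["R3"])
cost2 = strategy2_network_cost(sizes["R3"], sizes["temp1"])
print(cost1, cost2)
```

With these assumed sizes Strategy 2 ships less data because temp1 is smaller than R2. If the join R2 ⋈ R3 produced a result larger than R2, Strategy 1 would be cheaper on the network.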
Strategy 3: Semi Join
Query: SELECT * FROM R1, R2 WHERE R1.id = R2.id
– Issued at S1 (R1 is stored at S1, R2 at S2)
– Ship R2 to S1 and perform the join at S1

Cost = network (12 integers) + cost of join
Strategy 4: Semi Join (Continued)
– S1 ships temp1 to S2
– S2 ships temp2 back to S1

Cost = network (3 + 2 = 5 integers) + cost of 2 joins
Semi Join
In general, to join R1 (at site S1) and R2 (at site S2):
– Get the common columns of R1 and R2
– Project R1 to temp1 on these columns
– Ship temp1 to S2
– Join temp1 and R2 (let the result be temp2)
– Ship temp2 to S1
– Join R1 with temp2
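The steps above can be sketched in Python with relations held as lists of dictionaries. This is a minimal in-memory illustration: the helper names `project`, `join`, and `semi_join_strategy` are hypothetical, "shipping" is represented only by counting the rows that would be sent, and both relations are assumed to be non-empty.

```python
def project(rows, columns):
    """Project rows onto the given columns and drop duplicate tuples."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(columns, key)))
    return out


def join(left, right, columns):
    """Nested-loop equi-join on the given common columns."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[c] == r[c] for c in columns)
    ]


def semi_join_strategy(r1, r2):
    """Join R1 (at S1) with R2 (at S2) following the semi-join steps."""
    common = [c for c in r1[0] if c in r2[0]]  # common columns of R1 and R2
    temp1 = project(r1, common)                # built at S1, shipped to S2
    temp2 = join(temp1, r2, common)            # built at S2, shipped to S1
    result = join(r1, temp2, common)           # final join at S1
    return result, len(temp1), len(temp2)


r1 = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}, {"id": 3, "a": "z"}]
r2 = [{"id": 2, "b": "p"}, {"id": 4, "b": "q"}]
result, shipped_to_s2, shipped_to_s1 = semi_join_strategy(r1, r2)
print(result, shipped_to_s2, shipped_to_s1)
```

The result equals the ordinary join of R1 and R2, but only the projected join column travels to S2 and only the matching R2 rows travel back.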
Example
Employee table (stored at S1): 1000 records, 60 bytes per record

Field   SSN   Fname   Lname   Dno
Bytes   10    20      20      10

Department table (stored at S2): 100 records, 35 bytes per record

Field   Dno   Dname   MSSN
Bytes   10    15      10

Query (issued at S3, the query site):

SELECT Fname, Lname, Dname
FROM Department, Employee
WHERE Department.Dno = Employee.Dno
Example
Find the “minimum cost” for answering this query.

This query will return 1000 records (every employee works for one department).

Each record will be 55 bytes long (Fname + Lname + Dname = 20 + 20 + 15 = 55).

Thus the result set will be 1000 * 55 = 55,000 bytes.
Example
Three alternatives:

1. Copy all EMPLOYEE and DEPARTMENT records to S3. Perform the join and display the results.

S1 → S3: Employee table, 1000 records × 60 bytes = 60,000 bytes
S2 → S3: Department table, 100 records × 35 bytes = 3,500 bytes
Total: 60,000 + 3,500 = 63,500 bytes
Example

2. Copy all EMPLOYEE records from S1 to S2. Execute the query at S2, then ship the results to S3.

S1 → S2: Employee table, 1000 records × 60 bytes = 60,000 bytes
S2 → S3: result set, 1000 records × 55 bytes = 55,000 bytes
Total: 60,000 + 55,000 = 115,000 bytes
Example

3. Copy all DEPARTMENT records from S2 to S1. Execute the query at S1, then ship the results to S3.

S2 → S1: Department table, 100 records × 35 bytes = 3,500 bytes
S1 → S3: result set, 1000 records × 55 bytes = 55,000 bytes
Total: 3,500 + 55,000 = 58,500 bytes
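The arithmetic for the three alternatives can be checked with a short script. The record counts and record sizes come from the example; the variable names and the dictionary layout are illustrative only, and local join cost is ignored, as it is in the example.

```python
# Sizes from the example, in bytes.
EMPLOYEE_BYTES = 1000 * 60           # 1000 records, 60 bytes each
DEPARTMENT_BYTES = 100 * 35          # 100 records, 35 bytes each
RESULT_BYTES = 1000 * (20 + 20 + 15) # Fname + Lname + Dname per result record

# Bytes transferred over the network for each alternative.
costs = {
    "1: ship both tables to S3": EMPLOYEE_BYTES + DEPARTMENT_BYTES,
    "2: ship Employee to S2, result to S3": EMPLOYEE_BYTES + RESULT_BYTES,
    "3: ship Department to S1, result to S3": DEPARTMENT_BYTES + RESULT_BYTES,
}

for name, cost in costs.items():
    print(f"{name}: {cost:,} bytes")

best = min(costs, key=costs.get)
print("cheapest:", best)
```

Among these three alternatives, shipping the small Department table to S1 transfers the least data (58,500 bytes).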
Semi-Join

Now let’s do the same example with semi-join: before shipping, project each table onto only the columns the query needs.

Employee projected on (Fname, Lname, Dno): 20 + 20 + 10 = 50 bytes per record, 1000 records
Department projected on (Dno, Dname): 10 + 15 = 25 bytes per record, 100 records

1. Ship both projected tables to S3. Perform the join and display the results.

S1 → S3: projected Employee table, 1000 records × 50 bytes = 50,000 bytes
S2 → S3: projected Department table, 100 records × 25 bytes = 2,500 bytes
Total: 50,000 + 2,500 = 52,500 bytes
2. Ship the projected Employee table (Fname, Lname, Dno) from S1 to S2. Execute the query at S2, then ship the results to S3.

S1 → S2: projected Employee table, 1000 records × 50 bytes = 50,000 bytes
S2 → S3: result set, 1000 records × 55 bytes = 55,000 bytes
Total: 50,000 + 55,000 = 105,000 bytes
3. Ship the projected Department table (Dno, Dname) from S2 to S1. Execute the query at S1, then ship the results to S3.

S2 → S1: projected Department table, 100 records × 25 bytes = 2,500 bytes
S1 → S3: result set, 1000 records × 55 bytes = 55,000 bytes
Total: 2,500 + 55,000 = 57,500 bytes
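The same comparison can be repeated when only the needed columns are shipped. The field widths and record counts are taken from the example; the helper `projected_bytes` and the dictionary names are hypothetical, and local processing cost is again ignored.

```python
# Field widths in bytes, from the example schemas.
EMPLOYEE = {"SSN": 10, "Fname": 20, "Lname": 20, "Dno": 10}
DEPARTMENT = {"Dno": 10, "Dname": 15, "MSSN": 10}


def projected_bytes(widths, columns, records):
    """Size of a table after projecting it onto the given columns."""
    return records * sum(widths[c] for c in columns)


emp = projected_bytes(EMPLOYEE, ["Fname", "Lname", "Dno"], 1000)
dept = projected_bytes(DEPARTMENT, ["Dno", "Dname"], 100)
result = 1000 * (EMPLOYEE["Fname"] + EMPLOYEE["Lname"] + DEPARTMENT["Dname"])

# Bytes transferred over the network for each alternative.
projected_costs = {
    "1: ship both projections to S3": emp + dept,
    "2: ship projected Employee to S2, result to S3": emp + result,
    "3: ship projected Department to S1, result to S3": dept + result,
}

for name, cost in projected_costs.items():
    print(f"{name}: {cost:,} bytes")

winner = min(projected_costs, key=projected_costs.get)
print("cheapest:", winner)
```

Projection shrinks both inputs below the size of the 55,000-byte result set, so shipping the two projections straight to the query site becomes the cheapest option.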
And the winner is

Shipping only the needed columns of both tables to S3 and performing the join there:

S1 → S3: Employee (Fname, Lname, Dno), 1000 records × 50 bytes = 50,000 bytes
S2 → S3: Department (Dno, Dname), 100 records × 25 bytes = 2,500 bytes
Total: 50,000 + 2,500 = 52,500 bytes
