CIS 1902435 DDBMS
Query Processing
Chapter 7
Dr. Bassam Hammo
Distributed Query Processing
For centralized DBM systems, the primary
performance concern is to minimize the number
of disk accesses.
In a distributed system, we also need to
minimize the communication cost among sites.
Distributed query processing involves:
accessing the data stored locally, and
accessing the data at other sites (our focus)
Querying Horizontally Fragmented Relations
SELECT Balance FROM Branch WHERE Branch-Name=“Amman”
Only Site 1 needs to be contacted, because every tuple with Branch-Name = “Amman” is stored there.
Site 1: Branch1 = σ Branch-Name = “Amman” (Branch)
Branch-Name   Account-Number   Balance
Amman         A-100            500
Amman         A-220            336
Amman         A-155            75
Site 2: Branch2 = σ Branch-Name = “Zarqa” (Branch)
Branch-Name   Account-Number   Balance
Zarqa         A-175            210
Zarqa         A-420            1000
Zarqa         A-185            1127
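The routing decision for the horizontally fragmented Branch relation can be sketched in a few lines of Python. This is only an illustration: the catalog layout and the names FRAGMENTS, sites_for and select_balance are assumptions made for this sketch, not part of any particular DDBMS.

```python
# Hypothetical catalog: each horizontal fragment is defined by a selection
# on Branch-Name and is stored at exactly one site.
FRAGMENTS = {
    "Site 1": {
        "branch": "Amman",
        "rows": [("Amman", "A-100", 500), ("Amman", "A-220", 336), ("Amman", "A-155", 75)],
    },
    "Site 2": {
        "branch": "Zarqa",
        "rows": [("Zarqa", "A-175", 210), ("Zarqa", "A-420", 1000), ("Zarqa", "A-185", 1127)],
    },
}


def sites_for(branch_name):
    """Return the sites whose fragment predicate can match the query predicate."""
    return [site for site, frag in FRAGMENTS.items() if frag["branch"] == branch_name]


def select_balance(branch_name):
    """Evaluate SELECT Balance FROM Branch WHERE Branch-Name = branch_name."""
    balances = []
    for site in sites_for(branch_name):  # only the relevant sites are contacted
        for name, _account, balance in FRAGMENTS[site]["rows"]:
            if name == branch_name:
                balances.append(balance)
    return balances


print(sites_for("Amman"))       # ['Site 1']
print(select_balance("Amman"))  # [500, 336, 75]
```

Because the fragment predicates are compared with the query predicate before any data is touched, Site 2 is never contacted for the Amman query.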
Querying Vertically Fragmented Relations
Assume the complete Account schema is
(Account-ID, Customer-Name, Balance)
Query: SELECT Balance FROM Account
Contact only Site2.
Site 1                        Site 2
Account-ID   Customer-Name    Account-ID   Balance
1            Ahmad            1            10000
2            Mohammad         2            50000
3            Bassam           3            15000
…            …                …            …
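For vertical fragments, the sites to contact depend on which columns the query needs. The sketch below is a simplified illustration; the names VERTICAL_FRAGMENTS and sites_needed are hypothetical, and the two-pass selection is an assumption of this sketch rather than a standard algorithm.

```python
# Hypothetical catalog: the columns of Account held by each site.
# Account-ID is repeated in both fragments so that they can be rejoined.
VERTICAL_FRAGMENTS = {
    "Site 1": {"Account-ID", "Customer-Name"},
    "Site 2": {"Account-ID", "Balance"},
}


def sites_needed(columns):
    """Return the sites that must be contacted to obtain the given columns."""
    needed = set(columns)
    # First pass: a single site that holds every requested column is enough.
    for site, cols in VERTICAL_FRAGMENTS.items():
        if needed <= cols:
            return [site]
    # Second pass: collect sites until all requested columns are covered.
    chosen, remaining = [], set(needed)
    for site, cols in VERTICAL_FRAGMENTS.items():
        if remaining & cols:
            chosen.append(site)
            remaining -= cols
    if remaining:
        raise ValueError(f"columns not stored anywhere: {sorted(remaining)}")
    return chosen


print(sites_needed(["Balance"]))                   # ['Site 2']
print(sites_needed(["Customer-Name", "Balance"]))  # ['Site 1', 'Site 2']
```

A query that needs columns from both fragments has to contact both sites and rejoin the pieces on Account-ID.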
Join Processing
Assume we have 3 relations R1, R2, R3, stored
in sites S1, S2, S3, respectively.
Query: SELECT * FROM R1, R2, R3 WHERE <join conditions>
Assume the query is issued at site S1
The system needs to display the results at site
S1
Simple Strategies
Query: SELECT * FROM R1, R2, R3 WHERE <join conditions>
Strategy 1: Ship copies of all three relations to S1
[Figure: R1 is stored at S1; copies of R2 and R3 are shipped from S2 and S3 to S1]
Cost = network (size of R2 + size of R3) + cost of join
Strategy 2 : Ship R3 to S2
Compute temp1 = R2 ⋈ R3 at S2
Ship temp1 from S2 to S1
Compute temp2 = temp1 ⋈ R1 at S1
[Figure: relations R1, R2, R3 at sites S1, S2, S3, with temp1 built at S2 and temp2 built at S1]
Cost = network (size of R3 + size of temp1) + cost of 2 joins
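The network terms of Strategy 1 and Strategy 2 can be compared directly. The sketch below ignores the local join cost, and all sizes are made-up numbers chosen only to show both outcomes; the function names are hypothetical.

```python
def strategy1_network(size_r2, size_r3):
    """Strategy 1: ship R2 and R3 to S1."""
    return size_r2 + size_r3


def strategy2_network(size_r3, size_temp1):
    """Strategy 2: ship R3 to S2, then ship temp1 (R2 joined with R3) to S1."""
    return size_r3 + size_temp1


# Hypothetical sizes (in any consistent unit, e.g. bytes).
size_r2, size_r3 = 1000, 400

# Selective join: temp1 is much smaller than R2, so Strategy 2 wins.
print(strategy1_network(size_r2, size_r3))  # 1400
print(strategy2_network(size_r3, 50))       # 450

# Non-selective join: temp1 is larger than R2, so Strategy 1 wins.
print(strategy2_network(size_r3, 5000))     # 5400
```

Since R3 is shipped in both strategies, the comparison reduces to whether temp1 is smaller than R2.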
Strategy 3 : Semi Join
Query: SELECT * FROM R1, R2 WHERE R1.id = R2.id
- Issued @ S1
- Ship R2 to S1 and perform the join at S1
Cost = network (12 integers) + join
[Figure: R1 at S1 and R2 at S2]
Strategy 4 : Semi Join (Continued)
Cost = network (3 + 2 = 5 integers) + cost of 2 joins
[Figure: R1 at S1 and R2 at S2, with the intermediate results temp1 and temp2 exchanged between the two sites]
Semi Join
In general, to join R1 (at site S1) and R2 (at
site S2):
– Get the common columns of R1 and R2
– Project R1 to temp1 on these columns
– Ship temp1 to S2
– Join temp1 and R2 (let the result be temp2)
– Ship temp2 to S1
– Join R1 with temp2
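The six steps above can be sketched in Python with both relations held as lists of dictionaries. This is a minimal single-process illustration: the function name semi_join, the sample rows, and the way shipped values are counted are all assumptions of this sketch, and the data is not the data from the slide figures.

```python
def semi_join(r1, r2, key):
    """Join r1 (at S1) with r2 (at S2) on `key` using a semi-join.

    Returns the join result and the number of values shipped over the network.
    """
    # Steps 1-2: project R1 on the common column (duplicates removed).
    temp1 = {row[key] for row in r1}
    # Step 3: ship temp1 to S2 (costs len(temp1) values).
    # Step 4: at S2, join temp1 with R2, keeping only the matching R2 rows.
    temp2 = [row for row in r2 if row[key] in temp1]
    # Step 5: ship temp2 to S1 (costs one value per field of each row).
    shipped = len(temp1) + sum(len(row) for row in temp2)
    # Step 6: at S1, join R1 with temp2.
    result = [{**a, **b} for a in r1 for b in temp2 if a[key] == b[key]]
    return result, shipped


# Hypothetical data.
r1 = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}, {"id": 3, "a": "z"}]
r2 = [{"id": 3, "b": "p"}, {"id": 7, "b": "q"}, {"id": 8, "b": "r"}, {"id": 9, "b": "s"}]

result, shipped = semi_join(r1, r2, "id")
print(result)   # [{'id': 3, 'a': 'z', 'b': 'p'}]
print(shipped)  # 5
```

Shipping all of r2 would cost 8 values in this example, so the semi-join pays off here because few R2 rows match. When most R2 rows match, the extra round trip can cost more than shipping R2 directly.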
Example
Employee table (stored at S1): 1000 records, 60 bytes per record
Field   SSN   Fname   Lname   Dno
Bytes   10    20      20      10
Department table (stored at S2): 100 records, 35 bytes per record
Field   Dno   Dname   MSSN
Bytes   10    15      10
Query (issued at S3, the query site):
SELECT Fname, Lname, Dname
FROM Department, Employee
WHERE Department.Dno = Employee.Dno
Example
Find the “minimum cost” for answering this query
This query will return 1000 records (every
employee works for one department)
Each record will be 55 bytes long (FNAME +
LNAME + DNAME = 20 + 20 + 15 = 55)
Thus the result set will be 1000 * 55 = 55,000
bytes.
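The result-size arithmetic can be checked with a few lines of Python. The variable names are chosen for this sketch only; the field widths and the record count come from the example.

```python
# Widths (in bytes) of the three columns in the SELECT list.
FIELD_BYTES = {"Fname": 20, "Lname": 20, "Dname": 15}

result_records = 1000                     # one result row per employee
record_bytes = sum(FIELD_BYTES.values())  # width of one result row
result_bytes = result_records * record_bytes

print(record_bytes)  # 55
print(result_bytes)  # 55000
```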
Example
Three alternatives:
1. Copy all EMPLOYEE and
DEPARTMENT records to S3. Perform
the join and display the results.
Transfer: Employee table (1000 records × 60 bytes = 60,000 bytes) from S1 to S3
Transfer: Department table (100 records × 35 bytes = 3,500 bytes) from S2 to S3
Total = 60,000 + 3,500 = 63,500 bytes
Example
2. Copy all EMPLOYEE records from S1 to
S2. Execute the query at S2 then ship the
results to S3.
Transfer: Employee table (1000 records × 60 bytes = 60,000 bytes) from S1 to S2
Transfer: result set (1000 records × 55 bytes = 55,000 bytes) from S2 to S3
Total = 60,000 + 55,000 = 115,000 bytes
Example
3. Copy all DEPARTMENT records from S2
to S1. Execute the query at S1 then ship
the results to S3.
Transfer: Department table (100 records × 35 bytes = 3,500 bytes) from S2 to S1
Transfer: result set (1000 records × 55 bytes = 55,000 bytes) from S1 to S3
Total = 3,500 + 55,000 = 58,500 bytes
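The transfer costs of the three alternatives can be recomputed and compared as follows. This is a small sketch that counts only bytes shipped between sites; the dictionary keys and variable names are labels invented for the sketch.

```python
# Sizes taken from the example: records × bytes per record.
EMPLOYEE_BYTES = 1000 * 60    # 60,000
DEPARTMENT_BYTES = 100 * 35   # 3,500
RESULT_BYTES = 1000 * 55      # 55,000

alternatives = {
    "1: ship both tables to S3": EMPLOYEE_BYTES + DEPARTMENT_BYTES,
    "2: ship Employee to S2, result to S3": EMPLOYEE_BYTES + RESULT_BYTES,
    "3: ship Department to S1, result to S3": DEPARTMENT_BYTES + RESULT_BYTES,
}

for name, cost in alternatives.items():
    print(f"{name}: {cost} bytes")

# The alternative with the smallest number of transferred bytes.
cheapest = min(alternatives, key=alternatives.get)
print(cheapest)  # 3: ship Department to S1, result to S3
```

Without any projection, alternative 3 is cheapest because the small Department table is the only base table that moves.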
Semi-Join
Now let’s do the same example
with semi-join
Alternative 1 with projection: ship only the needed columns of both tables to S3
Transfer: Employee projected on (Fname, Lname, Dno): 1000 records × 50 bytes = 50,000 bytes, from S1 to S3
Transfer: Department projected on (Dno, Dname): 100 records × 25 bytes = 2,500 bytes, from S2 to S3
Total = 50,000 + 2,500 = 52,500 bytes
Alternative 2 with projection: ship the needed Employee columns to S2 and execute the query at S2
Transfer: Employee projected on (Fname, Lname, Dno): 1000 records × 50 bytes = 50,000 bytes, from S1 to S2
Transfer: result set (1000 records × 55 bytes = 55,000 bytes) from S2 to S3
Total = 50,000 + 55,000 = 105,000 bytes
Alternative 3 with projection: ship the needed Department columns to S1 and execute the query at S1
Transfer: Department projected on (Dno, Dname): 100 records × 25 bytes = 2,500 bytes, from S2 to S1
Transfer: result set (1000 records × 55 bytes = 55,000 bytes) from S1 to S3
Total = 2,500 + 55,000 = 57,500 bytes
And the winner is
Ship only the needed columns of both tables to S3 and perform the join there:
Employee (Fname, Lname, Dno): 50,000 bytes from S1 to S3
Department (Dno, Dname): 2,500 bytes from S2 to S3
Total = 52,500 bytes, the lowest transfer cost of all the options considered
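The effect of projecting each table onto the needed columns before shipping can be recomputed from the field widths. The helper projected_bytes and the dictionary labels are hypothetical names used only in this sketch.

```python
# Field widths in bytes, from the example schemas.
EMP_WIDTH = {"SSN": 10, "Fname": 20, "Lname": 20, "Dno": 10}
DEPT_WIDTH = {"Dno": 10, "Dname": 15, "MSSN": 10}


def projected_bytes(widths, columns, records):
    """Bytes needed to ship `records` rows restricted to `columns`."""
    return records * sum(widths[c] for c in columns)


emp = projected_bytes(EMP_WIDTH, ["Fname", "Lname", "Dno"], 1000)  # 50,000
dept = projected_bytes(DEPT_WIDTH, ["Dno", "Dname"], 100)          # 2,500
result = 1000 * 55                                                 # 55,000

costs = {
    "ship both projections to S3": emp + dept,
    "ship Employee projection to S2, result to S3": emp + result,
    "ship Department projection to S1, result to S3": dept + result,
}

for name, cost in costs.items():
    print(f"{name}: {cost} bytes")

print(min(costs, key=costs.get))  # ship both projections to S3
```

Once the unused columns are dropped, shipping both projected tables (52,500 bytes) costs less than shipping the 55,000-byte result set, which is why joining at the query site wins.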