Chapter 13

sorting

Uploaded by

Vijaya Goel
Copyright
© Attribution Non-Commercial (BY-NC)
Silberschatz, Korth and Sudarshan, Database System Concepts - 5th Edition, Aug 27, 2005.
Sorting
We may build an index on the relation and then use the index to read the relation in sorted order. This may lead to one disk block access for each tuple.
For relations that fit in memory, techniques like quicksort can be used.
For relations that don't fit in memory, external sort-merge is a good choice.
External Sort-Merge
Let M denote memory size (in pages).
1. Create sorted runs. Let i be 0 initially.
   Repeatedly do the following till the end of the relation:
   (a) Read M blocks of the relation into memory
   (b) Sort the in-memory blocks
   (c) Write sorted data to run $R_i$; increment i.
   Let the final value of i be N.
2. Merge the runs (next slide).
External Sort-Merge (Cont.)
2. Merge the runs (N-way merge). We assume (for now) that N < M.
   1. Use N blocks of memory to buffer input runs, and 1 block to buffer output. Read the first block of each run into its buffer page.
   2. repeat
      1. Select the first record (in sort order) among all buffer pages.
      2. Write the record to the output buffer. If the output buffer is full, write it to disk.
      3. Delete the record from its input buffer page. If the buffer page becomes empty, then read the next block (if any) of the run into the buffer.
   3. until all input buffer pages are empty
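The two steps can be sketched in Python as follows. This is only an illustration under simplifying assumptions: records are plain integers held in memory lists rather than disk blocks, `run_size` counts records instead of M blocks, and the function names `create_runs` and `merge_runs` are hypothetical. A heap is used to select the first record among the run heads.

```python
import heapq
from typing import List


def create_runs(records: List[int], run_size: int) -> List[List[int]]:
    """Step 1: cut the input into chunks of run_size records and sort each chunk."""
    return [sorted(records[i:i + run_size])
            for i in range(0, len(records), run_size)]


def merge_runs(runs: List[List[int]]) -> List[int]:
    """Step 2: N-way merge; the heap holds the current head record of each run."""
    iters = [iter(run) for run in runs]
    heap = []
    for idx, it in enumerate(iters):
        first = next(it, None)
        if first is not None:
            heap.append((first, idx))
    heapq.heapify(heap)

    output = []
    while heap:
        # Select the first record (in sort order) among all run heads.
        record, idx = heapq.heappop(heap)
        output.append(record)
        # Refill from the run that just gave up its head, if it has more.
        nxt = next(iters[idx], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt, idx))
    return output
```

In a real system each iterator would read one block at a time from its run file, and `output` would be a one-block buffer flushed to disk when full.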
External Sort-Merge (Cont.)
If $N \geq M$, several merge passes are required.
In each pass, contiguous groups of M - 1 runs are merged.
A pass reduces the number of runs by a factor of M - 1, and creates runs longer by the same factor.
E.g., if M = 11 and there are 90 runs, one pass reduces the number of runs to 9, each 10 times the size of the initial runs.
Repeated passes are performed till all runs have been merged into one.
Example: External Sorting Using Sort-Merge
External Merge Sort (Cont.)
Cost analysis (where $b_r$ is the number of blocks of relation r):
Total number of merge passes required: $\lceil \log_{M-1}(b_r / M) \rceil$.
Block transfers for initial run creation as well as in each pass is $2 b_r$
   for the final pass, we don't count the write cost
   we ignore the final write cost for all operations, since the output of an operation may be sent to the parent operation without being written to disk
Thus the total number of block transfers for external sorting:
$b_r (2 \lceil \log_{M-1}(b_r / M) \rceil + 1)$
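As a quick check of the formula, the sketch below counts merge passes by repeatedly dividing the number of runs by M - 1, which gives the same value as the ceiling of the logarithm without floating-point rounding. The function names are hypothetical, and the relation size of 990 blocks is an assumed figure chosen so that M = 11 yields 90 initial runs.

```python
def merge_passes(b_r: int, m: int) -> int:
    """Number of merge passes: ceil(log_{M-1}(b_r / M)), computed with integers."""
    runs = -(-b_r // m)            # ceil(b_r / M) initial runs
    passes = 0
    while runs > 1:
        runs = -(-runs // (m - 1))  # each pass merges groups of M - 1 runs
        passes += 1
    return passes


def sort_block_transfers(b_r: int, m: int) -> int:
    """Total block transfers: b_r * (2 * passes + 1), final write not counted."""
    return b_r * (2 * merge_passes(b_r, m) + 1)


# Assumed example: 990 blocks, M = 11 -> 90 runs -> 9 runs -> 1 run.
print(merge_passes(990, 11))          # 2
print(sort_block_transfers(990, 11))  # 4950
```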
Join Operation
Several different algorithms to implement joins
Nested-loop join
Block nested-loop join
Indexed nested-loop join
Merge-join
Hash-join
Choice based on cost estimate
Examples use the following information
   Number of records: customer 10,000; depositor 5,000
   Number of blocks: customer 400; depositor 100
Nested-Loop Join
To compute the theta join $r \bowtie_\theta s$
for each tuple t_r in r do begin
    for each tuple t_s in s do begin
        test pair (t_r, t_s) to see if they satisfy the join condition θ
        if they do, add t_r · t_s to the result.
    end
end
r is called the outer relation and s the inner relation of the join.
Requires no indices and can be used with any kind of join condition.
Expensive since it examines every pair of tuples in the two relations.
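A minimal Python rendering of the pseudocode, assuming tuples are Python tuples, concatenation stands for t_r · t_s, and the join condition is passed in as a predicate. The function name and the sample rows are made up for illustration.

```python
from typing import Callable, List, Tuple


def nested_loop_join(r: List[Tuple], s: List[Tuple],
                     theta: Callable[[Tuple, Tuple], bool]) -> List[Tuple]:
    """Examine every pair (t_r, t_s) and keep those satisfying theta."""
    result = []
    for t_r in r:                 # r is the outer relation
        for t_s in s:             # s is the inner relation
            if theta(t_r, t_s):
                result.append(t_r + t_s)
    return result


# Hypothetical sample data: (customer_name, account_number) and (customer_name, street).
depositor = [("Hayes", "A-102"), ("Jones", "A-217")]
customer = [("Hayes", "Main"), ("Smith", "North")]
joined = nested_loop_join(depositor, customer, lambda d, c: d[0] == c[0])
```

Because the predicate is arbitrary, the same function handles non-equality conditions such as `lambda a, b: a[1] < b[1]`.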
Nested-Loop Join (Cont.)
In the worst case, if there is enough memory only to hold one block of each relation, the estimated cost is
   $n_r * b_s + b_r$ block transfers
   where $n_r$ is the number of tuples in r and $b_s$ is the number of blocks of s
If the smaller relation fits entirely in memory, use that as the inner relation.
   Reduces cost to $b_r + b_s$ block transfers
Assuming worst case memory availability, the cost estimate is
   with depositor as the outer relation:
      5000 * 400 + 100 = 2,000,100 block transfers,
   with customer as the outer relation:
      10000 * 100 + 400 = 1,000,400 block transfers
If the smaller relation (depositor) fits entirely in memory, the cost estimate will be 500 block transfers.
The block nested-loops algorithm (next slide) is preferable.
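The arithmetic above can be reproduced with two small helper functions. The function and parameter names are hypothetical; the figures are the customer and depositor statistics given earlier.

```python
def nested_loop_worst(n_outer: int, b_outer: int, b_inner: int) -> int:
    """Worst case: the inner relation is scanned once per outer tuple."""
    return n_outer * b_inner + b_outer


def nested_loop_best(b_r: int, b_s: int) -> int:
    """Best case: the inner relation stays in memory, so each block is read once."""
    return b_r + b_s


# depositor: 5000 tuples, 100 blocks; customer: 10000 tuples, 400 blocks
print(nested_loop_worst(5000, 100, 400))   # 2000100 (depositor as outer)
print(nested_loop_worst(10000, 400, 100))  # 1000400 (customer as outer)
print(nested_loop_best(100, 400))          # 500
```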
Block Nested-Loop Join
Variant of nested-loop join in which every block of inner relation is
paired with every block of outer relation.
for each block B_r of r do begin
    for each block B_s of s do begin
        for each tuple t_r in B_r do begin
            for each tuple t_s in B_s do begin
                check if (t_r, t_s) satisfy the join condition
                if they do, add t_r · t_s to the result.
            end
        end
    end
end
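The same loop structure in Python, assuming each relation is given as a list of blocks and each block is a list of tuples. The `block_reads` counter is an addition for illustration only: it counts one read per outer block plus one read of every inner block per outer block.

```python
from typing import Callable, List, Tuple

Block = List[Tuple]


def block_nested_loop_join(r_blocks: List[Block], s_blocks: List[Block],
                           theta: Callable[[Tuple, Tuple], bool]):
    """Pair every block of r with every block of s; return (result, block_reads)."""
    result = []
    block_reads = 0
    for block_r in r_blocks:
        block_reads += 1              # read one outer block
        for block_s in s_blocks:
            block_reads += 1          # re-read each inner block per outer block
            for t_r in block_r:
                for t_s in block_s:
                    if theta(t_r, t_s):
                        result.append(t_r + t_s)
    return result, block_reads
```

With 2 outer blocks and 3 inner blocks the counter reaches 2 * 3 + 2 = 8, one read for each block access made by the loops.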
Block Nested-Loop Join (Cont.)
Worst case estimate: $b_r * b_s + b_r$ block transfers
   Each block in the inner relation s is read once for each block in the outer relation (instead of once for each tuple in the outer relation)
Best case: $b_r + b_s$ block transfers.
Improvements to nested loop and block nested loop algorithms:
   In block nested-loop, use M - 2 disk blocks as the blocking unit for the outer relation, where M = memory size in blocks; use the remaining two blocks to buffer the inner relation and the output
      Cost = $\lceil b_r / (M-2) \rceil * b_s + b_r$ block transfers
   If the equi-join attribute forms a key on the inner relation, stop the inner loop on the first match
   Scan the inner loop forward and backward alternately, to make use of the blocks remaining in the buffer (with LRU replacement)
   Use an index on the inner relation if available (next slide)
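To see how the blocking-unit improvement changes the estimate, the formula can be evaluated for a few memory sizes. The function name is hypothetical and the memory sizes are assumed values; the block counts are those of depositor (outer, 100 blocks) and customer (inner, 400 blocks).

```python
def block_nested_loop_cost(b_r: int, b_s: int, m: int) -> int:
    """ceil(b_r / (M - 2)) * b_s + b_r block transfers; requires M >= 3."""
    if m < 3:
        raise ValueError("need at least 3 memory blocks")
    outer_chunks = -(-b_r // (m - 2))   # integer ceiling division
    return outer_chunks * b_s + b_r


print(block_nested_loop_cost(100, 400, 3))    # 40100: one outer block at a time
print(block_nested_loop_cost(100, 400, 52))   # 900: outer read in 2 chunks of 50
print(block_nested_loop_cost(100, 400, 102))  # 500: whole outer relation fits
```

With M = 3 the formula reduces to the worst case $b_r * b_s + b_r$, and once M - 2 is at least $b_r$ it reduces to the best case $b_r + b_s$.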
Indexed Nested-Loop Join
Index lookups can replace file scans if
   the join is an equi-join or natural join and
   an index is available on the inner relation's join attribute
      Can construct an index just to compute a join.
For each tuple $t_r$ in the outer relation r, use the index to look up tuples in s that satisfy the join condition with tuple $t_r$. (Equivalent to a selection on s.)
Worst case: buffer has space for only one page of r and one page of the index. For each tuple in r, we perform an index lookup on s.
Cost of the join: $b_r + n_r * c$
   where c is the cost of traversing the index and fetching all matching s tuples for one tuple of r
   c can be estimated as the cost of a single selection on s using the join condition.
If indices are available on the join attributes of both r and s, use the relation with fewer tuples as the outer relation.
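The lookup-per-outer-tuple idea can be sketched as below. As a simplification, a Python dictionary stands in for the index on the inner relation (a real system would typically use a B+-tree or hash index on disk), and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Tuple


def build_index(s: List[Tuple], s_key: Callable[[Tuple], Hashable]) -> Dict:
    """Construct an index on the inner relation's join attribute."""
    index = defaultdict(list)
    for t_s in s:
        index[s_key(t_s)].append(t_s)
    return index


def indexed_nested_loop_join(r: List[Tuple], index: Dict,
                             r_key: Callable[[Tuple], Hashable]) -> List[Tuple]:
    """For each outer tuple, one index lookup replaces a scan of s."""
    result = []
    for t_r in r:
        for t_s in index.get(r_key(t_r), []):
            result.append(t_r + t_s)
    return result
```

Calling `build_index` just before the join corresponds to constructing an index only to compute that join.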
Example of Nested-Loop Join Costs
Compute $depositor \bowtie customer$, with depositor as the outer relation.
Let customer have a primary B+-tree index on the join attribute customer-name, which contains 20 entries in each index node.
Since customer has 10,000 tuples, the height of the tree is 4, and one more access is needed to find the actual data
depositor has 5000 tuples
Cost of block nested loops join
   400 * 100 + 100 = 40,100 block transfers
   assuming worst case memory
   may be significantly less with more memory
Cost of indexed nested loops join
   100 + 5000 * 5 = 25,100 block transfers.
   CPU cost likely to be less than that for block nested loops join
Merge-Join
1. Sort both relations on their join attribute (if not already sorted on the join attributes).
2. Merge the sorted relations to join them
   1. Join step is similar to the merge stage of the sort-merge algorithm.
   2. Main difference is handling of duplicate values in the join attribute: every pair with the same value on the join attribute must be matched
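A compact sketch of the two steps, assuming both relations fit in memory as lists of tuples and the join is an equi-join on keys extracted by `r_key` and `s_key` (hypothetical parameter names). The inner block shows the duplicate handling: every r tuple with a given key is paired with the whole group of s tuples sharing that key.

```python
from typing import Callable, List, Tuple


def merge_join(r: List[Tuple], s: List[Tuple],
               r_key: Callable, s_key: Callable) -> List[Tuple]:
    # Step 1: sort both inputs on the join attribute.
    r = sorted(r, key=r_key)
    s = sorted(s, key=s_key)
    result = []
    i = j = 0
    # Step 2: advance two cursors, as in the merge stage of sort-merge.
    while i < len(r) and j < len(s):
        k_r, k_s = r_key(r[i]), s_key(s[j])
        if k_r < k_s:
            i += 1
        elif k_r > k_s:
            j += 1
        else:
            # Find the group of s tuples sharing this key value.
            j_end = j
            while j_end < len(s) and s_key(s[j_end]) == k_r:
                j_end += 1
            # Match every r tuple with this key against the whole group.
            while i < len(r) and r_key(r[i]) == k_r:
                for t_s in s[j:j_end]:
                    result.append(r[i] + t_s)
                i += 1
            j = j_end
    return result
```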
Merge-Join (Cont.)
Can be used only for equi-joins and natural joins
Each block needs to be read only once (assuming all tuples for any given value of the join attributes fit in memory)
Thus the cost of merge join is:
   $b_r + b_s$ block transfers + the cost of sorting if relations are unsorted.
Hybrid merge-join: If one relation is sorted, and the other has a secondary B+-tree index on the join attribute
   Merge the sorted relation with the leaf entries of the B+-tree.
   Sort the result on the addresses of the unsorted relation's tuples
   Scan the unsorted relation in physical address order and merge with the previous result, to replace addresses by the actual tuples
      Sequential scan is more efficient than random lookup
Hash-Join
Applicable for equi-joins and natural joins.
A hash function h is used to partition tuples of both relations
h maps JoinAttrs values to {0, 1, ..., n}, where JoinAttrs denotes the common attributes of r and s used in the natural join.
   $r_0, r_1, \ldots, r_n$ denote partitions of r tuples
      Each tuple $t_r \in r$ is put in partition $r_i$, where $i = h(t_r[JoinAttrs])$.
   $s_0, s_1, \ldots, s_n$ denote partitions of s tuples
      Each tuple $t_s \in s$ is put in partition $s_i$, where $i = h(t_s[JoinAttrs])$.
Note: In the book, $r_i$ is denoted as $H_{r_i}$, $s_i$ is denoted as $H_{s_i}$, and n is denoted as $n_h$.
Hash-Join (Cont.)
r tuples in $r_i$ need only be compared with s tuples in $s_i$. They need not be compared with s tuples in any other partition, since:
   an r tuple and an s tuple that satisfy the join condition will have the same value for the join attributes.
   If that value is hashed to some value i, the r tuple has to be in $r_i$ and the s tuple in $s_i$.
Hash-Join Algorithm
The hash-join of r and s is computed as follows.
1. Partition the relation s using hashing function h.
2. Partition r similarly.
3. For each i:
   (a) Load $s_i$ into memory and build an in-memory hash index on it using the join attribute. This hash index uses a different hash function than the earlier one h.
   (b) Read the tuples in $r_i$ from the disk one by one. For each tuple $t_r$ locate each matching tuple $t_s$ in $s_i$ using the in-memory hash index. Output the concatenation of their attributes.
Relation s is called the build input and r is called the probe input.
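The three steps might look like this in Python. Several simplifications are assumed: partitions are in-memory lists rather than disk files, h is taken to be Python's built-in `hash` modulo n + 1, and a dictionary stands in for the in-memory hash index (so, unlike the slide, no separately chosen second hash function is used). The function names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, List, Tuple


def partition(rel: List[Tuple], key: Callable, n: int) -> List[List[Tuple]]:
    """Split rel into partitions 0..n using h(v) = hash(v) mod (n + 1)."""
    parts = [[] for _ in range(n + 1)]
    for t in rel:
        parts[hash(key(t)) % (n + 1)].append(t)
    return parts


def hash_join(r: List[Tuple], s: List[Tuple],
              r_key: Callable, s_key: Callable, n: int = 3) -> List[Tuple]:
    s_parts = partition(s, s_key, n)   # 1. partition the build input s
    r_parts = partition(r, r_key, n)   # 2. partition the probe input r
    result = []
    for r_i, s_i in zip(r_parts, s_parts):   # 3. join matching partitions only
        index = defaultdict(list)            # (a) build index on s_i
        for t_s in s_i:
            index[s_key(t_s)].append(t_s)
        for t_r in r_i:                      # (b) probe with each tuple of r_i
            for t_s in index.get(r_key(t_r), []):
                result.append(t_r + t_s)
    return result
```

The order of the output depends on which partition each key lands in, so callers should not rely on it.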
Complex Joins
Join with a conjunctive condition:
   $r \bowtie_{\theta_1 \wedge \theta_2 \wedge \ldots \wedge \theta_n} s$
   Either use nested loops/block nested loops, or
   Compute the result of one of the simpler joins $r \bowtie_{\theta_i} s$
      final result comprises those tuples in the intermediate result that satisfy the remaining conditions
      $\theta_1 \wedge \ldots \wedge \theta_{i-1} \wedge \theta_{i+1} \wedge \ldots \wedge \theta_n$
Join with a disjunctive condition
   $r \bowtie_{\theta_1 \vee \theta_2 \vee \ldots \vee \theta_n} s$
   Either use nested loops/block nested loops, or
   Compute as the union of the records in individual joins $r \bowtie_{\theta_i} s$:
      $(r \bowtie_{\theta_1} s) \cup (r \bowtie_{\theta_2} s) \cup \ldots \cup (r \bowtie_{\theta_n} s)$
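Both strategies can be illustrated with a generic pair-producing join. In this sketch the simple join is a plain nested loop, where a real system would pick whichever of the earlier algorithms suits $\theta_i$; the conjunctive version always uses the first condition as $\theta_i$, and all function names are hypothetical.

```python
from typing import Callable, List, Tuple

Pred = Callable[[Tuple, Tuple], bool]


def join_pairs(r: List[Tuple], s: List[Tuple], theta: Pred):
    """Simple join on a single condition, returned as (t_r, t_s) pairs."""
    return [(t_r, t_s) for t_r in r for t_s in s if theta(t_r, t_s)]


def conjunctive_join(r, s, thetas: List[Pred]) -> List[Tuple]:
    """Join on one condition, then filter by the remaining conditions."""
    first, rest = thetas[0], thetas[1:]
    intermediate = join_pairs(r, s, first)
    return [t_r + t_s for t_r, t_s in intermediate
            if all(theta(t_r, t_s) for theta in rest)]


def disjunctive_join(r, s, thetas: List[Pred]) -> List[Tuple]:
    """Union of the individual joins; a set removes repeated results."""
    seen = set()
    result = []
    for theta in thetas:
        for t_r, t_s in join_pairs(r, s, theta):
            t = t_r + t_s
            if t not in seen:
                seen.add(t)
                result.append(t)
    return result
```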

Evaluation of Expressions
So far: we have seen algorithms for individual operations
Alternatives for evaluating an entire expression tree
   Materialization: generate results of an expression whose inputs are relations or are already computed, materialize (store) it on disk. Repeat.
   Pipelining: pass on tuples to parent operations even as an operation is being executed
We study the above alternatives in more detail
Materialization
Materialized evaluation: evaluate one operation at a time, starting at the lowest level. Use intermediate results materialized into temporary relations to evaluate next-level operations.
E.g., in the expression tree (figure omitted here), compute and store
   $\sigma_{balance<2500}(account)$
then compute and store its join with customer, and finally compute the projection on customer-name.
Materialization (Cont.)
Materialized evaluation is always applicable
Cost of writing results to disk and reading them back can be quite high
   Our cost formulas for operations ignore the cost of writing results to disk, so
      Overall cost = Sum of costs of individual operations + cost of writing intermediate results to disk
Double buffering: use two output buffers for each operation; when one is full, write it to disk while the other is getting filled
   Allows overlap of disk writes with computation and reduces execution time
Pipelining
Pipelined evaluation: evaluate several operations simultaneously, passing the results of one operation on to the next.
E.g., in the previous expression tree, don't store the result of
   $\sigma_{balance<2500}(account)$
instead, pass tuples directly to the join. Similarly, don't store the result of the join; pass tuples directly to the projection.
Much cheaper than materialization: no need to store a temporary relation to disk.
Pipelining may not always be possible, e.g., sort, hash-join.
For pipelining to be effective, use evaluation algorithms that generate output tuples even as tuples are received for inputs to the operation.
Pipelines can be executed in two ways: demand driven and producer driven
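Demand-driven pipelining maps naturally onto Python generators: each operator pulls the next tuple from its child only when its own parent asks for one, so no intermediate relation is stored. The sketch below is illustrative only; the operator names, the sample rows, and the `account_number` field on customer rows (used here as the join attribute) are assumptions, not part of the slides.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def select(pred: Callable[[Tuple], bool], child: Iterable[Tuple]) -> Iterator[Tuple]:
    for t in child:
        if pred(t):
            yield t                      # handed straight to the parent


def join(child: Iterable[Tuple], inner: List[Tuple],
         theta: Callable[[Tuple, Tuple], bool]) -> Iterator[Tuple]:
    for t_r in child:                    # outer tuples arrive one at a time
        for t_s in inner:
            if theta(t_r, t_s):
                yield t_r + t_s


def project(cols: List[int], child: Iterable[Tuple]) -> Iterator[Tuple]:
    for t in child:
        yield tuple(t[c] for c in cols)


def run_plan() -> List[Tuple]:
    # Hypothetical rows: (account_number, balance) and (customer_name, account_number).
    account = [("A-101", 500), ("A-215", 700), ("A-305", 3500)]
    customer = [("Hayes", "A-101"), ("Jones", "A-305"), ("Smith", "A-215")]
    low = select(lambda a: a[1] < 2500, account)
    joined = join(low, customer, lambda a, c: a[0] == c[1])
    names = project([2], joined)         # column 2 is customer_name
    return list(names)                   # pulling here drives the whole pipeline
```

Nothing is computed until `list(names)` starts requesting tuples, which is the demand-driven behaviour; a producer-driven pipeline would instead have each operator push tuples into a buffer for its parent.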
