
DEPT. OF CSE

18CS8PE31: Advanced DBMS

UNIT-2

Overview of Query Evaluation

INTRODUCTION:

• The DBMS describes the data that it manages, including tables and indexes. This descriptive data, or
metadata, stored in special tables called the system catalogs, is used to find the best way to evaluate a
query.

• SQL queries are translated into an extended form of relational algebra, and query evaluation plans
are represented as trees of relational operators, along with labels that identify the algorithm to use at
each node.

• Relational operators serve as building blocks for evaluating queries and the implementation of these
operators is carefully optimized for good performance.

• Queries are composed of several operators, and the algorithms for individual operators can be
combined in many ways to evaluate a query. The process of finding a good evaluation plan is called
query optimization.

We consider a number of example queries using the following schema:


Sailors(sid: integer, sname: string, rating: integer, age: real)
Reserves(sid: integer, bid: integer, day: dates, rname: string)
We assume that each tuple of Reserves is 40 bytes long, that a page can hold 100 Reserves tuples, and that
we have 1000 pages of such tuples. Similarly, we assume that each tuple of Sailors is 50 bytes long,
that a page can hold 80 Sailors tuples, and that we have 500 pages of such tuples.

The system catalog:


• We can store a table using one of several alternative file structures, and we can create one or more
indexes – each stored as a file – on every table.
• In a relational DBMS, every file contains either the tuples in a table or the entries in an index.
• The collection of files corresponding to users' tables and indexes represents the data in the database.
• A relational DBMS maintains information about every table and index that it contains.
• The descriptive information is itself stored in a collection of special tables called the catalog tables.
• The catalog tables are also called the data dictionary, the system catalog or simply the catalog.

Information in the catalog:

At a minimum, we have system-wide information such as the size of the buffer pool and page size, and the
following information about individual tables, indexes and views.
• For each table:
- Its table name, file name and the file structure (e.g., heap file) of the file in which it is stored.
- The attribute name and type of each of its attributes.
- The index name of each index on the table.
- The integrity constraints (e.g., primary key and foreign key constraints) on the table.


• For each index:


- The index name and the structure (e.g., B+ tree) of the index.
- The search key attributes.
• For each view:
- Its view name and definition
In addition, statistics about tables and indexes are stored in the system catalogs and updated periodically.

The following information is commonly stored:

• Cardinality: The number of tuples NTuples(R) for each table R.


• Size: The number of pages NPages(R) for each table R.
• Index Cardinality: The number of distinct key values NKeys(I) for each index I.
• Index Size: The number of pages INPages(I) for each index I.
• Index Height: The number of nonleaf levels IHeight(I) for each tree index I.
• Index Range: The minimum present key value ILow(I) and maximum present key value IHigh(I) for
each index I.
The catalogs also contain information about users, such as accounting information and
authorization information.
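Since these statistics drive the cost estimates later in this unit, it can help to see them as plain data. Below is a minimal Python sketch (purely illustrative, not any real system's catalog layout) holding the statistics listed above for our example tables:

    from dataclasses import dataclass

    # Per-table statistics, using the names introduced above.
    @dataclass
    class TableStats:
        ntuples: int  # Cardinality: NTuples(R)
        npages: int   # Size: NPages(R)

    # Per-index statistics, using the names introduced above.
    @dataclass
    class IndexStats:
        nkeys: int    # Index Cardinality: NKeys(I), distinct key values
        inpages: int  # Index Size: INPages(I)
        iheight: int  # Index Height: IHeight(I), nonleaf levels
        ilow: int     # Index Range: minimum present key value ILow(I)
        ihigh: int    # Index Range: maximum present key value IHigh(I)

    # The running example: 1000 pages of Reserves (100 tuples/page)
    # and 500 pages of Sailors (80 tuples/page).
    reserves = TableStats(ntuples=100_000, npages=1000)
    sailors = TableStats(ntuples=40_000, npages=500)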
How Catalogs are Stored:

An elegant aspect of a relational DBMS is that the system catalog is itself a collection of
tables. For example, we might store information about the attributes of tables in a catalog table called
Attribute_Cat:
Attribute_Cat(attr_name: string, rel_name: string, type: string, position: integer)

Sailors(sid: integer, sname: string, rating: integer, age: real)


Reserves(sid: integer, bid: integer, day: dates, rname: string)

Figure 1 shows the tuples in the Attribute_Cat table that describe the attributes of these two tables.

attr_name    rel_name        type      position
---------    --------        ----      --------
attr_name    Attribute_Cat   string    1
rel_name     Attribute_Cat   string    2
type         Attribute_Cat   string    3
position     Attribute_Cat   integer   4
sid          Sailors         integer   1
sname        Sailors         string    2
rating       Sailors         integer   3
age          Sailors         real      4
sid          Reserves        integer   1
bid          Reserves        integer   2
day          Reserves        dates     3
rname        Reserves        string    4

Figure 1: An Instance of the Attribute_Cat Relation


These tuples illustrate an important point: the catalog tables describe all the tables in the
database, including the catalog tables themselves. When information about a table is needed, it is
obtained from the system catalog. Of course, at the implementation level, whenever the DBMS needs
to find the schema of a catalog table, the code that retrieves this information must be handled
specially.
The fact that the system catalog is also a collection of tables is very useful. For example, catalog tables
can be queried just like any other table, using the query language of the DBMS! Further, all the
techniques available for implementing and managing tables apply directly to catalog tables. The
choice of catalog tables and their schemas is not unique and is made by the implementor of the DBMS.
Real systems vary in their catalog schema design, but the catalog is always implemented as a
collection of tables, and it essentially describes all the data stored in the database.

Introduction to operator evaluation:


Several alternative algorithms are available for implementing each relational operator, and for most operators
no algorithm is universally superior.
Several factors influence which algorithm performs best, including the sizes of the tables involved, existing
indexes and sort orders, the size of the available buffer pool, and the buffer replacement policy.
Three Common Techniques:
The algorithms for various relational operators actually have a lot in common. A few simple techniques are
used to develop algorithms for each operator:
 Indexing: If a selection or join condition is specified, use an index to examine just the tuples that
satisfy the condition.
 Iteration: Examine all tuples in an input table one after the other. Sometimes, if we need only a few
fields, we can scan the data entries in an index instead of the table itself.
 Partitioning: By partitioning tuples on a sort key, we can often decompose an operation into a less
expensive collection of operations on partitions. Sorting and hashing are two commonly used
partitioning techniques.

Access Paths:

An access path is a method of retrieving tuples from a table and consists of either (1) a file scan or (2) an index
plus a matching selection condition (in the query). Every relational operator accepts one or more tables as
input, and the access methods used to retrieve tuples contribute significantly to the cost of the operator.

Consider a simple selection that is a conjunction of conditions of the form attr op value, where
op is one of the comparison operators <, =, or >. Such selections are said to be in
conjunctive normal form (CNF), and each condition is called a conjunct. Intuitively, an index
matches a selection condition if the index can be used to retrieve just the tuples that satisfy the
condition.

 A hash index matches a CNF selection if there is a term of the form attribute=value in
the selection for each attribute in the index's search key.
E.g., a hash index on <a, b, c> matches a=5 AND b=3 AND c=5; but it does not match b=3, or a=5
AND b=3, or a>5 AND b=3 AND c=5.


 A tree index matches a CNF selection if there is a term of the form attribute op value
for each attribute in a prefix of the index's search key. (<a> and <a, b> are prefixes of key
<a, b, c>, but <a, c> and <b, c> are not.) E.g., a tree index on <a, b, c> matches the selections a=5
AND b=3, and a=5 AND b>6, but not b=3.
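These matching rules are mechanical enough to code up. The sketch below is illustrative only; it assumes each conjunct is represented as an (attribute, operator) pair and ignores the compared values:

    def hash_index_matches(search_key, conjuncts):
        # A hash index matches only if every attribute in its search key
        # has an equality conjunct in the selection.
        eq_attrs = {attr for attr, op in conjuncts if op == '='}
        return all(attr in eq_attrs for attr in search_key)

    def tree_index_matches(search_key, conjuncts):
        # A tree index matches if the conjuncts mention some nonempty
        # prefix of the search key, i.e. at least its first attribute.
        attrs = {attr for attr, op in conjuncts}
        return search_key[0] in attrs

    key = ['a', 'b', 'c']
    print(hash_index_matches(key, [('a', '='), ('b', '='), ('c', '=')]))  # True
    print(hash_index_matches(key, [('a', '>'), ('b', '='), ('c', '=')]))  # False: a>5 is not equality
    print(tree_index_matches(key, [('a', '='), ('b', '>')]))              # True: prefix <a, b>
    print(tree_index_matches(key, [('b', '=')]))                          # False: b alone is no prefix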

A Note on Complex Selections

Consider the selection (day<8/9/94 AND rname=‘Paul’) OR bid=5 OR sid=3.

Selection conditions are first converted to conjunctive normal form (CNF):

(day<8/9/94 OR bid=5 OR sid=3) AND (rname=‘Paul’ OR bid=5 OR sid=3)

Selectivity of Access Paths:

The selectivity of an access path is the number of pages retrieved (index pages plus data pages) if we
use this access path to retrieve all desired tuples. If a table contains an index that matches a given
selection, there are at least two access paths: the index and a scan of the data file. Sometimes, of
course, we can scan the index itself, giving us a third access path.

• Find the most selective access path, retrieve tuples using it, and apply any remaining terms
that don’t match the index:
• Most selective access path: An index or file scan that we estimate will require the fewest
page I/Os.
• Terms that match this index reduce the number of tuples retrieved; other terms are used to
discard some retrieved tuples, but do not affect number of tuples/pages fetched.
Consider day<8/9/94 AND bid=5 AND sid=3. A B+ tree index on day can be used; then, bid=5
and sid=3 must be checked for each retrieved tuple. Similarly, a hash index on <bid, sid> could be
used; day<8/9/94 must then be checked.
Algorithms for relational operations:

1. Selection: The selection operation is a simple retrieval of tuples from a table, and its implementation is
essentially covered in our discussion of access paths.
Given a selection of the form σ_{R.attr op value}(R), if there is no index on R.attr, we have to scan R.
If one or more indexes on R match the selection, we can use the index to retrieve matching tuples and apply any
remaining selection conditions to further restrict the result set.
Using an Index for Selections
• Cost depends on the number of qualifying tuples and on clustering.
• Cost of finding qualifying data entries (typically small) plus cost of retrieving records (could be large
without clustering).
In the example below, assuming a uniform distribution of names, about 10% of tuples qualify (100 pages, 10,000 tuples).
With a clustered index, the cost is little more than 100 I/Os; if unclustered, up to 10,000 I/Os!
SELECT * FROM Reserves R WHERE R.rname < ‘C%’;
As a rule of thumb, it is probably cheaper to simply scan the entire table (instead of using an unclustered index)
if over 5% of the tuples are to be retrieved.
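The numbers behind this rule of thumb are easy to reproduce. A rough Python sketch under the same assumptions (uniform distribution of names; index-page I/Os ignored for simplicity):

    def index_selection_cost(npages, ntuples, selectivity, clustered):
        # Clustered: matching tuples are packed together, so cost is about
        # the number of pages holding matches.
        # Unclustered: worst case, one page I/O per matching tuple.
        return int(npages * selectivity) if clustered else int(ntuples * selectivity)

    # Reserves: 1000 pages, 100,000 tuples; ~10% of names qualify.
    print(index_selection_cost(1000, 100_000, 0.10, clustered=True))   # ~100 I/Os
    print(index_selection_cost(1000, 100_000, 0.10, clustered=False))  # up to 10,000 I/Os
    # A full scan costs 1000 I/Os, so at 10% selectivity the unclustered
    # index already loses badly; hence the ~5% rule of thumb.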

2. Projection: The projection operation requires us to drop certain fields of the input, which is easy to do. The
expensive part is removing duplicates.
SQL systems don’t remove duplicates unless the keyword DISTINCT is specified in a query.
SELECT DISTINCT R.sid, R.bid FROM Reserves R;
• Sorting Approach: Sort on <sid, bid> and remove duplicates. (Can optimize this by dropping
unwanted information while sorting.)
• Hashing Approach: Hash on <sid, bid> to create partitions. Load partitions into memory one at a
time, build in-memory hash structure, and eliminate duplicates.
• If there is an index with both R.sid and R.bid in the search key, it may be cheaper to sort the data entries!
3. Join: Joins are expensive operations and very common, so systems typically support several algorithms to
carry out joins.
Consider the join of Reserves and Sailors, with the join condition Reserves.sid=Sailors.sid. Suppose one of
the tables, say Sailors, has an index on the sid column. We can scan Reserves and for each tuple, use the index
to probe Sailors for matching tuples. This approach is called index nested loops join.
Ex: Consider the cost of scanning Reserves and using the index to retrieve the matching Sailors tuple for each
Reserves tuple. The cost of scanning Reserves is 1000 I/Os. There are 100*1000 = 100,000 tuples in Reserves. For
each of these tuples, retrieving the index page containing the rid of the matching Sailors tuple costs 1.2
I/Os (on average); in addition, we have to retrieve the Sailors page containing the qualifying tuple. Therefore we
have 100,000*(1+1.2) I/Os to retrieve matching Sailors tuples. The total cost is 221,000 I/Os.
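The arithmetic can be checked directly; a small Python sketch using the figures assumed above:

    reserves_pages = 1000
    tuples_per_page = 100
    index_probe = 1.2   # avg I/Os to find the data entry with the matching rid
    fetch_page = 1.0    # one more I/O for the Sailors page itself

    reserves_tuples = reserves_pages * tuples_per_page          # 100,000
    total = reserves_pages + reserves_tuples * (fetch_page + index_probe)
    print(total)  # 221000.0 I/Os, as computed above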

• If we do not have an index that matches the join condition on either table, we cannot use index nested
loops. In this case, we can sort both tables on the join column, and then scan them to find matches.
This is called sort-merge join.
Ex: We can sort Reserves and Sailors in two passes each. Since we read and write Reserves in each pass, the sorting cost
is 2*2*1000 = 4000 I/Os. Similarly, we can sort Sailors at a cost of 2*2*500 = 2000 I/Os. In addition, the second
phase of the sort-merge join algorithm requires an additional scan of both tables. Thus the total cost is
4000 + 2000 + 1000 + 500 = 7500 I/Os.
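Again the arithmetic is easy to verify; a sketch assuming two sorting passes per table (one read plus one write of every page per pass) and a final scan of both tables:

    def two_pass_sort_cost(npages):
        return 2 * 2 * npages   # 2 I/Os per page per pass, 2 passes

    reserves, sailors = 1000, 500
    total = two_pass_sort_cost(reserves) + two_pass_sort_cost(sailors) + reserves + sailors
    print(total)  # 4000 + 2000 + 1000 + 500 = 7500 I/Os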
Introduction to query optimization:
• Query optimization is one of the most important tasks of a relational DBMS. A more detailed view of
the query optimization and execution layer in the DBMS architecture is shown in Fig(2). Queries
are parsed and then presented to the query optimizer, which is responsible for identifying an efficient
execution plan. The optimizer generates alternative plans and chooses the plan with the least estimated
cost.
• The space of plans considered by a typical relational query optimizer can be understood by recognizing
that a query is essentially treated as a σ-π-⋈ (select-project-join) algebra expression, with any remaining
operations carried out on the result of this expression.


Query
  |
Query Parser
  |
Parsed query
  |
Query Optimizer:
  Plan Generator <-> Plan Cost Estimator, both consulting the Catalog Manager
  |
Evaluation plan
  |
Query Plan Evaluator

Fig(2): Query Parsing, Optimization and Execution


Optimizing such a relational algebra expression involves two basic steps:
1. Enumerating alternative plans for evaluating the expression. Typically an optimizer considers a subset
of all possible plans because the number of possible plans is very large.
2. Estimating the cost of each enumerated plan and choosing the plan with the lowest estimated cost.
Query Evaluation Plans:
A query evaluation plan consists of an extended relational algebra tree, with additional annotations at each
node indicating the access methods to use for each table and the implementation method to use for each
relational operator.
Ex: select s.sname from Reserves r, Sailors s where r.sid=s.sid and r.bid=100 and s.rating>5
The above query can be expressed in relational algebra as follows:
π_sname(σ_{bid=100 ∧ rating>5}(Reserves ⋈_{sid=sid} Sailors))
This expression is shown in the form of a tree in Fig.(3).
π_sname
   |
σ_{bid=100 ∧ rating>5}
   |
⋈_{sid=sid}
  /      \
Reserves  Sailors
Fig(3). Query Expressed as a Relational Algebra Tree


The algebra expression partially specifies how to evaluate the query – we first compute the natural join of
Reserves and Sailors, then perform the selections and finally project the sname field.
To obtain a fully specified evaluation plan, we must decide on an implementation for each of the algebra
operations involved. For example, we can use a page-oriented simple nested loops join with Reserves as the outer
table and apply selections and projections to each tuple in the result of the join as it is produced; the result of
the join before the selections and projections is never stored in its entirety. This query evaluation plan is
shown in Fig.(4)
π_sname                   (on-the-fly)
   |
σ_{bid=100 ∧ rating>5}    (on-the-fly)
   |
⋈_{sid=sid}               (simple nested loops)
  /          \
Reserves      Sailors
(file scan)   (file scan)


Fig(4). Query Evaluation Plan for Sample Query
Multi-Operator Queries: Pipelined Evaluation
When a query is composed of several operators, the result of one operator is sometimes pipelined to another
operator without creating a temporary table to hold the intermediate results. The plan in Fig(4) pipelines the
output of the join of Sailors and Reserves into the selections and projections that follow.

        ⋈
       / \
      ⋈   C      Result tuples of the first join (A ⋈ B)
     / \         are pipelined into the join with C.
    A   B

Fig(5). A Query Tree Illustrating Pipelining

In Fig(5), both joins can be evaluated in pipelined fashion using some version of a nested loops join.
Conceptually, the evaluation is initiated from the root, and the node joining A and B produces tuples as
and when they are requested by its parent node. When the root node gets a page of tuples from its left child, all
the matching inner tuples are retrieved and joined with the matching outer tuples; the current page of outer tuples
is then discarded. A request is then made to the left child for the next page of tuples, and the process is
repeated. Pipelined evaluation is thus a control strategy governing the rate at which different joins in the plan
proceed. It has the great virtue of not writing the result of intermediate joins to a
temporary file, because the results are produced, consumed and discarded one page at a time.
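Python generators give a compact way to picture this pull-based control strategy. The sketch below is illustrative only (pages are modeled as lists of tuples); each operator pulls a page from its child on demand and never materializes intermediate results:

    def scan(pages):
        # Leaf operator: yield one page of tuples at a time.
        for page in pages:
            yield page

    def select(child, pred):
        # Pulls a page from its child only when a consumer asks for one.
        for page in child:
            out = [t for t in page if pred(t)]
            if out:
                yield out

    # Tuples are (sid, bid); each page is discarded once consumed.
    reserves_pages = [[(1, 100), (2, 101)], [(3, 100), (4, 102)]]
    for page in select(scan(reserves_pages), lambda t: t[1] == 100):
        print(page)  # [(1, 100)] then [(3, 100)]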


The Iterator Interface:

A query evaluation plan is a tree of relational operators and is executed by calling the operators in
some (possibly interleaved) order. Each operator has one or more inputs and an output, which are
also nodes in the plan, and tuples must be passed between operators according to the plan's tree
structure.

To simplify the code responsible for coordinating the execution of a plan, the relational operators that
form the nodes of a plan tree (which is to be evaluated using pipelining) typically support a uniform
iterator interface, hiding the internal implementation details of each operator. The iterator
interface for an operator includes the functions open, get_next, and close. The open function
initializes the state of the iterator by allocating buffers for its inputs and output, and is also used to
pass in arguments such as selection conditions that modify the behavior of the operator. The code
for the get_next function calls the get_next function on each input node and calls operator-specific
code to process the input tuples. The output tuples generated by the processing are placed in the
output buffer of the operator, and the state of the iterator is updated to keep track of how much input
has been consumed. When all output tuples have been produced (through repeated calls to get_next),
the close function is called (by the code that initiated execution of this operator) to deallocate state
information.
The iterator interface supports pipelining of results naturally: the decision to pipeline or materialize
input tuples is encapsulated in the operator-specific code that processes input tuples. If the
algorithm implemented for the operator allows input tuples to be processed completely when
they are received, input tuples are not materialized and the evaluation is pipelined. If the algorithm
examines the same input tuples several times, they are materialized. This decision, like other details
of the operator's implementation, is hidden by the iterator interface for the operator.
The iterator interface is also used to encapsulate access methods such as B+ trees and hash-based
indexes. Externally, access methods can be viewed simply as operators that produce a stream of
output tuples. In this case, the open function can be used to pass the selection conditions that match
the access path.
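A minimal sketch of the open/get_next/close convention, in Python for concreteness; only the three function names come from the text, everything else is an illustrative assumption:

    class FileScan:
        # Leaf iterator over a stored table, here simply a list of tuples.
        def __init__(self, table):
            self.table = table
        def open(self):
            self.pos = 0
        def get_next(self):
            if self.pos >= len(self.table):
                return None           # input exhausted
            t = self.table[self.pos]
            self.pos += 1
            return t
        def close(self):
            self.pos = None

    class Select:
        # Selection operator; open() passes in the condition, as described
        # above, and get_next() pulls tuples from the input node.
        def __init__(self, child):
            self.child = child
        def open(self, pred):
            self.pred = pred
            self.child.open()
        def get_next(self):
            while (t := self.child.get_next()) is not None:
                if self.pred(t):
                    return t
            return None
        def close(self):
            self.child.close()

    # The plan tree is driven from the root by repeated get_next calls.
    plan = Select(FileScan([(1, 9), (2, 3), (3, 7)]))
    plan.open(lambda t: t[1] > 5)
    while (t := plan.get_next()) is not None:
        print(t)    # (1, 9) then (3, 7)
    plan.close()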

Alternative Plans: Motivating Example


SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5;
• Cost: 500 + 500*1000 I/Os.
• This is by no means the worst plan!
• It misses several opportunities: selections could have been `pushed’ earlier, no use is made of any
available indexes, etc.
• Goal of optimization: to find more efficient plans that compute the same answer. The RA tree is
shown in Fig(3) and the plan in Fig(4).


Pushing Selections:
A join is a relatively expensive operation, and a good heuristic is to reduce the sizes of the tables to be
joined as much as possible. One approach is to apply selections early; if a selection operator appears after
a join operator, it is worth examining whether the selection can be 'pushed' ahead of the join. As an
example, the selection bid=100 involves only the attributes of Reserves and can be applied to Reserves
before the join. Similarly, the selection rating>5 involves only attributes of Sailors and can be applied to
Sailors before the join. Let us suppose that the selections are performed using simple file scans, that the
result of each selection is written to a temporary table on disk, and that the temporary tables are then
joined using sort-merge join. The resulting query evaluation plan is shown in Figure(5).
π_sname                                   (on-the-fly)
   |
⋈_{sid=sid}                               (sort-merge join)
  /                         \
σ_{bid=100}                  σ_{rating>5}
(scan; write to temp T1)     (scan; write to temp T2)
   |                            |
Reserves (file scan)         Sailors (file scan)


Fig(5). A Second Query Evaluation Plan
To estimate the size of T1, we require additional information. For example, if we assume that the maximum
number of reservations of a given boat is one, just one tuple appears in the result. Alternatively, if we know
that there are 100 boats, we can assume that reservations are spread out uniformly across all boats and
estimate the number of pages in T1 to be 10. For concreteness, assume that the number of pages in T1 is
indeed 10.
The cost of applying rating > 5 to Sailors is the cost of scanning Sailors (500 pages) plus the cost of
writing out the result to a temporary table, say, T2. If we assume that ratings are uniformly distributed
over the range 1 to 10, we can approximately estimate the size of T2 as 250 pages.
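Both size estimates follow from simple uniformity assumptions; the arithmetic as a sketch:

    reserves_pages, sailors_pages = 1000, 500

    # bid=100: 100 boats, reservations spread uniformly, so 1/100 qualifies.
    t1_pages = reserves_pages // 100      # 10 pages for temp table T1

    # rating>5: ratings uniform over 1..10, so half of Sailors qualifies.
    t2_pages = sailors_pages // 2         # 250 pages for temp table T2

    print(t1_pages, t2_pages)             # 10 250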


EXTERNAL SORTING
Why Sort?

• A classic problem in computer science!

• Data requested in sorted order

- e.g., find students in increasing gpa order

• Sorting is first step in bulk loading B+ tree index.

• Sorting useful for eliminating duplicate copies in a collection of records (Why?)

• Sort-merge join algorithm involves sorting.

• Problem: sort 1Gb of data with 1Mb of RAM.

- why not virtual memory?

When does a DBMS sort data?

Sorting a collection of records on some search key is a very useful operation. The key can be a single
attribute or an ordered list of attributes. Sorting is required in a variety of situations, including the following
important ones:

• Users may want answers in some order; for example, by increasing age.

• Sorting records is the first step in bulk loading a tree index.

• Sorting is useful for eliminating duplicate copies in a collection of records.

• A widely used algorithm for performing a very important relational algebra operation, called join,
requires a sorting step.

Although main memory sizes are growing rapidly, the ubiquity of database systems has led to increasingly
large datasets as well. When the data to be sorted is too large to fit into available main memory, we need an
external sorting algorithm. Such algorithms seek to minimize the cost of disk accesses.

A SIMPLE TWO-WAY MERGE SORT

We begin by presenting a simple algorithm to illustrate the idea behind external sorting. This algorithm
utilizes only three pages of main memory, and it is presented only for pedagogical purposes. When sorting a
file, several sorted subfiles are typically generated in intermediate steps. Here we refer to each subfile as a
run.

Even if the entire file does not fit into the available main memory, we can sort it by breaking it into smaller
subfiles, sorting these subfiles, and then merging them using a minimal amount of main memory at any given
time. In the first pass, the pages in the file are read in one at a time. After a page is read in, the records on it
are sorted and the sorted page is written out. Quicksort or any other in-memory sorting technique can be used
to sort the records on a page. In subsequent passes, pairs of runs from the output of the previous pass are read
in and merged to produce runs that are twice as long. This algorithm is shown in Fig(1).

proc 2-way_extsort(file)
    // Given a file on disk, sorts it using three buffer pages.
    // Produce runs that are one page long: Pass 0
    Read each page into memory, sort it, write it out.
    // Merge pairs of runs to produce longer runs until only
    // one run (containing all records of the input file) is left:
    While the number of runs at end of previous pass is > 1:
        // Pass i = 1, 2, ...
        While there are runs to be merged from previous pass:
            Choose next two runs (from previous pass).
            Read each run into an input buffer, one page at a time.
            Merge the runs and write to the output buffer;
            force output buffer to disk one page at a time.
endproc

Fig(1). Two-Way Merge Sort

If the number of pages in the input file is 2^k, for some k, then:

Pass 0 produces 2^k sorted runs of one page each,

Pass 1 produces 2^(k-1) sorted runs of two pages each,

Pass 2 produces 2^(k-2) sorted runs of four pages each,

and so on, until Pass k produces one sorted run of 2^k pages.

In each pass, we read every page in the file, process it, and write it out. Therefore we have two disk I/Os per
page, per pass. The number of passes is ⌈log₂N⌉ + 1, where N is the number of pages in the file. The overall
cost is 2N(⌈log₂N⌉ + 1) I/Os.
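The pseudocode in Fig(1) translates almost line for line into runnable code. Below is a sketch (a simulation, not real disk I/O: the file is a list of pages, each page a list of records) that also checks the ⌈log₂N⌉ + 1 pass count:

    import math
    from heapq import merge

    def two_way_extsort(pages):
        # Pass 0: sort each page individually, producing one-page runs.
        runs = [[sorted(p)] for p in pages]   # a run is a list of sorted pages
        passes = 1
        page_size = max(len(p) for p in pages)
        # Merge pairs of runs until a single run remains.
        while len(runs) > 1:
            merged = []
            for i in range(0, len(runs), 2):
                pair = runs[i:i + 2]
                if len(pair) == 1:
                    merged.append(pair[0])    # odd run carries over unchanged
                else:
                    # Stream-merge the two sorted runs, then re-page the output.
                    streams = [[r for page in run for r in page] for run in pair]
                    records = list(merge(*streams))
                    merged.append([records[j:j + page_size]
                                   for j in range(0, len(records), page_size)])
            runs = merged
            passes += 1
        assert passes == math.ceil(math.log2(len(pages))) + 1
        return runs[0]

    # The seven-page example from Fig(2), two records per page:
    f = [[3, 4], [6, 2], [9, 4], [8, 7], [5, 6], [3, 1], [2]]
    print(two_way_extsort(f))  # [[1, 2], [2, 3], [3, 4], [4, 5], [6, 6], [7, 8], [9]]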


Input file:   3,4 | 6,2 | 9,4 | 8,7 | 5,6 | 3,1 | 2
                        |  Pass 0
1-page runs:  3,4 | 2,6 | 4,9 | 7,8 | 5,6 | 1,3 | 2
                        |  Pass 1
2-page runs:  2,3 4,6 | 4,7 8,9 | 1,3 5,6 | 2
                        |  Pass 2
4-page runs:  2,3 4,4 6,7 8,9 | 1,2 3,5 6
                        |  Pass 3
Sorted file:  1,2 2,3 3,4 4,5 6,6 7,8 9

Fig(2). Two-Way Merge Sort of a Seven-Page File

The algorithm is illustrated in Fig(2) on an example input file containing seven pages.


The sort takes four passes, and in each pass we read and write seven pages, for a total of 56 I/Os. This
result agrees with the preceding analysis because 2·7·(⌈log₂7⌉ + 1) = 56. On a file of eight pages, the number
of passes remains at four (⌈log₂8⌉ + 1 = 4), but we read and write an additional page in each pass, for a
total of 64 I/Os.

The algorithm requires just three buffer pages in main memory, as Fig(3) illustrates. This
observation raises an important point: even if we have more buffer space available, this simple algorithm
does not utilize it effectively.

Disk --> [ Input 1 | Input 2 ] --merge--> [ Output ] --> Disk

Fig(3). Two-Way Merge Sort with Three Buffer Pages

EXTERNAL MERGE SORT


Suppose that B buffer pages are available in memory and that we need to sort a large file with N pages. The
intuition behind the generalized algorithm that we now present is to retain the basic structure of making
multiple passes while trying to minimize the number of passes. There are two important modifications to the
two-way merge sort algorithm:

1. In Pass 0, read in B pages at a time and sort internally to produce ⌈N/B⌉ runs of B pages each (except
for the last run, which may contain fewer pages). This modification is illustrated in Fig(4), using the
input from Fig(2) and a buffer pool with four pages.

2. In Passes i = 1, 2, ..., use B-1 buffer pages for input and use the remaining page for output; hence, you do
a (B-1)-way merge in each pass. The utilization of buffer pages in the merging passes is illustrated in
Fig(5).


Input file:   3,4 | 6,2 | 9,4 | 8,7 | 5,6 | 3,1 | 2

Pass 0, with a buffer pool of B=4 pages:
  1st output run (first four pages sorted together):  2,3 | 4,4 | 6,7 | 8,9
  2nd output run (remaining three pages sorted):      1,2 | 3,5 | 6

Fig(4). External Merge Sort with B Buffer Pages: Pass 0


          +------------+
Disk ---> | Input 1    |
          | Input 2    | ---> Output ---> Disk
          |   ...      |
          | Input B-1  |
          +------------+
          B main buffer pages

Fig(5). External Merge Sort with B Buffer Pages: Pass i > 0
The first refinement reduces the number of runs produced by Pass 0 to N1 = ⌈N/B⌉, versus N for the two-way merge. The
second refinement is even more important. By doing a (B-1)-way merge, the number of passes is reduced dramatically:
including the initial pass, it becomes ⌈log_{B-1}(N1)⌉ + 1, versus ⌈log₂N⌉ + 1 for the two-way merge algorithm presented earlier.
Because B is typically quite large, the savings can be substantial. The external merge sort algorithm is shown in Fig(6).

proc extsort(file)
    // Given a file on disk, sorts it using B buffer pages.
    // Produce runs that are B pages long: Pass 0
    Read B pages into memory, sort them, write out a run.
    // Merge B-1 runs at a time to produce longer runs until only
    // one run (containing all records of the input file) is left:
    While the number of runs at end of previous pass is > 1:
        // Pass i = 1, 2, ...
        While there are runs to be merged from previous pass:
            Choose next B-1 runs (from previous pass).
            Read each run into an input buffer, one page at a time.
            Merge the runs and write to the output buffer;
            force output buffer to disk one page at a time.
endproc

Fig(6). External Merge Sort


As an example, suppose that we have five buffer pages available and want to sort a file with 108 pages.

Pass 0 produces ⌈108/5⌉ = 22 sorted runs of five pages each, except for the last run, which is only three pages
long.

Pass 1 does a four-way merge to produce ⌈22/4⌉ = 6 sorted runs of 20 pages each, except for the last run, which is
only eight pages long.

Pass 2 produces ⌈6/4⌉ = 2 sorted runs, one with 80 pages and one with 28 pages.

Pass 3 merges the two runs produced in Pass 2 to produce the sorted file.

In each pass we read and write 108 pages; thus the total cost is 2*108*4 = 864 I/Os. Applying our formula, we
have N1 = ⌈108/5⌉ = 22 and cost = 2*N*(⌈log₄N1⌉ + 1) = 2*108*(⌈log₄22⌉ + 1) = 864, as expected.
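The same formula reproduces both this example and the table in Fig(7); a small sketch:

    import math

    def num_passes(N, B):
        # Pass 0 makes ceil(N/B) runs; then (B-1)-way merges until one run is left.
        n1 = math.ceil(N / B)
        return 1 if n1 <= 1 else 1 + math.ceil(math.log(n1, B - 1))

    def cost(N, B):
        return 2 * N * num_passes(N, B)   # read + write every page, every pass

    print(num_passes(108, 5), cost(108, 5))  # 4 864
    print(num_passes(1_000_000, 129))        # 3, matching the table below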

To emphasize the potential gains in using all available buffers, in fig(7), we show the number of passes,
computed using our formula, for several values of N and B. To obtain the cost, the number of passes should
be multiplied by 2N. In practice, one would expect to have more than 257 buffers, but this table illustrates the
importance of a high fan-in during merging.

N               B=3   B=5   B=9   B=17   B=129   B=257
100               7     4     3      2       1       1
1000             10     5     4      3       2       2
10,000           13     7     5      4       2       2
100,000          17     9     6      5       3       3
1,000,000        20    10     7      5       3       3
10,000,000       23    12     8      6       4       3
100,000,000      26    14     9      7       4       4
1,000,000,000    30    15    10      8       5       4

Fig(7). Number of Passes of External Merge Sort

