Advanced Dbms Unit2
UNIT-2
INTRODUCTION:
• The DBMS describes the data that it manages, including tables and indexes. This descriptive data, or
metadata, is stored in special tables called the system catalogs and is used to find the best way to evaluate a
query.
• SQL queries are translated into an extended form of relational algebra, and query evaluation plans
are represented as trees of relational operators, along with labels that identify the algorithm to use at
each node.
• Relational operators serve as building blocks for evaluating queries and the implementation of these
operators is carefully optimized for good performance.
• Queries are composed of several operators, and the algorithms for individual operators can be
combined in many ways to evaluate a query. The process of finding a good evaluation plan is called
query optimization.
At a minimum, we have system-wide information such as the size of the buffer pool and page size, and the
following information about individual tables, indexes and views.
• For each table:
- Its table name, file name and the file structure (e.g., heap file) of the file in which it is stored.
- The attribute name and type of each of its attributes.
- The index name of each index on the table.
- The integrity constraints (e.g., primary key and foreign key constraints) on the table.
An elegant aspect of a relational DBMS is that the system catalog is itself a collection of
tables. For example, we might store information about the attributes of tables in a catalog table called
Attribute_Cat:
Attribute_Cat(attr_name: string,rel_name: string, type: string, position: integer)
Figure 1 shows the tuples in the Attribute_Cat table that describe the attributes of these two tables.

attr_name   rel_name        type      position
---------   -------------   -------   --------
attr_name   Attribute_Cat   string    1
rel_name    Attribute_Cat   string    2
type        Attribute_Cat   string    3
position    Attribute_Cat   integer   4
sid         Sailors         integer   1
sname       Sailors         string    2
rating      Sailors         integer   3
age         Sailors         real      4
sid         Reserves        integer   1
bid         Reserves        integer   2
day         Reserves        dates     3
rname       Reserves        string    4
These other tuples illustrate an important point: the catalog tables describe all the tables in the
database, including the catalog tables themselves. When information about a table is needed, it is
obtained from the system catalog. Of course, at the implementation level, whenever the DBMS needs
to find the schema of a catalog table, the code that retrieves this information must be handled
specially.
The fact that the system catalog is also a collection of tables is very useful. For example, catalog tables
can be queried just like any other table, using the query language of the DBMS! Further, all the
techniques available for implementing and managing tables apply directly to catalog tables. The
choice of catalog tables and their schemas is not unique and is made by the implementor of the DBMS.
Real systems vary in their catalog schema design, but the catalog is always implemented as a
collection of tables, and it essentially describes all the data stored in the database.
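As a concrete illustration of querying the catalog like any other table, here is a small sketch using SQLite from Python. This is not part of the notes: SQLite's catalog table is called sqlite_master, and per-column metadata analogous to Attribute_Cat is exposed via PRAGMA table_info; the Sailors/Reserves schemas are taken from the examples used throughout this unit.

```python
import sqlite3

# SQLite keeps its catalog in an ordinary table, sqlite_master,
# so it can be queried with plain SQL like any user table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.execute("CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT, rname TEXT)")

rows = conn.execute(
    "SELECT name, type FROM sqlite_master WHERE type='table' ORDER BY name"
).fetchall()
print(rows)  # → [('Reserves', 'table'), ('Sailors', 'table')]

# Per-attribute metadata, analogous to Attribute_Cat(attr_name, rel_name,
# type, position): each row is (position, name, type, notnull, default, pk).
cols = conn.execute("PRAGMA table_info(Sailors)").fetchall()
print([(c[1], c[2], c[0]) for c in cols])
# → [('sid', 'INTEGER', 0), ('sname', 'TEXT', 1), ('rating', 'INTEGER', 2), ('age', 'REAL', 3)]
```

Note that SQLite numbers column positions from 0, whereas the Attribute_Cat example above numbers them from 1; the idea is the same.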
Access Paths:
An access path is a method of retrieving tuples from a table and consists of either (1) a file scan or (2) an index
plus a matching selection condition (in the query). Every relational operator accepts one or more tables as input, and
the access methods used to retrieve tuples contribute significantly to the cost of the operator.
Consider a simple selection that is a conjunction of conditions of the form attr op value, where
op is one of the comparison operators <, =, or >. Such selections are said to be in
conjunctive normal form (CNF), and each condition is called a conjunct. Intuitively, an index
matches a selection condition if the index can be used to retrieve just the tuples that satisfy the
condition.
A tree index matches a CNF selection if there is a term of the form attribute op value
for each attribute in a prefix of the index's search key. (<a> and <a,b> are prefixes of key
<a,b,c>, but <a,c> and <b,c> are not.) E.g., a tree index on <a,b,c> matches the selection a=5
AND b=3, and a=5 AND b>6, but not b=3.
The selectivity of an access path is the number of pages retrieved (index pages plus data pages) if we
use this access path to retrieve all desired tuples. If a table contains an index that matches a given
selection, there are at least two access paths: the index and a scan of the data file. Sometimes, of
course, we can scan the index itself, giving us a third access path.
• Find the most selective access path, retrieve tuples using it, and apply any remaining terms
that don’t match the index:
• Most selective access path: An index or file scan that we estimate will require the fewest
page I/Os.
• Terms that match this index reduce the number of tuples retrieved; other terms are used to
discard some retrieved tuples, but do not affect number of tuples/pages fetched.
Consider day<8/9/94 AND bid=5 AND sid=3. A B+ tree index on day can be used; then, bid=5
and sid=3 must be checked for each retrieved tuple. Similarly, a hash index on <bid, sid> could be
used; day<8/9/94 must then be checked.
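The access-path logic above can be sketched in a few lines of Python. This is an illustration only: a sorted list plus bisect stands in for a B+ tree on day, the tuple values are invented, and the date 8/9/94 is read as 8 September 1994.

```python
from bisect import bisect_left
from datetime import date

# Hypothetical Reserves tuples (sid, bid, day), kept sorted on day so that a
# sorted list + bisect can stand in for a B+ tree index on day.
reserves = sorted(
    [(3, 5, date(1994, 9, 1)), (3, 5, date(1994, 9, 9)),
     (1, 5, date(1994, 9, 5)), (3, 2, date(1994, 9, 7))],
    key=lambda t: t[2],
)
days = [t[2] for t in reserves]

# Primary conjunct day < 8/9/94: the "index" fetches only qualifying tuples...
candidates = reserves[:bisect_left(days, date(1994, 9, 8))]
# ...and the remaining conjuncts (bid=5 AND sid=3) are checked per tuple;
# they discard tuples but do not affect how many tuples were fetched.
result = [t for t in candidates if t[1] == 5 and t[0] == 3]
print(result)  # → [(3, 5, datetime.date(1994, 9, 1))]
```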
Algorithms for relational operations:
1. Selection: The selection operation is a simple retrieval of tuples from a table, and its implementation is
essentially covered in our discussion of access paths.
Given a selection of the form σ R.attr op value (R), if there is no index on R.attr, we have to scan R.
If one or more indexes on R match the selection, we can use the index to retrieve matching tuples and apply any
remaining selection conditions to further restrict the result set.
Using an Index for Selections
• Cost depends on #qualifying tuples, and clustering.
• Cost of finding qualifying data entries (typically small) plus cost of retrieving records (could be large
w/o clustering).
Consider:
SELECT * FROM Reserves R WHERE R.rname < 'C%';
Assuming a uniform distribution of names, about 10% of the tuples qualify (100 pages, 10,000 tuples).
With a clustered index, the cost is little more than 100 I/Os; if unclustered, up to 10,000 I/Os!
As a rule of thumb, it is probably cheaper to simply scan the entire table (instead of using an unclustered index)
if over 5% of the tuples are to be retrieved.
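A rough back-of-envelope sketch of the comparison above, assuming Reserves has 1000 pages and 100,000 tuples (the figures used in the join examples later in this unit), so that 10% is 100 pages and 10,000 tuples:

```python
pages, tuples = 1000, 100_000      # Reserves: 1000 pages, 100 tuples/page
qualifying = 10                    # percent of tuples that qualify

full_scan   = pages                          # read every page: 1000 I/Os
clustered   = pages * qualifying // 100      # matches co-located: ~100 I/Os
unclustered = tuples * qualifying // 100     # worst case one page I/O per
                                             # qualifying tuple: 10,000 I/Os
print(full_scan, clustered, unclustered)  # → 1000 100 10000
```

With 10% qualifying, the unclustered index is already an order of magnitude worse than a plain scan, which is where the 5% rule of thumb comes from.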
SSIT, Tumkur Page 4
DEPT. OF CSE
2. Projection: The projection operation requires us to drop certain fields of the input, which is easy to do. The
expensive part is removing duplicates.
SQL systems don’t remove duplicates unless the keyword DISTINCT is specified in a query.
SELECT DISTINCT R.sid, R.bid FROM Reserves R;
• Sorting Approach: Sort on <sid, bid> and remove duplicates. (Can optimize this by dropping
unwanted information while sorting.)
• Hashing Approach: Hash on <sid, bid> to create partitions. Load partitions into memory one at a
time, build in-memory hash structure, and eliminate duplicates.
• If there is an index with both R.sid and R.bid in the search key, may be cheaper to sort data entries!
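The hashing approach can be sketched as follows. This is a toy in-memory illustration with invented data; a real system would write each partition to disk and process partitions one at a time.

```python
# Duplicate elimination by hashing: partition on h(sid, bid), then dedupe
# each partition independently with an in-memory hash structure. Equal
# tuples hash equal, so all duplicates land in the same partition.
reserves = [(1, 100), (2, 100), (1, 100), (2, 101), (1, 100)]

NUM_PARTITIONS = 4
partitions = [[] for _ in range(NUM_PARTITIONS)]
for sid, bid in reserves:                    # phase 1: one partitioning pass
    partitions[hash((sid, bid)) % NUM_PARTITIONS].append((sid, bid))

result = []
for part in partitions:                      # phase 2: per-partition dedupe
    seen = set()                             # in-memory hash structure
    for t in part:
        if t not in seen:
            seen.add(t)
            result.append(t)
print(sorted(result))  # → [(1, 100), (2, 100), (2, 101)]
```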
3. Join: Joins are expensive operations and very common, so systems typically support several algorithms to
carry out joins.
Consider the join of Reserves and Sailors, with the join condition Reserves.sid=Sailors.sid. Suppose one of
the tables, say Sailors, has an index on the sid column. We can scan Reserves and for each tuple, use the index
to probe Sailors for matching tuples. This approach is called index nested loops join.
Ex: Consider the cost of scanning Reserves and using the index to retrieve the matching Sailors tuple for each
Reserves tuple. The cost of scanning Reserves is 1000 I/Os. There are 100*1000 = 100,000 tuples in Reserves
(100 tuples per page). For each of these tuples, retrieving the index page containing the rid of the matching
Sailors tuple costs 1.2 I/Os (on average); in addition, we have to retrieve the Sailors page containing the
qualifying tuple. Therefore we have 100,000*(1+1.2) I/Os to retrieve matching Sailors tuples. The total cost
is 221,000 I/Os.
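The arithmetic above can be checked in a few lines (the 1.2 I/Os per index probe is the average stated in the example; integer arithmetic avoids floating-point noise):

```python
m_pages, tuples_per_page = 1000, 100      # Reserves: 1000 pages, 100 tuples/page
outer_tuples = m_pages * tuples_per_page  # 100,000 Reserves tuples

scan_cost  = m_pages                      # scan Reserves once
probe_cost = outer_tuples * 12 // 10      # 1.2 index I/Os per outer tuple
fetch_cost = outer_tuples                 # 1 I/O for the matching Sailors page

total = scan_cost + probe_cost + fetch_cost
print(total)  # → 221000
```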
• If we do not have an index that matches the join condition on either table, we cannot use index nested
loops. In this case, we can sort both tables on the join column, and then scan them to find matches.
This is called sort-merge join.
Ex: We can sort Reserves and Sailors in two passes each. Reading and writing Reserves in each pass, the sorting cost
is 2*2*1000 = 4000 I/Os. Similarly, we can sort Sailors at a cost of 2*2*500 = 2000 I/Os. In addition, the second
phase of the sort-merge join algorithm requires an additional scan of both tables. Thus the total cost is
4000+2000+1000+500 = 7500 I/Os.
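A minimal sketch of the merge phase of sort-merge join, assuming both inputs are already sorted on the join column sid and that sid is a key of Sailors (so each sid appears at most once there); the table contents are invented for illustration:

```python
reserves = sorted([(1, 101), (2, 102), (2, 103)])        # (sid, bid)
sailors  = sorted([(1, "Ann"), (2, "Bob"), (3, "Cal")])  # (sid, sname)

result, i, j = [], 0, 0
while i < len(reserves) and j < len(sailors):
    if reserves[i][0] < sailors[j][0]:
        i += 1                    # advance the side with the smaller sid
    elif reserves[i][0] > sailors[j][0]:
        j += 1
    else:
        # sids match: join every Reserves tuple with this sid to the
        # (unique, since sid is Sailors' key) matching Sailors tuple
        sid = reserves[i][0]
        while i < len(reserves) and reserves[i][0] == sid:
            result.append((sid, reserves[i][1], sailors[j][1]))
            i += 1
        j += 1
print(result)  # → [(1, 101, 'Ann'), (2, 102, 'Bob'), (2, 103, 'Bob')]
```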
Introduction to query optimization:
• Query optimization is one of the most important tasks of a relational DBMS. A more detailed view of
the query optimization and execution layer in the DBMS architecture is shown in Fig(2). Queries
are parsed and then presented to the query optimizer, which is responsible for identifying an efficient
execution plan. The optimizer generates alternative plans and chooses the plan with the least estimated
cost.
• The space of plans considered by a typical relational query optimizer can be understood by recognizing
that a query is essentially treated as a σ-π-× algebra expression, with the remaining operations carried
out on the result of the σ-π-× expression.
[Fig(2): Query → Query Parser → Parsed query → Query Optimizer → Evaluation plan]
[Fig(3): relational algebra tree — π sname over σ bid=100 ∧ rating>5 over (Reserves ⋈ sid=sid Sailors)]
Fig(3). Query Expressed as a Relational Algebra Tree
The algebra expression partially specifies how to evaluate the query – we first compute the natural join of
Reserves and Sailors, then perform the selections and finally project the sname field.
To obtain a fully specified evaluation plan, we must decide on an implementation for each of the algebra
operations involved. For example, we can use a page-oriented simple nested loops join with Reserves as the outer
table and apply selections and projections to each tuple in the result of the join as it is produced; the result of
the join before the selections and projections is never stored in its entirety. This query evaluation plan is
shown in Fig.(4)
[Fig(4): π sname and the selections applied on the fly to the output of the nested loops join of Reserves and Sailors]
[Fig(5): a plan joining tables A, B and C — result tuples of the first join (A ⋈ B) are pipelined into the join with C]
In Fig(5) both joins can be evaluated in pipelined fashion using some version of a nested loops join.
Conceptually, the evaluation is initiated from the root, and the node joining A and B produces tuples as
and when they are requested by its parent node. When the root node gets a page of tuples from its left child, all
the matching inner tuples are retrieved and joined with the matching outer tuples; the current page of outer tuples
is then discarded. A request is then made to the left child for the next page of tuples, and the process is
repeated. Pipelined evaluation is thus a control strategy governing the rate at which different joins in the plan
proceed. It has the great virtue of not writing the result of intermediate joins to a
temporary file, because the results are produced, consumed and discarded one page at a time.
A query evaluation plan is a tree of relational operators and is executed by calling the operators in
some (possibly interleaved) order. Each operator has one or more inputs and an output, which are
also nodes in the plan, and tuples must be passed between operators according to the plan's tree
structure.
To simplify the code responsible for coordinating the execution of a plan, the relational operators that
form the nodes of a plan tree (which is to be evaluated using pipelining) typically support a uniform
iterator interface, hiding the internal implementation details of each operator. The iterator
interface for an operator includes the functions open, get_next, and close. The open function
initializes the state of the iterator by allocating buffers for its inputs and output, and is also used to
pass in arguments such as selection conditions that modify the behavior of the operator. The code
for the get_next function calls the get_next function on each input node and calls operator-specific
code to process the input tuples. The output tuples generated by the processing are placed in the
output buffer of the operator, and the state of the iterator is updated to keep track of how much input
has been consumed. When all output tuples have been produced through repeated calls to get_next, the
close function is called (by the code that initiated execution of this operator) to deallocate state
information.
The iterator interface supports pipelining of results naturally: the decision to pipeline or materialize
input tuples is encapsulated in the operator-specific code that processes input tuples. If the
algorithm implemented for the operator allows input tuples to be processed completely when
they are received, input tuples are not materialized and the evaluation is pipelined. If the algorithm
examines the same input tuples several times, they are materialized. This decision, like other details
of the operator's implementation, is hidden by the iterator interface for the operator.
The iterator interface is also used to encapsulate access methods such as B+ trees and hash-based
indexes. Externally, access methods can be viewed simply as operators that produce a stream of
output tuples. In this case, the open function can be used to pass the selection conditions that match
the access path.
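A toy sketch of this iterator interface in Python: the method names open, get_next and close follow the text, while the operators, tuple format and predicate are invented for illustration.

```python
class Scan:
    """Leaf operator: produces tuples from an in-memory 'table'."""
    def __init__(self, table):
        self.table = table
    def open(self):
        self.pos = 0                       # initialize iterator state
    def get_next(self):
        if self.pos >= len(self.table):
            return None                    # end of stream
        self.pos += 1
        return self.table[self.pos - 1]
    def close(self):
        self.pos = None                    # deallocate state

class Select:
    """Pipelined selection: open() passes in the condition as an argument."""
    def __init__(self, child):
        self.child = child
    def open(self, predicate):
        self.predicate = predicate
        self.child.open()
    def get_next(self):
        # Calls get_next on the input node and applies operator-specific
        # code; tuples are consumed as produced, so nothing is materialized.
        while (t := self.child.get_next()) is not None:
            if self.predicate(t):
                return t
        return None
    def close(self):
        self.child.close()

plan = Select(Scan([(1, 9), (2, 3), (3, 7)]))
plan.open(lambda t: t[1] > 5)
out = []
while (t := plan.get_next()) is not None:
    out.append(t)
plan.close()
print(out)  # → [(1, 9), (3, 7)]
```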
Pushing Selections:
A join is a relatively expensive operation, and a good heuristic is to reduce the sizes of the tables to be
joined as much as possible. One approach is to apply selections early; if a selection operator appears after
a join operator, it is worth examining whether the selection can be 'pushed' ahead of the join. As an
example, the selection bid=100 involves only the attributes of Reserves and can be applied to Reserves
before the join. Similarly, the selection rating>5 involves only attributes of Sailors and can be applied to
Sailors before the join. Let us suppose that the selections are performed using simple file scans, that the
result of each selection is written to a temporary table on disk, and that the temporary tables are then
joined using a sort-merge join. The resulting query evaluation plan is shown in Figure(5).
[Figure(5): π sname applied on the fly to a sort-merge join of T1 and T2, where T1 = σ bid=100 (Reserves) and T2 = σ rating>5 (Sailors), each computed by a scan and written to a temporary table]
EXTERNAL SORTING
Why Sort?
Sorting a collection of records on some search key is a very useful operation. The key can be a single
attribute or an ordered list of attributes. Sorting is required in a variety of situations, including the following
important ones:
• Users may want answers in some order; for example, by increasing age.
• A widely used algorithm for performing a very important relational algebra operation, called join,
requires a sorting step.
Although main memory sizes are growing rapidly, the ubiquity of database systems has led to increasingly
larger datasets as well. When the data to be sorted is too large to fit into available main memory, we need an
external sorting algorithm. Such algorithms seek to minimize the cost of disk accesses.
We begin by presenting a simple algorithm to illustrate the idea behind external sorting. This algorithm
utilizes only three pages of main memory, and it is presented only for pedagogical purposes. When sorting a
file, several sorted subfiles are typically generated in intermediate steps. Here we refer to each subfile as a
run.
Even if the entire file does not fit into the available main memory, we can sort it by breaking it into smaller
subfiles, sorting these subfiles and then merging them using a minimal amount of main memory at any given
time. In the first pass, the pages in the file are read in one at a time. After a page is read in, the records on it
are sorted and the sorted page is written out. Quicksort or any other in-memory sorting technique can be used
to sort the records on a page. In subsequent passes, pairs of runs from the output of the previous pass are read
in and merged to produce runs that are twice as long. This algorithm is shown in Fig(1).
[Fig(1): pseudocode for two-way external merge sort — Pass 0 sorts each page; in each pass i=1,2,…, pairs of runs from the previous pass are read in and merged]
If the number of pages in the input file is 2^k, for some k, then:
In each pass, we read every page in the file, process it and write it out. Therefore we have two disk I/Os per
page, per pass. The number of passes is ⌈log2 N⌉ + 1, where N is the number of pages in the file. The overall
cost is 2N(⌈log2 N⌉ + 1) I/Os.
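The two-way algorithm can be sketched in Python, with lists standing in for disk pages (page-at-a-time buffering and actual I/O are glossed over; the seven input pages are the ones used in the worked example that follows):

```python
from heapq import merge

def two_way_extsort(pages):
    """Toy two-way external merge sort; returns the sorted run and pass count."""
    runs = [sorted(p) for p in pages]            # Pass 0: sort each page
    passes = 1
    while len(runs) > 1:                         # Pass i: merge pairs of runs
        runs = [list(merge(runs[k], runs[k + 1] if k + 1 < len(runs) else []))
                for k in range(0, len(runs), 2)]
        passes += 1
    return runs[0], passes

run, passes = two_way_extsort([[3, 4], [6, 2], [9, 4], [8, 7], [5, 6], [3, 1], [2]])
print(passes)  # → 4
print(run)     # → [1, 2, 2, 3, 3, 4, 4, 5, 6, 6, 7, 8, 9]
```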
[Fig(2): two-way merge sort of a seven-page file — Pass 0 produces seven sorted one-page runs; Pass 1 produces 2-page runs, Pass 2 produces 4-page runs, and Pass 3 produces the single sorted 8-page run 1,2, 2,3, 3,4, 4,5, 6,6, 7,8, 9]
The sort takes four passes, and in each pass, we read and write seven pages, for a total of 56 I/Os. This
result agrees with the preceding analysis because 2·7·(⌈log2 7⌉+1) = 56. The dark pages in the figure illustrate
what would happen on a file of eight pages; the number of passes remains at four (⌈log2 8⌉+1 = 4), but we read
and write an additional page in each pass for a total of 64 I/Os.
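Checking the cost formula against both file sizes:

```python
from math import ceil, log2

def two_way_cost(n_pages):
    # Two I/Os per page per pass; ceil(log2 N) + 1 passes in total.
    return 2 * n_pages * (ceil(log2(n_pages)) + 1)

print(two_way_cost(7), two_way_cost(8))  # → 56 64
```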
The algorithm requires just three buffer pages in main memory, as fig(3) illustrates. This
observation raises an important point: even if we have more buffer space available, this simple algorithm
does not utilize it effectively.
[Fig(3): two-way merge sort with three buffer pages — two input buffers and one output buffer between disk and main memory]
1. In pass 0, read in B pages at a time and sort internally to produce ⌈N/B⌉ runs of B pages each (except
for the last run, which may contain fewer pages). This modification is illustrated in fig(4), using the
input from fig(2) and a buffer pool with four pages.
2. In passes i=1,2,… use B-1 buffer pages for input and use the remaining page for output; hence, you do
a (B-1)-way merge in each pass. The utilization of buffer pages in the merging passes is illustrated in
fig(5).
[Fig(4): Pass 0 of external merge sort on the seven-page input file with a buffer pool of B=4 pages — each group of four input pages is read in, sorted in memory, and written out as one output run]
[Fig(5): a merging pass — B-1 input buffers (Input 1 … Input B-1) and one output buffer, with pages streamed between disk and the buffer pool]
The first refinement reduces the number of runs produced by Pass 0 to N1 = ⌈N/B⌉, versus N for the two-way merge. The
second refinement is even more important. By doing a (B-1)-way merge, the number of passes is reduced dramatically:
including the initial pass, it becomes ⌈log B-1 N1⌉ + 1 versus ⌈log2 N⌉ + 1 for the two-way merge algorithm presented earlier.
Because B is typically quite large, the savings can be substantial. The external merge sort algorithm is shown in Fig(6).
[Fig(6): pseudocode for external merge sort — Pass 0 reads and sorts B pages at a time; each pass i=1,2,… performs a (B-1)-way merge of runs from the previous pass]
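A toy version of the full algorithm, with lists standing in for pages (real disk I/O and page-at-a-time buffering are elided; assumes B >= 3):

```python
from heapq import merge

def external_merge_sort(pages, B):
    """Toy external merge sort with B buffer pages (assumes B >= 3)."""
    # Pass 0: read B pages at a time, sort internally -> ceil(N/B) runs.
    runs = [sorted(sum(pages[k:k + B], [])) for k in range(0, len(pages), B)]
    passes = 1
    while len(runs) > 1:
        # Passes 1, 2, ...: (B-1)-way merge (B-1 input buffers, 1 output).
        runs = [list(merge(*runs[k:k + B - 1]))
                for k in range(0, len(runs), B - 1)]
        passes += 1
    return runs[0], passes

# The seven-page example file with B=4: Pass 0 makes ceil(7/4)=2 runs and
# Pass 1 merges them, so only two passes are needed (versus four for two-way).
run, passes = external_merge_sort(
    [[3, 4], [6, 2], [9, 4], [8, 7], [5, 6], [3, 1], [2]], B=4)
print(passes)  # → 2
```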
As an example, suppose that we have five buffer pages available and want to sort a file with 108 pages.
Pass 0 produces ⌈108/5⌉ = 22 sorted runs of five pages each, except for the last run, which is only three pages
long.
Pass 1 does a four-way merge to produce ⌈22/4⌉ = six sorted runs of 20 pages each, except for the last run, which is
only eight pages long.
Pass 2 produces ⌈6/4⌉ = two sorted runs; one with 80 pages and one with 28 pages.
Pass 3 merges the two runs produced in Pass 2 to produce the sorted file.
In each pass we read and write 108 pages; thus the total cost is 2*108*4 = 864 I/Os. Applying our formula, we
have N1 = ⌈108/5⌉ = 22 and cost = 2*N*(⌈log4 22⌉+1) = 2*108*(3+1) = 864, as expected.
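The pass count can also be computed directly; using ceiling division per merging pass sidesteps floating-point log altogether:

```python
from math import ceil

def extsort_passes(N, B):
    runs = ceil(N / B)            # N1: runs produced by Pass 0
    passes = 1
    while runs > 1:               # each subsequent pass is a (B-1)-way merge
        runs = ceil(runs / (B - 1))
        passes += 1
    return passes

p = extsort_passes(108, 5)
print(p, 2 * 108 * p)  # → 4 864
```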
To emphasize the potential gains in using all available buffers, in fig(7), we show the number of passes,
computed using our formula, for several values of N and B. To obtain the cost, the number of passes should
be multiplied by 2N. In practice, one would expect to have more than 257 buffers, but this table illustrates the
importance of a high fan-in during merging.
Fig(7): Number of passes of external merge sort

N            B=3   B=5   B=9   B=17   B=129   B=257
100            7     4     3     2      1       1
1000          10     5     4     3      2       2
10000         13     7     5     4      2       2
100000        17     9     6     5      3       3
1000000       20    10     7     5      3       3
10000000      23    12     8     6      4       3
100000000     26    14     9     7      4       4
1000000000    30    15    10     8      5       4