Indexing
[Figure: a request for a stored record goes to the File Manager, which returns the stored record; the File Manager requests a stored block from the Disk Manager, which returns the stored block from the Stored Database.]
Unordered Files
Also called a heap or a pile file.
New records are inserted at the end of the file.
Searching for a record requires a linear search through
the file records.
On average this means reading and searching half the file
blocks, which is quite expensive.
Record insertion is quite efficient.
Reading the records in order of a particular field
requires sorting the file records.
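A minimal sketch of the heap-file behaviour described above, assuming a toy in-memory file made of fixed-capacity blocks. The class name `HeapFile`, the record layout, and the block size of four records are illustrative assumptions, not part of the original slides.

```python
from typing import List, Optional, Tuple

Record = Tuple[int, str]  # (key, payload); a hypothetical record layout


class HeapFile:
    """Toy unordered (heap) file: a list of fixed-capacity blocks."""

    def __init__(self, records_per_block: int = 4) -> None:
        self.records_per_block = records_per_block
        self.blocks: List[List[Record]] = []

    def insert(self, record: Record) -> None:
        # New records go at the end of the file, so only the last block is touched.
        if not self.blocks or len(self.blocks[-1]) == self.records_per_block:
            self.blocks.append([])
        self.blocks[-1].append(record)

    def search(self, key: int) -> Tuple[Optional[Record], int]:
        # Linear search: read blocks in file order until the key is found.
        # Returns the record (or None) and the number of blocks read.
        blocks_read = 0
        for block in self.blocks:
            blocks_read += 1
            for record in block:
                if record[0] == key:
                    return record, blocks_read
        return None, blocks_read


heap = HeapFile(records_per_block=4)
for key in [42, 7, 19, 3, 88, 61, 25, 14, 70, 5, 33, 90]:
    heap.insert((key, f"row-{key}"))

found, cost = heap.search(25)           # key stored in the second block
missing, full_scan = heap.search(1000)  # absent key forces a full scan
```

An unsuccessful search reads every block, while a successful one stops at the block holding the record, which is where the "half the file blocks on average" estimate comes from.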
Ordered Files
Also called a sequential file.
File records are kept sorted by the values of an ordering field.
Insertion is expensive: records must be inserted in the correct
order.
It is common to keep a separate unordered overflow (or
[Figure: multitable clustering of department and instructor]
Multitable Clustering File Organization (cont.)
integrity constraints
Why is it important?
How Does a DBMS Access Data?
The operations read, modify, update, and
delete are used to access data in the
database.
Location Mechanism
The location mechanism facilitates finding the index entry for S.
[Figure labels: S, index entries, block]
Search for K: find the index entry with the largest key ≤ K
Sparse vs. Dense Index
Dense index: one index entry for each data
record
An unclustered index must be dense
A clustered index need not be dense
Sparse index: one index entry for each block
of the data file
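The difference between the two kinds of index can be sketched as follows. This is an illustrative example only: the sample records, the block size of two, and the function names `dense_lookup` and `sparse_lookup` are assumptions. The sparse lookup uses the standard rule of taking the entry with the largest key not exceeding the search key, which only works because the data file is sorted on the indexed field.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

Record = Tuple[int, str]  # (Id, Name); hypothetical sample data

# Data file sorted on Id, split into blocks of two records each.
data_blocks: List[List[Record]] = [
    [(10, "Ann"), (20, "Bob")],
    [(30, "Cy"), (40, "Dee")],
    [(50, "Eve"), (60, "Fay")],
]

# Dense index: one entry per data record -> (block number, slot in block).
dense_index: Dict[int, Tuple[int, int]] = {
    rec[0]: (b, s)
    for b, block in enumerate(data_blocks)
    for s, rec in enumerate(block)
}

# Sparse index: one entry per block, holding the first key of that block.
sparse_keys: List[int] = [block[0][0] for block in data_blocks]


def dense_lookup(key: int) -> Optional[Record]:
    # The index entry points straight at the record.
    loc = dense_index.get(key)
    if loc is None:
        return None
    b, s = loc
    return data_blocks[b][s]


def sparse_lookup(key: int) -> Optional[Record]:
    # Pick the entry with the largest key <= the search key, then scan that block.
    pos = bisect_right(sparse_keys, key) - 1
    if pos < 0:
        return None
    for rec in data_blocks[pos]:
        if rec[0] == key:
            return rec
    return None
```

With six records in three blocks, the dense index holds six entries and the sparse index only three, at the cost of a short scan inside the selected block.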
[Figure: Sparse vs. dense index. A data file with columns Id, Name, Dept, sorted on Id; a sparse, clustered index sorted on Id; a dense, unclustered index sorted on Name.]
Clustered vs. Unclustered Index
[Figure: Index Block 0 and Index Block 1 alongside Data Block 0 and Data Block 1]
CIS552 Indexing and Hashing 50
Secondary indexes
SELECT name, address
FROM MovieStar
WHERE birthdate = DATE '1952-01-01';
CREATE INDEX BDIndex ON MovieStar(birthdate);
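The statements above can be tried out with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the column types, the sample rows, and the choice of SQLite are not from the slides, and because SQLite has no `DATE '...'` literal the birthdate is stored and compared as an ISO-format string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed schema: only the three columns used by the query on the slide.
conn.execute("CREATE TABLE MovieStar (name TEXT, address TEXT, birthdate TEXT)")
conn.executemany(
    "INSERT INTO MovieStar VALUES (?, ?, ?)",
    [
        ("Star A", "1 Example St", "1952-01-01"),
        ("Star B", "2 Example St", "1960-05-17"),
        ("Star C", "3 Example St", "1952-01-01"),
    ],
)

# Secondary index on a non-ordering attribute, as on the slide.
conn.execute("CREATE INDEX BDIndex ON MovieStar(birthdate)")

query = "SELECT name, address FROM MovieStar WHERE birthdate = ?"
rows = conn.execute(query, ("1952-01-01",)).fetchall()

# The query plan shows whether the engine chose to use the index.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("1952-01-01",)).fetchall()
```

Printing `plan` shows how the engine intends to evaluate the query; whether it mentions `BDIndex` depends on the optimizer's choice.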
[Figure: secondary index entries (keys 10 to 60, with duplicates) pointing to records scattered across data blocks]
Pointers in one index block may refer to
multiple data blocks
This results in more disk I/Os
This problem is unavoidable
Use a ‘bucket file’ between the index file and the data
file
The index holds a single entry <k,p> for each value k, where p
points to the location in the bucket file that contains all
the pointers to records with value k
This avoids wasting space by storing the same value k
multiple times
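The bucket-file indirection can be sketched as below. This is an illustrative model, not the layout used by any particular system: the sample data, the `(block, slot)` pointer format, and the names `build_bucket_index` and `lookup` are assumptions. Each index entry `(k, p)` stores only the offset `p` where the run of pointers for `k` starts; the run is assumed to end where the next key's run begins.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple

RecordPtr = Tuple[int, int]  # (data block number, slot in block)

# Hypothetical data file, not sorted on the indexed attribute.
data_file: Dict[RecordPtr, int] = {
    (0, 0): 20, (0, 1): 10,
    (1, 0): 20, (1, 1): 30,
    (2, 0): 10, (2, 1): 20,
}


def build_bucket_index(
    data: Dict[RecordPtr, int],
) -> Tuple[List[Tuple[int, int]], List[RecordPtr]]:
    # Group record pointers by key value.
    grouped: Dict[int, List[RecordPtr]] = {}
    for ptr, key in sorted(data.items()):
        grouped.setdefault(key, []).append(ptr)
    # Lay the groups out contiguously in the bucket file, one index entry per key.
    index: List[Tuple[int, int]] = []
    bucket_file: List[RecordPtr] = []
    for key in sorted(grouped):
        index.append((key, len(bucket_file)))  # the <k, p> entry
        bucket_file.extend(grouped[key])
    return index, bucket_file


def lookup(
    index: List[Tuple[int, int]], bucket_file: List[RecordPtr], key: int
) -> List[RecordPtr]:
    keys = [k for k, _ in index]
    i = bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        return []
    start = index[i][1]
    # The run ends where the next key's run starts (or at the end of the file).
    end = index[i + 1][1] if i + 1 < len(index) else len(bucket_file)
    return bucket_file[start:end]


index, bucket_file = build_bucket_index(data_file)
pointers_for_20 = lookup(index, bucket_file, 20)
```

Here the value 20 occurs three times in the data file but appears only once in the index, which is the space saving the slide describes.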
Definition of Bucket
[Figure: index file with key values 10 through 60 pointing through buckets to data records]
Disney 1995
value in an attribute
Range searches – records with an attribute
Primary and Secondary Indices
disk
Block fetch requires about 5 to 10 milliseconds,