Lesson 9 Mod2l2
Module 2, Lecture 2
How index-learning turns no student pale
Yet holds the eel of science by the tail.
-- Alexander Pope (1688-1744)
Database Management Systems, R. Ramakrishnan
Sorted Files:
Files compacted after deletions.
Selections on sort field(s).
Hashed Files:
No overflow buckets, 80% page occupancy.
Cost of Operations
(B: number of data pages; D: average time to read or write a disk page)

                   Heap File     Sorted File                            Hashed File
Scan all records   BD            BD                                     1.25 BD
Equality Search    0.5 BD        D log2 B                               D
Range Search       BD            D (log2 B + # of pages with matches)   1.25 BD
Insert             2D            Search + BD                            2D
Delete             Search + D    Search + BD                            2D
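The cost formulas for the three file organizations can be evaluated for concrete values. The sketch below is illustrative only: the function name `operation_costs` and the parameter `matching_pages` are assumptions, B is taken to be the number of data pages and D the time per page I/O, and the equality-search costs for heap and hashed files (0.5 BD and D) follow the usual textbook average-case assumptions.

```python
import math

def operation_costs(B: int, D: float, matching_pages: int = 1) -> dict:
    """Estimate I/O cost per operation for each file organization.

    B: number of data pages; D: time to read or write one page.
    matching_pages: pages holding matches for a range search on a sorted file.
    """
    log_b = math.log2(B)
    search_heap = 0.5 * B * D      # assumption: on average, half the file is scanned
    search_sorted = D * log_b      # binary search over the pages
    return {
        "heap": {
            "scan": B * D,
            "equality": search_heap,
            "range": B * D,
            "insert": 2 * D,                   # read last page, write it back
            "delete": search_heap + D,         # Search + D
        },
        "sorted": {
            "scan": B * D,
            "equality": search_sorted,
            "range": D * (log_b + matching_pages),
            "insert": search_sorted + B * D,   # Search + BD
            "delete": search_sorted + B * D,   # Search + BD
        },
        "hashed": {
            "scan": 1.25 * B * D,              # 80% occupancy means 1.25x the pages
            "equality": D,                     # assumption: no overflow buckets
            "range": 1.25 * B * D,
            "insert": 2 * D,
            "delete": 2 * D,
        },
    }

costs = operation_costs(B=1024, D=0.015, matching_pages=4)
for organization, ops in costs.items():
    print(organization, {op: round(c, 3) for op, c in ops.items()})
```

Plugging in different values of B shows how quickly the scan-based entries grow compared with the logarithmic and constant ones.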
Indexes
Three alternatives for a data entry with search key value k:
1. Data record with key value k
2. <k, rid of data record with search key value k>
3. <k, list of rids of data records with search key k>
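As a rough illustration of the three alternatives, the sketch below builds the data entries for an index on age over a handful of records. The records, the (page, slot) form of a rid, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (rid, name, age), where rid = (page_no, slot_no).
records = [
    ((0, 0), "bob", 12),
    ((0, 1), "cal", 11),
    ((1, 0), "joe", 12),
    ((1, 1), "sue", 13),
]

def alt1_entries(recs):
    # Alternative 1: the data entry is the data record itself, ordered by key.
    return sorted(((name, age) for _rid, name, age in recs), key=lambda r: r[1])

def alt2_entries(recs):
    # Alternative 2: one <k, rid> pair per data record.
    return sorted((age, rid) for rid, _name, age in recs)

def alt3_entries(recs):
    # Alternative 3: one <k, list of rids> entry per distinct key value.
    by_key = defaultdict(list)
    for rid, _name, age in recs:
        by_key[age].append(rid)
    return sorted(by_key.items())

print(alt1_entries(records))
print(alt2_entries(records))
print(alt3_entries(records))
```

Note that Alternative 3 yields three entries for four records here, because the two records with age 12 share a single entry.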
Alternative 1:
If this is used, index structure is a file organization
for data records (like Heap files or sorted files).
At most one index on a given collection of data
records can use Alternative 1. (Otherwise, data
records duplicated, leading to redundant storage
and potential inconsistency.)
If data records are very large, # of pages containing
data entries is high. This typically implies that the
auxiliary information in the index is also large.
Alternatives 2 and 3:
Data entries typically much smaller than data
records. So, better than Alternative 1 with large
data records, especially if search keys are small.
(Portion of index structure used to direct search is
much smaller than with Alternative 1.)
If more than one index is required on a given file, at
most one index can use Alternative 1; rest must use
Alternatives 2 or 3.
Alternative 3 more compact than Alternative 2, but
leads to variable sized data entries even if search
keys are of fixed length.
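The size trade-off between Alternatives 2 and 3 can be sketched with a simple byte count. The byte sizes below (4-byte key, 8-byte rid, 2-byte rid counter) are assumptions chosen only for illustration, as are the function names.

```python
from collections import Counter

KEY_BYTES = 4    # assumed fixed-length search key
RID_BYTES = 8    # assumed rid size (page id plus slot number)
COUNT_BYTES = 2  # assumed per-entry rid counter needed by Alternative 3

def alt2_sizes(keys):
    # Alternative 2: the key is repeated for every record; entries are fixed size.
    return [KEY_BYTES + RID_BYTES for _ in keys]

def alt3_sizes(keys):
    # Alternative 3: the key is stored once per distinct value,
    # followed by a variable-length list of rids.
    counts = Counter(keys)
    return [KEY_BYTES + COUNT_BYTES + n * RID_BYTES for _k, n in sorted(counts.items())]

keys = [11, 12, 12, 12, 12, 13, 13]  # hypothetical key values with duplicates
print(sum(alt2_sizes(keys)), alt2_sizes(keys))
print(sum(alt3_sizes(keys)), alt3_sizes(keys))
```

Under these assumptions the saving comes entirely from duplicate key values; the entry sizes for Alternative 3 differ from key to key even though every key has the same length.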
Index Classification
Figure: CLUSTERED vs. UNCLUSTERED index. Index entries direct the search for data entries (index file); data entries refer to the data records (data file).
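A small simulation can make the clustered/unclustered distinction concrete, assuming the usual definition that an index is clustered when the data records are stored in (roughly) the same order as the data entries. The file layout, page capacity, and the helper `pages_touched` are hypothetical.

```python
import random

def pages_touched(keys_in_file_order, lo, hi, recs_per_page):
    """Count distinct data pages holding records with lo <= key <= hi."""
    pages = {
        pos // recs_per_page
        for pos, key in enumerate(keys_in_file_order)
        if lo <= key <= hi
    }
    return len(pages)

keys = list(range(1000))

# Clustered: data records are stored in search-key order.
clustered_file = sorted(keys)

# Unclustered: data records are stored in an order unrelated to the key.
rng = random.Random(0)
unclustered_file = keys[:]
rng.shuffle(unclustered_file)

clustered_io = pages_touched(clustered_file, 100, 199, recs_per_page=10)
unclustered_io = pages_touched(unclustered_file, 100, 199, recs_per_page=10)
print(clustered_io, unclustered_io)
```

With the clustered layout the 100 matching records sit on 10 adjacent pages; with the shuffled layout they are scattered, so a range search may fetch close to one page per matching record.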
Figure: Data file with a sparse index on Name (entries Ashby, Cass, Smith) and a dense index on Age (entries include 25, 30, 33, 40, 44, 44).
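The difference between a sparse and a dense index can be sketched as follows, assuming a sparse index keeps one entry per data page and a dense index keeps one entry per record. Only the names Ashby, Cass, and Smith come from the figure; the remaining records, the page layout, and the function names are hypothetical filler.

```python
from bisect import bisect_right

# Hypothetical data file: pages of (name, age) records, sorted by name.
pages = [
    [("Ashby", 25), ("Basu", 33), ("Bristow", 30)],
    [("Cass", 50), ("Daniels", 22), ("Jones", 40)],
    [("Smith", 44), ("Tracy", 44)],
]

# Sparse index on name: one entry per page (the first name on that page).
sparse_index = [page[0][0] for page in pages]

# Dense index on age: one <age, rid> entry per record, sorted by age.
dense_index = sorted(
    (age, (p, s))
    for p, page in enumerate(pages)
    for s, (_name, age) in enumerate(page)
)

def sparse_lookup(name):
    # Find the last page whose first name is <= the target, then scan that page.
    p = bisect_right(sparse_index, name) - 1
    if p < 0:
        return None
    for rec in pages[p]:
        if rec[0] == name:
            return rec
    return None

def dense_lookup(age):
    # Every record has its own entry, so matches are read off the index directly.
    return [pages[p][s] for a, (p, s) in dense_index if a == age]

print(sparse_index)
print(sparse_lookup("Jones"))
print(dense_lookup(44))
```

The sparse index only works because the file is sorted on name; the dense index on age makes no such demand on the file order.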
Data records sorted by name:

name   age   sal
bob    12    10
cal    11    80
joe    12    20
sue    13    75

Data entries in each index, sorted by search key:

<age, sal>   11,80   12,10   12,20   13,75
<sal, age>   10,12   20,12   75,13   80,11
<age>        11      12      12      13
<sal>        10      20      75      80
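The four indexes over the example records (bob, cal, joe, sue) can be reproduced by sorting on the corresponding key. The sketch below is illustrative; the helper `range_on_prefix` is a hypothetical name showing how a composite key such as <age, sal> supports a range search on its first field.

```python
from bisect import bisect_left, bisect_right

# Data records (name, age, sal), sorted by name.
records = [("bob", 12, 10), ("cal", 11, 80), ("joe", 12, 20), ("sue", 13, 75)]

# Data entries of each index, sorted by the index's search key.
age_sal = sorted((age, sal) for _n, age, sal in records)
sal_age = sorted((sal, age) for _n, age, sal in records)
age_only = sorted(age for _n, age, _s in records)
sal_only = sorted(sal for _n, _a, sal in records)

def range_on_prefix(entries, lo, hi):
    """Return composite entries whose first field lies in [lo, hi]."""
    start = bisect_left(entries, (lo,))                 # (lo,) sorts before any (lo, x)
    end = bisect_right(entries, (hi, float("inf")))     # (hi, inf) sorts after any (hi, x)
    return entries[start:end]

print(age_sal)
print(sal_age)
print(range_on_prefix(age_sal, 12, 12))
```

The order of fields in a composite key matters: <age, sal> groups entries with equal age together, whereas <sal, age> is ordered by salary first and so does not help a search on age alone.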
Summary
Summary (Contd.)