Index 1
(Figure labels: SQL Commands, Query Evaluation Engine)
Learning Objectives
Data Storage
How is data physically stored?
How does data storage affect performance?
Indexing
Index types
How are they used?
How are they maintained?
How do they improve data processing performance?
Alternative File Organizations
Book Index
A database index is similar to a book index!
Book index: lists important terms in alphabetical order, each with the page number(s) where the term appears
Searching in a book:
1. Search the book index for a term to find a list of addresses (i.e., page numbers)
2. Go to those pages to find the term in the text
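The book analogy can be sketched in a few lines of Python. This is only an illustration: the page texts, the term list, and the helper names `build_book_index` and `find_pages` are made up for the example.

```python
# Hypothetical "book": page number -> text on that page (made-up data).
pages = {
    98: "a heap file stores records in no particular order",
    101: "an index speeds up search on a key",
    102: "each index entry points to a page",
    230: "a sorted file keeps records ordered by key",
}

def build_book_index(pages, terms):
    """Map each term to the sorted list of pages whose text mentions it."""
    index = {}
    for term in sorted(terms):  # terms are listed in alphabetical order
        hits = [no for no, text in sorted(pages.items()) if term in text]
        if hits:
            index[term] = hits
    return index

def find_pages(index, term):
    """Step 1 of the search: look the term up and get its addresses."""
    return index.get(term, [])

book_index = build_book_index(pages, ["index", "heap file", "sorted file"])
print(find_pages(book_index, "index"))
```

The lookup touches only the small index; the reader then goes straight to the listed pages instead of scanning the whole book.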
Indexes
Example index on GPA
Index Alternatives
Alternative 1: the data entry is the data record itself
If this is used, the index structure is a file organization for data records (instead of a heap file or sorted file)
At most one index on a given collection of data records can use Alternative 1 (otherwise, data records are duplicated, leading to redundant storage and potential inconsistency)
Index Alternatives (cont.)
Alternatives 2 and 3: the data entry is <k, rid> (Alternative 2) or <k, list of rids> (Alternative 3)
Data entries are typically much smaller than data records, so these are better than Alternative 1 with large data records (the portion of the index structure used to direct the search, which depends on the size of the data entries, is much smaller than with Alternative 1)
Alternative 3 is more compact than Alternative 2, but leads to variable-sized data entries even if search keys are of fixed length
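A small Python sketch may help compare the three alternatives. It assumes the usual textbook formulation (Alternative 1: the data record itself; Alternative 2: a <key, rid> pair; Alternative 3: a <key, list of rids> pair), uses GPA as the search key, and treats a record's position in a list as its rid. The records and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical student records (name, gpa); the rid is the list position.
records = [("Ann", 3.2), ("Bob", 3.8), ("Cat", 3.2)]

def alt1_entries(records):
    """Alternative 1: each data entry is a full data record, ordered by key."""
    return sorted(records, key=lambda rec: rec[1])

def alt2_entries(records):
    """Alternative 2: one <key, rid> entry per data record."""
    return sorted((gpa, rid) for rid, (_, gpa) in enumerate(records))

def alt3_entries(records):
    """Alternative 3: one <key, [rids]> entry per distinct key value."""
    rids_by_key = defaultdict(list)
    for rid, (_, gpa) in enumerate(records):
        rids_by_key[gpa].append(rid)
    return sorted(rids_by_key.items())

print(alt1_entries(records))
print(alt2_entries(records))
print(alt3_entries(records))
```

In this sketch Alternative 3 stores the key 3.2 once, while Alternative 2 repeats it, and the Alternative 3 entries have different lengths even though every key has the same size.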
Index Classifications
Clustered vs. unclustered
(Figure labels: index entries direct the search for data entries)
Multi-level Indexes
Primary Index
Defined on an ordered data file
Primary Index
Example: entry 1 = <K(1), P(1)>
<K(1) = “Aaron, Ed”, P(1)= block 1>
Index File
<K(i), P(i)> entries: K(i) is a key value, P(i) is a pointer to a block of the data file
An index file might be
stored in multiple blocks!
Primary Index
One index entry for each block in the data file
The index entry has the key field value for the first
record in the block (block anchor)
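As a minimal sketch of this definition, the Python below builds one <K(i), P(i)> entry per block from a small ordered data file and uses binary search over the block anchors to find the block that may hold a key. Only "Aaron, Ed" in block 1 comes from the example above; the other names, the three-records-per-block layout, and the function names are made up.

```python
from bisect import bisect_right

# Hypothetical ordered data file: a list of blocks, records sorted by name.
data_file = [
    ["Aaron, Ed", "Abbott, Diane", "Acosta, Marc"],
    ["Adams, John", "Adams, Robin", "Akers, Jan"],
    ["Alexander, Ed", "Alfred, Bob", "Allen, Sam"],
]

def build_primary_index(blocks):
    """One entry per block: (block anchor, block number), blocks numbered from 1."""
    return [(block[0], block_no) for block_no, block in enumerate(blocks, start=1)]

def lookup(index, blocks, key):
    """Return the block number holding key, or None if the key is absent."""
    anchors = [anchor for anchor, _ in index]
    pos = bisect_right(anchors, key) - 1  # last entry whose anchor <= key
    if pos < 0:
        return None  # key sorts before the first block anchor
    block_no = index[pos][1]
    return block_no if key in blocks[block_no - 1] else None

primary_index = build_primary_index(data_file)
print(primary_index[0])
print(lookup(primary_index, data_file, "Adams, Robin"))
```

Only one data block is read per lookup here; the binary search itself runs over the much smaller list of anchors.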
“Create an index!”
But does a primary index make queries run faster?!
Create Index!
Advantages of a Primary Index
A primary index might be stored in multiple blocks, but it occupies much less space than the data file, because:
1. There are fewer index entries than records (one entry per block, not per record)
2. Each index entry is typically smaller than a data record
An index entry has only 2 fields, so more index entries than data records fit in a block
Binary search is more efficient on the index file!
Let size of data file = Bdata blocks
Let size of index file = Bindex blocks
Typically: Bindex << Bdata (much smaller)
Then: log2(Bindex) < log2(Bdata), so binary search on the index touches fewer blocks than binary search on the data file
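To put numbers on this comparison, the sketch below works through one hypothetical configuration. The block size, record count, record size, and index entry size are assumed values chosen for illustration, not figures from these slides. It also counts one extra block access after the index search, to read the data block the index entry points to.

```python
import math

def blocks_needed(num_items, item_size, block_size):
    """Blocks required when items are packed whole into fixed-size blocks."""
    items_per_block = block_size // item_size  # blocking factor
    return math.ceil(num_items / items_per_block)

def binary_search_cost(num_blocks):
    """Worst-case block accesses for binary search over num_blocks blocks."""
    return math.ceil(math.log2(num_blocks))

# Assumed sizes, for illustration only.
BLOCK_SIZE = 1024        # bytes per block
NUM_RECORDS = 30_000     # records in the data file
RECORD_SIZE = 100        # bytes per data record
INDEX_ENTRY_SIZE = 15    # bytes per <K(i), P(i)> entry

b_data = blocks_needed(NUM_RECORDS, RECORD_SIZE, BLOCK_SIZE)
# A primary index has one entry per data block.
b_index = blocks_needed(b_data, INDEX_ENTRY_SIZE, BLOCK_SIZE)

cost_without_index = binary_search_cost(b_data)
cost_with_index = binary_search_cost(b_index) + 1  # +1 to read the data block

print(b_data, b_index, cost_without_index, cost_with_index)
```

Under these assumptions the data file takes 3000 blocks and the index 45, so a lookup drops from 12 block accesses to 7.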