7 Indexing
Contents:
• Single-Level Ordered Indexes
• Multi-Level Indexes
• B+ Tree based Indexes
• Index Definition in SQL
Basic Concepts
• Index files are typically much smaller than the original file
because only the values for search key and pointer are stored.
• There are two basic types of indexes:
– Ordered indexes: Search keys are stored in a sorted order
(main focus here in class).
– Hash indexes: Search keys are distributed uniformly across
“buckets” using a hash function.
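The two index types above can be contrasted with a minimal sketch. The record list, the bucket count, and all function names here are assumptions made for illustration; list positions stand in for record pointers.

```python
import bisect

# Hypothetical data file: (search_key, payload) records in arbitrary order.
# A record's list position plays the role of its pointer.
records = [("Davis", 1), ("Austin", 2), ("Boston", 3), ("Chicago", 4)]

# Ordered index: (search key, pointer) pairs kept sorted by search key.
ordered_index = sorted((key, pos) for pos, (key, _) in enumerate(records))
ordered_keys = [key for key, _ in ordered_index]

def ordered_lookup(key):
    """Binary search over the sorted keys; return the record or None."""
    i = bisect.bisect_left(ordered_keys, key)
    if i < len(ordered_keys) and ordered_keys[i] == key:
        return records[ordered_index[i][1]]
    return None

def ordered_range(low, high):
    """All records with low <= key <= high, returned in key order."""
    lo = bisect.bisect_left(ordered_keys, low)
    hi = bisect.bisect_right(ordered_keys, high)
    return [records[pos] for _, pos in ordered_index[lo:hi]]

# Hash index: a hash function assigns each search key to one bucket.
NUM_BUCKETS = 4  # assumed bucket count
buckets = [[] for _ in range(NUM_BUCKETS)]
for pos, (key, _) in enumerate(records):
    buckets[hash(key) % NUM_BUCKETS].append((key, pos))

def hash_lookup(key):
    """Scan only the bucket the key hashes to; return the record or None."""
    for k, pos in buckets[hash(key) % NUM_BUCKETS]:
        if k == key:
            return records[pos]
    return None
```

Both structures store only search keys and pointers, not the records themselves. The ordered index supports range queries directly, while the hash index in this sketch only answers equality lookups.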
7. Indexes
ECS-165A 124
Secondary Indexes
Multi-Level Index
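[Figure: multi-level index. An outer index points to the blocks of an inner index (index block 0, index block 1, ...), whose entries point to the data blocks (data block 0, data block 1, ...) of the record file.]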
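A two-level lookup through an outer and an inner index might be sketched as follows. The block sizes, the key values, and the helper names are assumptions, and both index levels are taken to be sparse (one entry per block below).

```python
import bisect

# Hypothetical record file: sorted search keys in data blocks of 3 records.
data_blocks = [[1, 4, 7], [10, 13, 16], [19, 22, 25], [28, 31, 34]]

# Inner index: one (first key, data block number) entry per data block,
# itself stored in index blocks of 2 entries each.
inner_entries = [(blk[0], i) for i, blk in enumerate(data_blocks)]
inner_blocks = [inner_entries[i:i + 2] for i in range(0, len(inner_entries), 2)]

# Outer index: one (first key, inner index block number) entry per inner block.
outer_index = [(blk[0][0], i) for i, blk in enumerate(inner_blocks)]

def last_entry_at_most(entries, key):
    """Pointer of the last entry whose key is <= key (first entry if none)."""
    keys = [k for k, _ in entries]
    i = bisect.bisect_right(keys, key) - 1
    return entries[max(i, 0)][1]

def lookup(key):
    """Return the data block number holding key, or None if key is absent."""
    inner_no = last_entry_at_most(outer_index, key)          # read outer index
    data_no = last_entry_at_most(inner_blocks[inner_no], key)  # read one inner block
    return data_no if key in data_blocks[data_no] else None    # read one data block
```

Each lookup reads the outer index, one inner index block, and one data block, instead of searching the whole inner index.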
$P_1 \; K_1 \; P_2 \; \ldots \; P_{n-1} \; K_{n-1} \; P_n$
Example of a B+-Tree
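[Figure: example B+-tree. An inner node holds the search keys S25 and S70. The leaf nodes Li and Lj are linked through their Pn pointers. Leaf entries pair a search key with its TIDs, e.g. S51 (2 TIDs), S55 (n TIDs), S60 (2 TIDs).]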
• If Li, Lj are leaf nodes and i < j, Li’s search key values are
less than Lj’s search key values.
• Pn points to the next leaf node in search key order.
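Because the leaves are ordered and chained through Pn, a range of search keys can be read by walking the leaf level alone. The following is a minimal sketch: the `Leaf` class, the `scan_from` function, and all TID values are hypothetical, and only the key names S51, S55, S60 are borrowed from the example figure.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Leaf:
    keys: list                     # search key values, sorted within the leaf
    tids: list                     # tids[i] is the list of TIDs for keys[i]
    next: Optional["Leaf"] = None  # Pn: next leaf in search key order

def scan_from(leaf, low, high):
    """Collect the TIDs of all keys with low <= key <= high, starting at leaf."""
    out = []
    while leaf is not None:
        for key, tid_list in zip(leaf.keys, leaf.tids):
            if key > high:
                return out         # later leaves only hold larger keys
            if key >= low:
                out.extend(tid_list)
        leaf = leaf.next           # follow Pn to the next leaf
    return out

# Three chained leaves with made-up TIDs.
l1 = Leaf(["S10", "S25"], [[1], [2, 3]])
l2 = Leaf(["S51", "S55", "S60"], [[4, 5], [6, 7, 8], [9, 10]])
l3 = Leaf(["S70", "S80"], [[11], [12]])
l1.next, l2.next = l2, l3

result = scan_from(l1, "S25", "S55")
```

The early return relies on the ordering property stated above: once a key exceeds the upper bound, no later leaf can contain a qualifying key.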
Queries on B+-Trees
Find all records with a search key value of k
• Start with the root node
– Examine the node for the smallest search key value > k.
– If such a value exists, assume it is $K_i$. Then follow $P_i$ to
the child node.
– Otherwise, $k \ge K_{m-1}$, where there are m pointers in the node.
Then follow $P_m$ to the child node.
• Further comments:
– If there are V search key values in the file, the path from
the root to a leaf node is no longer than $\lceil \log_{\lceil n/2 \rceil}(V) \rceil$.
– In general a node has the same size as a disk block,
typically 4 KB, and $n \approx 100$ (40 bytes per index entry).
– With 1,000,000 search key values and n = 100, at
most $\lceil \log_{50}(1{,}000{,}000) \rceil = 4$ nodes are accessed in the
lookup!
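The descent rule and the path-length bound above can be sketched in a few lines. The `Node` class, the tiny example tree, and the TID values are assumptions made for illustration; the final step that searches the leaf for k is likewise an assumed completion of the lookup.

```python
import bisect

class Node:
    """Hypothetical B+-tree node: inner nodes have children, leaves have tids."""
    def __init__(self, keys, children=None, tids=None):
        self.keys = keys          # K_1 .. K_{m-1}, sorted
        self.children = children  # P_1 .. P_m (None for a leaf)
        self.tids = tids          # per-key TID lists (leaves only)

def find(root, k):
    """Return (TIDs for search key k, number of nodes accessed)."""
    node, accessed = root, 1
    while node.children is not None:
        # Index of the smallest key > k; if none exists this equals
        # len(keys), which selects the last pointer P_m.
        i = bisect.bisect_right(node.keys, k)
        node = node.children[i]
        accessed += 1
    i = bisect.bisect_left(node.keys, k)
    if i < len(node.keys) and node.keys[i] == k:
        return node.tids[i], accessed
    return [], accessed

def max_path_length(V, n):
    """ceil(log_{ceil(n/2)}(V)), computed with integers to avoid rounding."""
    base = (n + 1) // 2
    levels, reach = 0, 1
    while reach < V:
        reach *= base
        levels += 1
    return levels

# Small example tree with made-up keys and TIDs.
root = Node([25, 70], children=[
    Node([10, 20], tids=[[1], [2]]),
    Node([25, 51, 55, 60], tids=[[3], [4, 5], [6], [7, 8]]),
    Node([70, 80], tids=[[9], [10]]),
])
```

For instance, `find(root, 55)` descends from the root to the middle leaf, and `max_path_length(1_000_000, 100)` reproduces the bound of 4 computed above.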
• Example:
create index city_name_idx on CITY(name);
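To try the statement out, one option is SQLite through Python's standard `sqlite3` module. This is only a sketch: the columns of CITY other than `name`, the sample rows, and the choice of SQLite are assumptions, not part of the course schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed schema: only the name column is given in the example above.
conn.execute("create table CITY (name text, country text, population integer)")
conn.executemany(
    "insert into CITY values (?, ?, ?)",
    [("Davis", "USA", 66000), ("Boston", "USA", 650000), ("Lyon", "France", 520000)],
)

# The index definition from the example.
conn.execute("create index city_name_idx on CITY(name)")

# Ask SQLite how it would evaluate an equality query on the indexed column.
plan_rows = conn.execute(
    "explain query plan select * from CITY where name = ?", ("Davis",)
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan_rows)  # last column is the detail text

rows = conn.execute("select * from CITY where name = ?", ("Davis",)).fetchall()
```

In SQLite the plan text typically names `city_name_idx`, which indicates that the lookup goes through the index rather than scanning the whole table. The exact wording of the plan differs between SQLite versions.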