Indexing Lecture Nov 2023 Detailed
Databases
Indexing
Dr David Hamill
Overview
Introduction
Multi-Level Indexes
Primary Index - Performance
• The index file requires significantly fewer blocks than the data file
• Sparse index
• Index file record typically smaller in size than data file record
• A binary search on the index file requires fewer block accesses than a
binary search on the data file
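As a rough worked example of the block-access argument above, the sketch below compares a binary search on the data file with a binary search on a sparse primary index. All figures (30,000 records of 100 bytes, 1,024-byte blocks, 15-byte index entries) are assumptions chosen for illustration and do not come from the lecture.

```python
import math

def blocks_needed(num_entries, entry_size, block_size):
    """Blocks required when whole entries are packed into fixed-size blocks."""
    entries_per_block = block_size // entry_size
    return math.ceil(num_entries / entries_per_block)

def binary_search_accesses(num_blocks):
    """Worst-case block accesses for a binary search over num_blocks blocks."""
    return max(1, math.ceil(math.log2(num_blocks)))

BLOCK_SIZE = 1024      # bytes per block (assumed)
NUM_RECORDS = 30_000   # records in the data file (assumed)
RECORD_SIZE = 100      # bytes per data record (assumed)
INDEX_ENTRY_SIZE = 15  # bytes per index entry: key plus block pointer (assumed)

data_blocks = blocks_needed(NUM_RECORDS, RECORD_SIZE, BLOCK_SIZE)

# A sparse primary index holds one entry per data block, not one per record.
index_blocks = blocks_needed(data_blocks, INDEX_ENTRY_SIZE, BLOCK_SIZE)

cost_without_index = binary_search_accesses(data_blocks)
# With the index: search the index, then read the one data block it points to.
cost_with_index = binary_search_accesses(index_blocks) + 1

print(data_blocks, index_blocks)            # 3000 45
print(cost_without_index, cost_with_index)  # 12 7
```

Under these assumptions the index occupies 45 blocks against 3,000 for the data file, and a lookup costs 7 block accesses instead of 12.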
Clustering Indexes - Performance
• Index file requires significantly fewer blocks than the data file.
• Sparse index
• Index file record typically smaller than data file record.
• A binary search on the index file requires fewer block accesses than a
binary search on the data file.
• Insertion and deletion of records are problematic because the data file is
physically ordered:
  • Records in the data file have to be moved and some index entries have to
  be changed.
• To ease this, it is common to reserve a whole block for each distinct value
of the clustering field, with all records holding that value placed in that
block.
• Storage overhead is not typically a serious problem.
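The idea of a clustering index can be sketched in a few lines. The example below assumes the data file is a list of blocks, already ordered on a non-key clustering field, and that each record is a `(clustering_value, payload)` tuple. The function names and sample data are hypothetical and used only for illustration.

```python
from bisect import bisect_left

def build_clustering_index(blocks):
    """One index entry per distinct clustering value: (value, first block holding it)."""
    keys, first_block = [], []
    for block_no, block in enumerate(blocks):
        for value, _payload in block:
            if not keys or keys[-1] != value:  # file is ordered, so a new value is a change
                keys.append(value)
                first_block.append(block_no)
    return keys, first_block

def find_all(keys, first_block, blocks, value):
    """Binary-search the index, then read forward while blocks still hold the value."""
    i = bisect_left(keys, value)
    if i == len(keys) or keys[i] != value:
        return []
    matches = []
    for block in blocks[first_block[i]:]:
        hits = [record for record in block if record[0] == value]
        if not hits:
            break  # ordered file: no later block can contain the value
        matches.extend(hits)
    return matches

# Sample data file ordered on the clustering field (two records per block).
blocks = [
    [(1, "a"), (1, "b")],
    [(1, "c"), (2, "d")],
    [(3, "e"), (3, "f")],
]
keys, first_block = build_clustering_index(blocks)
print(keys, first_block)                         # [1, 2, 3] [0, 1, 2]
print(find_all(keys, first_block, blocks, 1))    # [(1, 'a'), (1, 'b'), (1, 'c')]
```

The index has three entries for six records, which is why it is sparse: one entry per distinct value rather than one per record.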
Secondary (Non-Clustered) Indexes
Secondary Indexes (Case 1)
• A secondary index is a dense index since there is one entry for every
record in the data file.
• A binary search can be performed on the index.
• A secondary index usually needs more storage space and longer search times
because of the large number of entries.
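A minimal sketch of a dense secondary index for Case 1 is shown below. It assumes records are dictionaries stored in blocks that are not ordered on the indexed field, and that the indexed field (here a hypothetical `code` field) has a distinct value in every record.

```python
from bisect import bisect_left

def build_dense_index(blocks, field):
    """One (key, block number) entry per record, sorted by key."""
    entries = sorted(
        (record[field], block_no)
        for block_no, block in enumerate(blocks)
        for record in block
    )
    keys = [key for key, _ in entries]
    pointers = [block_no for _, block_no in entries]
    return keys, pointers

def find(keys, pointers, blocks, field, key):
    """Binary-search the index, then scan the single block it points to."""
    i = bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        return None
    block = blocks[pointers[i]]
    return next(record for record in block if record[field] == key)

# The data file is ordered on "id", not on the indexed field "code".
blocks = [
    [{"id": 1, "code": "K7"}, {"id": 2, "code": "A3"}],
    [{"id": 3, "code": "Z1"}, {"id": 4, "code": "C9"}],
]
keys, pointers = build_dense_index(blocks, "code")
print(keys)      # ['A3', 'C9', 'K7', 'Z1']
print(pointers)  # [0, 1, 0, 1]
print(find(keys, pointers, blocks, "code", "C9"))  # {'id': 4, 'code': 'C9'}
```

Because the data file is not sorted on `code`, the index needs an entry for every record, which is what makes it dense.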
Secondary Indexes (Case 2)
• Many records in the data file have the same value for the indexing field.
• Several options are available for implementing such an index:
1. Use variable-length records to hold an array of block pointers associated
with each indexing field value.
2. Use a single entry for each indexing field value and create an extra level
of indirection to handle the multiple pointers.
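The second option can be sketched as follows: the index keeps one entry per distinct value, and each entry leads to a separate list of block pointers (the extra level of indirection). The record layout, the `dept` field and the function names are assumptions made for this example.

```python
from bisect import bisect_left
from collections import defaultdict

def build_nonkey_index(blocks, field):
    """One index entry per distinct value; each entry refers to a list of block pointers."""
    pointer_lists = defaultdict(list)
    for block_no, block in enumerate(blocks):
        for record in block:
            pointers = pointer_lists[record[field]]
            if not pointers or pointers[-1] != block_no:  # store each block only once
                pointers.append(block_no)
    keys = sorted(pointer_lists)
    return keys, [pointer_lists[key] for key in keys]

def find_all(keys, pointer_lists, blocks, field, value):
    """Binary-search the index, follow the pointer list, and filter each block."""
    i = bisect_left(keys, value)
    if i == len(keys) or keys[i] != value:
        return []
    return [
        record
        for block_no in pointer_lists[i]
        for record in blocks[block_no]
        if record[field] == value
    ]

blocks = [
    [{"id": 1, "dept": "HR"}, {"id": 2, "dept": "IT"}],
    [{"id": 3, "dept": "HR"}, {"id": 4, "dept": "HR"}],
    [{"id": 5, "dept": "IT"}, {"id": 6, "dept": "OPS"}],
]
keys, pointer_lists = build_nonkey_index(blocks, "dept")
print(keys)           # ['HR', 'IT', 'OPS']
print(pointer_lists)  # [[0, 1], [0, 2], [2]]
print([r["id"] for r in find_all(keys, pointer_lists, blocks, "dept", "HR")])  # [1, 3, 4]
```

The index itself stays fixed-length (one entry per distinct value), at the cost of one more lookup through the pointer list.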
Secondary Indexes - summary
Index Type | Number of Index Entries | Dense / Sparse | Use Block Anchor
Primary | Equal to the number of blocks in the data file | Sparse | Yes
Clustering | Equal to the number of distinct indexing field values | Sparse | Yes if separate blocks are used for records with different indexing field values. No otherwise
Secondary | Equal to the number of records for Case 1. Equal to the number of distinct indexing field values for Case 2. | Dense for Case 1. Sparse for Case 2. |
Clustered v Non-Clustered
1. A table can have only one clustered index, but it can have multiple
non-clustered indexes.
2. A clustered index only determines the order in which the table's rows are
stored, so it needs little or no extra storage. A non-clustered index is
stored separately from the table itself and so takes up more storage space.
3. Clustered indexes are usually faster than non-clustered indexes, since they
don’t involve an extra lookup step from the index entry to the row.
Multi-Level Indexes
• When an index file becomes large and extends over many pages, the search
time for the required index entry increases. A multi-level index reduces this
by building a further index on the index file itself.
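To illustrate how quickly a multi-level index shrinks, the sketch below repeatedly indexes the level beneath it until a single block remains. The fan-out of 68 entries per index block and the 3,000 first-level entries are assumed figures, not values from the lecture.

```python
import math

def index_levels(first_level_entries, fan_out):
    """Blocks at each index level, from the first level up to the single top block."""
    levels = []
    entries = first_level_entries
    while True:
        blocks = math.ceil(entries / fan_out)
        levels.append(blocks)
        if blocks == 1:
            return levels
        entries = blocks  # the next level holds one entry per block of this level

levels = index_levels(first_level_entries=3000, fan_out=68)
print(levels)           # [45, 1]
# One block access per index level, plus one for the data block.
print(len(levels) + 1)  # 3

print(index_levels(1_000_000, 68))  # [14706, 217, 4, 1]
```

Each extra level divides the number of blocks by the fan-out, so the number of levels grows only logarithmically with the size of the first-level index.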
[Figure: example tree. Level 0: root node A. Level 1: child nodes B and C.
Level 2: nodes D, E and F.]
Tree Data Structure
• The depth of a tree is the maximum number of levels between
the root node and a leaf node in the tree.
• If every leaf node is at the same depth from the root node, the tree is
balanced. A B-Tree is a balanced tree of this kind.
• The degree (or order) of a tree is the maximum number of
children allowed per parent.
  • This is one more than the maximum number of key values per node.
• The access time of a tree depends on its depth rather than its breadth.
For this reason, a shallow tree with many children per node is preferable.
• When a node exceeds its maximum size, it is split: the median key is
promoted to the parent node, and the keys to its left and right form two
separate nodes on either side of the median.
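The split step can be sketched on its own, without a full B-Tree implementation. The function below assumes a node is represented by a sorted list of keys plus, for internal nodes, a list of child pointers with one more child than keys. The representation and the name `split_node` are hypothetical.

```python
def split_node(keys, children=None):
    """Split an overfull node around its median key.

    Returns (median, left, right), where left and right are (keys, children)
    pairs. children is None for a leaf node.
    """
    mid = len(keys) // 2
    median = keys[mid]  # this key moves up into the parent node
    left_keys, right_keys = keys[:mid], keys[mid + 1:]
    if children is None:
        return median, (left_keys, None), (right_keys, None)
    # An internal node with n keys has n + 1 children, divided around the median.
    return (
        median,
        (left_keys, children[: mid + 1]),
        (right_keys, children[mid + 1:]),
    )

# A leaf of order 5 may hold at most 4 keys; a fifth key forces a split.
median, left, right = split_node([10, 20, 30, 40, 50])
print(median)    # 30
print(left[0])   # [10, 20]
print(right[0])  # [40, 50]
```

If the parent is itself full once the median arrives, the same split is applied one level up, which is how the tree grows in height while keeping all leaves at the same depth.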
Search Trees
• A search tree is a special type of tree used to guide the search for a
record.
• Multi-level indexes can be considered a variation of search trees.
• Each block of entries is called a node.
• A node can have a certain number of pointers and a certain number of key
values.
• The index field values in each node guide us to the next node until we
reach the data block containing the required record.
• By following a pointer, we restrict the search at each level to a sub-tree
of the search tree and can ignore all nodes that are not in that sub-tree.
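The way key values guide a search down the tree can be sketched as below. The `Node` class, its fields and the sample tree are assumptions made for this example. Real index nodes would also carry pointers to data blocks, which are omitted here.

```python
from bisect import bisect_left
from dataclasses import dataclass, field

@dataclass
class Node:
    keys: list                                    # sorted key values in this node
    children: list = field(default_factory=list)  # empty for a leaf node

def search(root, key):
    """Return (found, nodes_visited), following one pointer per level."""
    node, visited = root, 0
    while True:
        visited += 1
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited
        if not node.children:
            return False, visited
        # children[i] is the only sub-tree that can contain the key.
        node = node.children[i]

root = Node([30], [Node([10, 20]), Node([40, 50])])
print(search(root, 40))  # (True, 2)
print(search(root, 25))  # (False, 2)
print(search(root, 30))  # (True, 1)
```

At each node only one child pointer is followed, so the number of nodes visited is bounded by the depth of the tree rather than by the total number of nodes.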
[Figure: types of index. Indexes > Ordered Indexes > Single-Level (Primary
Index, Clustering Index, Secondary Index) and Multi-Level.]