Indexing Lecture Nov 2023 Detailed

The document discusses different types of database indexes including primary indexes, clustering indexes, and secondary indexes. It explains that primary indexes index the primary key of a table, clustering indexes physically organize data based on the index value, and secondary indexes index non-key fields. The document also covers performance characteristics and implementation details of each index type.

Uploaded by

mccreary.michael95


Advanced Databases
Indexing
Dr David Hamill
Overview

Introduction

Single-Level Ordered Indexes

• Primary Index
• Secondary Index (Non-Clustered)
• Clustering Index

Multi-Level Indexes

B-Trees and B+-Trees

Introduction
• We need more efficient ways of using indexes to retrieve data.
• So far, we have seen access structures concerned with how records
are organized in files and what access methods can be used based on
that structure:
• Heap Files
• No specific structure
• Use of sequential (linear) access to records
• Hash Files
• Uses hash function of a set of hash fields
• Allows direct access if the hash fields are known
• An Index is a data structure that allows a DBMS to locate particular
records in a file in less time.
• This results in faster responses to user queries.
• It can speed up retrieval of records if certain requirements on the
searching conditions are met.
• A database index is similar to an index in a
book or catalogue in a library:
• An author index
• A title index

• Each index offers an access path to records:

• No need to scan sequentially through a
file!
• An index is ordered and each index entry
contains the item required or one or
more locations where the item can be
found.
• An Index access structure is associated with a particular search key
and contains records consisting of a key value and the address of the
logical record in the file containing the key value.
• Values in the index are usually sorted/ordered according to the
indexing field (often based on a single attribute).
• When an index is ordered we can perform efficient binary search on
the index.
• It is possible to have more than one index field.
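The idea of an ordered index searched by binary search can be sketched as follows. This is a minimal illustration, not a DBMS implementation: the `index` contents, the address strings and the `lookup` helper are all hypothetical.

```python
from bisect import bisect_left

# Hypothetical index: (key value, record address) pairs kept sorted by key.
index = [(3, "blk0:slot1"), (8, "blk2:slot0"), (15, "blk1:slot3"), (42, "blk0:slot0")]
keys = [k for k, _ in index]

def lookup(key):
    """Binary-search the ordered index; return the record address or None."""
    i = bisect_left(keys, key)          # first position whose key is >= key
    if i < len(keys) and keys[i] == key:
        return index[i][1]
    return None
```

Because the index is sorted, each probe halves the remaining search range, so no sequential scan of the data file is needed.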
Data & Index Files
• There are 2 types of files:
• Data files – files containing the logical records.
• Index files – files containing the index records.

• We will look at the following types of indexes:

• Primary indexes
• Clustering indexes
• Secondary indexes
• Multilevel indexes
• B-tree and B+-tree structures
• Each index type has its own advantages and disadvantages.
• The following characteristics are taken into account:
• Access Types: the access methods that can be supported efficiently (value-
based search, range-based search)
• Access Time: time needed to locate the result set
• Insertion/deletion efficiency: how fast can we complete insertions and
deletions.
• Storage overhead: the additional storage requirements in an index structure.
Index types

Single-Level Ordered Indexes

• Primary Index
• Secondary Index (Non-clustered)
• Clustered Index

Multi-Level Indexes

B-Trees and B+-Trees

Single-Level Ordered Indexes
• Primary Index: if data is sequentially ordered and
the indexing field is a key field to the file
(guaranteed to be unique) then we call it a
primary index.

• Clustering Index: if the data file is sequentially
ordered on a non-key field and the indexing field
corresponds to a non-key field, then the index is a
clustering index.
Single-Level Ordered Indexes
• Secondary Index: An index that is defined on a non-
ordering field of the data file.

• A file can have at most one physical ordering field.

• A file can have at most one primary index or clustering
index, but not both.
• A file can have several secondary indexes.
• Secondary indexes do not affect the physical
organization of records.
Single-Level Ordered Indexes
• An index can be sparse or dense:
• A sparse index has an index record for some of the search key
values in the file.
• A dense index has an index record for every search key value in
the file.
Primary Indexes
• A primary index is built for a data file sorted on its key field.
• The index file is a sorted file whose records are fixed in length, consisting
of two fields:
• The first field is the same data type as the ordering key field of the
data file.
• The second field is a pointer to a disk block.
• The ordering key field is called the primary key of the data file.
• There is one entry for each block in the data file.

1:1 index file to data file is intuitive but wasteful
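A sparse primary index with one entry per data block might look like the sketch below. It assumes each index entry holds the key of the first record in its block (the block anchor); the sample `data_blocks` and the `find` helper are hypothetical.

```python
from bisect import bisect_right

# Hypothetical data file: blocks of (key, value) records sorted on the primary key.
data_blocks = [
    [(1, "Ann"), (4, "Bob"), (7, "Cy")],
    [(10, "Di"), (12, "Ed"), (15, "Flo")],
    [(21, "Gus"), (30, "Hal")],
]

# One index entry per block: (key of the block's first record, block number).
primary_index = [(block[0][0], n) for n, block in enumerate(data_blocks)]
anchors = [k for k, _ in primary_index]

def find(key):
    """Locate the block through the sparse index, then search inside that block."""
    i = bisect_right(anchors, key) - 1   # last anchor that is <= key
    if i < 0:
        return None                      # key is smaller than every anchor
    block_no = primary_index[i][1]
    for k, value in data_blocks[block_no]:
        if k == key:
            return value
    return None
```

The index here has three entries for eight records, which is why a sparse index needs far fewer blocks than an index with one entry per record.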

Primary Index - Example
[Figure: primary index example. A sorted index file points into a sorted data file; label: "Distinct values".]
Primary Index - Performance
• The index file requires significantly fewer blocks than the data file
• Sparse index
• Index file record typically smaller in size than data file record

• A binary search on the index file requires fewer block accesses than a
binary search on the data file

• Insertion and deletion of records is problematic

• Not only do we have to move records in the data file we also have to change
some index entries

• Storage Overhead is not a serious problem
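The saving in block accesses can be checked with a little arithmetic. The figures below (300,000 records of 100 bytes, 4,096-byte blocks, 15-byte index entries) are assumed purely for illustration and do not come from the slides.

```python
from math import ceil, log2

def blocks_needed(n_items, item_size, block_size):
    """Blocks required when whole items are packed into fixed-size blocks."""
    per_block = block_size // item_size      # blocking factor
    return ceil(n_items / per_block)

BLOCK = 4096                                 # bytes per block (assumed)
data_blocks = blocks_needed(300_000, 100, BLOCK)
index_blocks = blocks_needed(data_blocks, 15, BLOCK)   # one entry per data block

# Binary search cost in block accesses.
search_data = ceil(log2(data_blocks))        # searching the data file directly
search_index = ceil(log2(index_blocks)) + 1  # search the index, then read one data block
```

Under these assumptions the data file occupies 7,500 blocks and the index only 28, so a lookup drops from 13 block accesses to 6.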

Clustering Indexes

• Clustered indexing is a database indexing technique that physically arranges
the data in a table based on the values of the clustered index key. This means
that the rows in the table are stored on disk in the same order as the
clustered index key.
• The leaf nodes of a clustered index contain the data pages.
• A clustered index is typically faster than a non-clustered index for
retrieving rows, because the rows are found at the leaf level of the index and
no extra lookup step is needed. A non-clustered index is stored separately from
the table, so it takes additional storage and usually needs one more step to
fetch the row.
• A clustered index is most useful for columns that have range predicates
because it allows better sequential access of data in the table. As a result,
since like values are on the same data page, fewer pages are fetched.
Clustering Indexes

• A clustering index is built for a data file sorted on a non-key field.

• The index file is another sorted file whose records are fixed length
consisting of two fields.
• First field is of the same data type as the clustering field of the data file.
• Second field is a pointer to a disk block
• There is one entry in the clustering index for each distinct value of the
clustering field containing the value and a pointer to the first block in
the data file that holds at least one record with the value of the
clustering field.
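The rule of one index entry per distinct value, pointing to the first block that holds that value, can be sketched like this. The sample blocks, the department-number field and both helper functions are hypothetical.

```python
# Hypothetical data file sorted on a non-key field (a department number).
data_blocks = [
    [(1, "Ann"), (1, "Bob"), (2, "Cy")],
    [(2, "Di"), (2, "Ed"), (3, "Flo")],
    [(3, "Gus"), (5, "Hal")],
]

def build_clustering_index(blocks):
    """One (value, first block number) entry per distinct clustering value."""
    index = {}
    for block_no, block in enumerate(blocks):
        for value, _ in block:
            index.setdefault(value, block_no)   # keep only the first block seen
    return sorted(index.items())

clustering_index = build_clustering_index(data_blocks)

def find_all(value):
    """Start at the first block for value and read forward until the value changes."""
    first = dict(clustering_index).get(value)
    if first is None:
        return []
    found = []
    for block in data_blocks[first:]:
        for v, name in block:
            if v == value:
                found.append(name)
            elif v > value:                     # file is sorted, so we can stop
                return found
    return found
```

Records with the same value may spill across block boundaries, so the search continues into later blocks until a larger value appears.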
Clustering Indexes
[Figure: clustering index example, with the labels "Multiple entries" and "Distinct values".]
Clustering Indexes Performance
• Index file requires significantly fewer blocks than the data file.
• Sparse index
• Index file record typically smaller than data file record.
• A binary search on the index file requires fewer block accesses than a
binary search on the data file.
• Insertion and deletion of records is problematic:
• We have to move records in the data file and we have to change some index
entries.
• Common to reserve a whole block for each distinct value of the clustering
field with all records with that value placed in the block.
• Storage overhead is not typically a serious problem.
Secondary (Non-Clustered) Indexes

• A secondary index is built for a non-ordering field of a data file.

• The index file is itself a sorted file whose records are fixed or variable
length consisting of two fields.
• The first field is the same data type as the indexing field.
• The second field is either a block pointer or a record pointer.

• We can consider two types of secondary indexes:

• Case 1: Using a dense secondary index that maps to all records in the data
file.
• Case 2: Using a secondary index that has an entry for each distinct key value
but whose pointers can be multivalued or point to a bucket of values.
Secondary Indexes (Case 1)
• The indexing field is not necessarily an ordering field of the
data file.
• When the indexing field is not an ordering field, we construct a secondary
index on it, where the index field can also be called a secondary key.
• There is one entry in the index file for each record in the data file.
• Each entry contains the value of the secondary key for the
record and a pointer to either the block where the record is
stored or the record itself.
• There may be duplicate values in the index field.
Secondary Indexes (Case 1)
• A secondary index is a dense index since there is
one entry for every record in the data file.
• A binary search can be performed on the index.
• A secondary index usually needs more storage
space and longer search times than a primary index
because of the larger number of entries.
Secondary Indexes (Case 2)
• Many records in the data file have the same value
for the indexing field.
• Several options are available for implementing such
an index:
1. Use variable-length records to hold an array of block
pointers associated with the indexing field value.
2. Use a single entry for each indexing field value. Create an
extra level of indirection to handle multiple pointers.
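The two cases can be compared side by side in a small sketch. Everything here is hypothetical: the sample records, the use of list positions as record pointers, and the helper names. Case 2 is shown with the bucket option (one entry per distinct value, plus a level of indirection).

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical data file of (id, city) records, NOT ordered on the indexed field (city).
records = [(101, "Derry"), (102, "Belfast"), (103, "Derry"),
           (104, "Armagh"), (105, "Belfast")]

# Case 1: dense index, one (value, record position) entry per record.
dense_index = sorted((city, pos) for pos, (_, city) in enumerate(records))

# Case 2: one entry per distinct value, each pointing to a bucket of record positions.
buckets = defaultdict(list)
for pos, (_, city) in enumerate(records):
    buckets[city].append(pos)
bucket_index = sorted(buckets.items())

def find_dense(city):
    """Binary-search to the first matching entry, then read the duplicates after it."""
    i = bisect_left(dense_index, (city, -1))
    ids = []
    while i < len(dense_index) and dense_index[i][0] == city:
        ids.append(records[dense_index[i][1]][0])
        i += 1
    return ids

def find_bucket(city):
    """Binary-search the distinct values, then follow every pointer in the bucket."""
    values = [v for v, _ in bucket_index]
    i = bisect_left(values, city)
    if i < len(values) and values[i] == city:
        return [records[pos][0] for pos in bucket_index[i][1]]
    return []
```

Both lookups return the same records; the dense index has five entries here while the bucket index has three, at the price of one extra hop through the bucket.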
Secondary Indexes
Secondary Indexes - summary
Index Type   Number of Index Entries                   Dense / Sparse    Use Block Anchor
Primary      Equal to the number of blocks in the      Sparse            Yes
             data file
Clustering   Equal to the number of distinct           Sparse            Yes if separate blocks are used
             indexing field values                                       for records with different
                                                                         indexing field values.
                                                                         No otherwise
Secondary    Case 1: equal to the number of records    Case 1: dense     No
             Case 2: equal to the number of distinct   Case 2: sparse
             indexing field values
Clustered v Non-Clustered

1. Difference 1: A table can have only one clustered index, but you can create
multiple non-clustered indexes on a single table.
2. Difference 2: A clustered index only determines the order of the rows in the
table, so it consumes little or no extra storage. Non-clustered indexes are
stored separately from the actual table and so claim more storage space.
3. Difference 3: Clustered indexes are faster than non-clustered indexes since they
don’t involve any extra lookup step.
Multi-Level Indexes
• When an index file becomes large and extends over many pages, the search
time for the required index entry increases.
• A multi-level index attempts to overcome this problem by reducing the search
range:
• Treat the index like any other file
• Split the index into a number of smaller indexes
• Maintain an index to the indexes
Multi-Level Indexes
• Multilevel indexes refer to a hierarchical
structure of indexes.
• Here, each level of the index provides a
more detailed reference to the data.
• It allows faster data retrieval, reduces
disk access, and improves query
performance.
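The "index to the indexes" idea can be sketched by repeatedly indexing the first key of each block until a single top block remains. This is a simplified static model, not how any particular DBMS lays out its index: `build_levels`, `search`, the fan-out of 4 and the sample keys are all assumptions.

```python
from bisect import bisect_right

def build_levels(keys, fan_out):
    """Return the index levels, top level first. Each level is a list of blocks
    (lists of keys); the bottom level holds the sorted keys themselves."""
    level = [keys[i:i + fan_out] for i in range(0, len(keys), fan_out)]
    levels = [level]
    while len(level) > 1:
        firsts = [block[0] for block in level]          # first key of each block
        level = [firsts[i:i + fan_out] for i in range(0, len(firsts), fan_out)]
        levels.insert(0, level)
    return levels

def search(levels, key, fan_out):
    """Read one block per level; return (found, number of blocks read)."""
    block_no = 0
    for depth, level in enumerate(levels):
        block = level[block_no]
        if depth == len(levels) - 1:                    # bottom level reached
            return key in block, depth + 1
        # Entry j of this block indexes block (block_no * fan_out + j) one level down.
        j = max(bisect_right(block, key) - 1, 0)
        block_no = block_no * fan_out + j

keys = list(range(0, 200, 2))                           # 100 sorted keys (hypothetical)
levels = build_levels(keys, fan_out=4)
```

With 100 keys and a fan-out of 4 the levels hold 1, 2, 7 and 25 blocks, so any search reads exactly 4 blocks instead of binary-searching all 25 bottom-level blocks. A realistic fan-out in the hundreds makes the hierarchy much shallower.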
Multi-Level Indexes - Performance
• Search performance increases when searching for a record based
on a specific indexing field value.
• Problems with insertions and deletions are still present.
• To retain the benefits of using multi-level indexing while reducing
insertion and deletion problems, an approach is taken that
leaves some space in each block for inserting new entries.
• This is called dynamic multi-level index and is often
implemented using a data structure called a balanced tree (B-
trees and B+-trees).
Tree Data Structure
[Figure: a tree with root node A at level 0, child nodes B and C at level 1, and nodes D, E and F at level 2.]
Tree Data Structure
• The depth of a tree is the maximum number of levels between
the root node and a leaf node in the tree.
• If every leaf is at the same depth from the root node, the tree is
balanced. A B-Tree is a balanced tree of this kind.
• The degree (or order) of a tree is the maximum number of
children allowed per parent.
• One more than the maximum number of key values per node.
• The access time of a tree depends on the depth rather than the breadth
of the tree. For this reason, it is better for it to be a leafy shallow tree.
• When a node exceeds its maximum size, the median key is promoted to
the parent node and the remaining keys are split into a left node and a
right node on either side of the median.
Search Trees
• A search tree is a special type of tree used to guide the search for a record.
• Multi-Level indexes can be considered a variation of search trees.
• Each block of entries is called a node.
• A node can have a certain number of pointers and a certain number of key
values.
• The index field values in each node guide us to the next node until we
reach the data block containing our required record.
• Using a pointer, we restrict our search at each sub-level to a sub-tree of the
search tree and can ignore all other nodes that are not in the sub-tree.
Tree Data Structure

• See the example of constructing a B-tree in class:

Construct a B-tree of order 5 containing the following keys:

1 12 8 2 25 6 14 28 17 7 52 16 48 68 3 26 29 53 55 45
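The construction can be traced with a short sketch. It is not the worked example from class, and conventions vary between textbooks: this version assumes a node of order 5 holds at most 4 keys, that a node splits when a fifth key arrives, that the middle key is the one promoted, and that keys are distinct. The `Node` class and the helper names are hypothetical.

```python
from bisect import bisect_left

ORDER = 5                       # max children per node, so at most ORDER - 1 keys

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []      # empty list for a leaf

def _insert(node, key):
    """Insert key below node; return (median, right_node) if node split, else None."""
    i = bisect_left(node.keys, key)
    if not node.children:                   # leaf: place the key here
        node.keys.insert(i, key)
    else:
        split = _insert(node.children[i], key)
        if split:                           # child split: absorb its median
            median, right = split
            node.keys.insert(i, median)
            node.children.insert(i + 1, right)
    if len(node.keys) < ORDER:              # no overflow
        return None
    mid = len(node.keys) // 2               # overflow: promote the median key
    median = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
    return median, right

def build(keys):
    root = Node()
    for key in keys:
        split = _insert(root, key)
        if split:                           # root split: the tree grows one level
            median, right = split
            root = Node([median], [root, right])
    return root

root = build([1, 12, 8, 2, 25, 6, 14, 28, 17, 7, 52, 16, 48, 68, 3, 26, 29, 53, 55, 45])
```

Under these assumptions the final tree has the single root key 17, internal nodes [3, 8] and [28, 48], and six leaves all at the same depth. The last insertion (45) splits a leaf and then the root, which is the only way the tree gains a level.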
Tree Data Structure
• The main difference between a B-Tree and a B+Tree is that a B+Tree
stores data pointers only at the leaves. Its internal nodes hold only
key values and pointers to child nodes, which guide the search.
• Internal nodes in a B-Tree can contain data pointers as well as pointers
to child nodes.
B-Tree vs B+Tree Performance
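• A B-Tree search can stop at an internal node, because keys and their data
pointers also appear above the leaves. A B+Tree search always continues down
to a leaf, but its internal nodes hold only keys and so fit more entries per
block, which tends to keep the tree shallow.
• B+Tree can maintain duplicates.
• Insertion into a B-Tree tends to take more time. B+Tree insertion has a
more predictable cost.
• Deletion from a B-Tree is complex because the key may sit in an internal
node. Deletion in a B+Tree is easier because all data entries are at the leaves.
• A B-Tree has no redundant search keys, but a B+Tree may have
redundant search keys.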
Disadvantages of Indexing
• To build a primary index, the table needs a primary key with unique
values.
• You can’t perform any other indexes in Database on the Indexed data.
• You are not allowed to partition an index-organized table.
• Indexes decrease the performance of INSERT, DELETE, and UPDATE
queries, because every affected index must be updated as well.
Summary

Single-Level Ordered Indexes

• Primary Index
• Secondary Index
• Clustering Index

Multi-Level Indexes

B-Trees and B+-Trees
