Indexing

Indexing is a database optimization technique that minimizes disk access by using data structures to quickly locate data, with primary and secondary indexing as the main methods. Primary indexing can be dense or sparse, while secondary indexing introduces multiple levels to reduce mapping size. Clustering indexes group non-unique columns for faster identification, and B+ trees are a structure used for efficient data storage and retrieval, supporting operations like insertion and deletion while maintaining balance.

Uploaded by

Tejas


Indexing

Indexing is a way to optimize the performance of a database by minimizing the number of disk
accesses required when a query is processed. An index is a data structure that is used to
quickly locate and access the data in a database.
Indexes are created using a few database columns.
• The first column is the search key, which contains a copy of the primary key or candidate key of
the table. These values are stored in sorted order so that the corresponding data can be
accessed quickly.
Note: The data itself may or may not be stored in sorted order.
• The second column is the data reference or pointer, which contains a set of pointers holding
the address of the disk block where that particular key value can be found.
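
The two-column structure described above can be sketched as a pair of parallel lists. This is only a minimal illustration: the key values, the block names and the `lookup` function are hypothetical and not taken from any particular DBMS.

```python
from bisect import bisect_left

# Hypothetical index: search keys kept in sorted order, each paired with
# the address of the disk block that holds the matching record.
index_keys = [101, 205, 310, 478]
block_addresses = ["block_7", "block_2", "block_9", "block_4"]

def lookup(search_key):
    """Return the block address for search_key, or None if it is not indexed."""
    pos = bisect_left(index_keys, search_key)  # binary search on the sorted keys
    if pos < len(index_keys) and index_keys[pos] == search_key:
        return block_addresses[pos]
    return None
```

Because the keys are sorted, the lookup can use binary search even though the blocks themselves are in no particular order.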

From <https://www.geeksforgeeks.org/indexing-in-databases-set-1/>

Types of Indexing

From <https://www.guru99.com/indexing-in-database.html>

Indexing in a database is classified based on its indexing attributes. The two main types of
indexing methods are:
• Primary Indexing
• Secondary Indexing
Primary Index
A primary index is an ordered file of fixed-length entries with two fields. The first field is
the same as the primary key, and the second field points to the data block that holds that key. In the
primary index, there is always a one-to-one relationship between the entries in the index
table.
Primary indexing in a DBMS is further divided into two types:
• Dense Index
• Sparse Index

From <https://www.guru99.com/indexing-in-database.html>

• Dense Index:
• For every search key value in the data file, there is an index record.
• This record contains the search key and also a reference to the first data record with that
search key value.


Though a dense index makes searching on any search key fast, the space used to store the index entries and addresses becomes an overhead in
memory. Here the (index, address) mapping becomes almost as large as the (table record, address) mapping itself. Hence more space is consumed to
store the index as the number of records increases.
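
A dense index can be sketched as follows, assuming a small data file that is already sorted on the search key. The records and the `build_dense_index` helper are made up for illustration.

```python
# Hypothetical data file, sorted on the search key: (key, payload).
data_file = [(10, "a"), (10, "b"), (20, "c"), (30, "d"), (30, "e")]

def build_dense_index(records):
    """Map every distinct search key to the position of its first record."""
    index = {}
    for position, (key, _payload) in enumerate(records):
        index.setdefault(key, position)  # keep only the first occurrence
    return index

dense_index = build_dense_index(data_file)
```

Every distinct key in the file gets an entry, which is why the index grows together with the data.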

• Sparse Index:
• An index record appears only for some of the items in the data file. Each index record points to a
block.
• To locate a record, we find the index record with the largest search key value less than or equal
to the search key value we are looking for.
• We start at the record pointed to by that index record, and proceed along the pointers in
the file (that is, sequentially) until we find the desired record.

From <https://www.geeksforgeeks.org/indexing-in-databases-set-1/>

But if the table is very large, leaving very large ranges between the indexed key values will not work, because each lookup would need a long
sequential scan. We have to make the ranges considerably shorter. In that situation, the (index, address) mapping file grows in size, as we have
seen with dense indexing.
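
The sparse lookup procedure (find the largest indexed key that is less than or equal to the target, then scan sequentially) might look like this. The sketch assumes a sorted file split into blocks of three records with only the first key of each block indexed; the block contents and the `sparse_lookup` function are hypothetical.

```python
from bisect import bisect_right

# Hypothetical sorted data file split into blocks of three records each.
blocks = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
# Sparse index: only the first key of each block is indexed.
sparse_keys = [block[0] for block in blocks]

def sparse_lookup(target):
    """Return (block_number, offset) of target, or None if it is absent."""
    # Index entry with the largest key that is <= target.
    slot = bisect_right(sparse_keys, target) - 1
    if slot < 0:
        return None  # target is smaller than every indexed key
    # Sequential scan from the start of the block that entry points to.
    for offset, key in enumerate(blocks[slot]):
        if key == target:
            return slot, offset
    return None
```

Shrinking the block size shortens the scan but adds index entries, which is the trade-off described above.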

Secondary Index
In this method, another level of indexing is introduced to reduce the size of the (index, address) mapping. Initially,
large ranges of column values are selected so that the first-level mapping is small. Then each range is further divided
into smaller ranges. The first level of the mapping is stored in primary memory so that address fetches are faster. The second
level of the mapping and the actual data are stored in secondary memory (hard disk).

In this example, the column values are first divided into groups of 100. These groups are stored in primary
memory. In secondary memory, these groups are further divided into sub-groups. The actual data
records are then stored in data blocks in secondary memory. The address in each first-level index entry
points to the first address in the second level, and each second-level index address points
to the first address in the data block. To search for a value that falls between the indexed values, the lookup
finds the corresponding address in the first and second levels in turn. Then it goes to that
address in the data blocks and performs a linear search to get the data.
For example, to search for 111, it looks for the largest key less than or equal to
111 in the first-level index and gets 100. Then, in the second-level index, it again looks for the largest key
less than or equal to 111 and gets 110. Now it goes to the data block at address 110 and scans
each record until it reaches 111. This is how a search is done in this method. Inserting, deleting and updating
locate the record in the same manner.
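
The search for 111 can be traced with a small sketch. The layout below (first level in groups of 100, second level in groups of 10, and the records in each data block) is an assumed stand-in for the original diagram, and `floor_key` and `two_level_search` are hypothetical helper names.

```python
from bisect import bisect_right

def floor_key(sorted_keys, target):
    """Largest key in sorted_keys that is <= target, or None if there is none."""
    pos = bisect_right(sorted_keys, target)
    return sorted_keys[pos - 1] if pos else None

# Assumed layout: first-level key -> second-level keys under it.
first_level = {100: [100, 110, 120], 200: [200, 210, 220]}
# Second-level key -> records stored in the data block it points to.
data_blocks = {100: [100, 105], 110: [110, 111, 112], 120: [120, 121],
               200: [200], 210: [210, 215], 220: [220]}

def two_level_search(target):
    """Return (first-level key, second-level key, record), or None if absent."""
    top = floor_key(sorted(first_level), target)
    if top is None:
        return None
    second = floor_key(first_level[top], target)
    if second is None:
        return None
    for record in data_blocks[second]:  # linear search inside the data block
        if record == target:
            return top, second, record
    return None
```

Calling `two_level_search(111)` visits 100 in the first level and 110 in the second level before the linear scan reaches 111.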

Multilevel Indexing
With the two-level method above, the growth of the index mapping is reduced by a considerable amount. But this
method can run into the same problem as the table size increases. In order to overcome this, we can
introduce multiple levels between primary memory and secondary memory. This method is
known as multilevel indexing. In this method, the number of secondary index levels is two or more.
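
To get a rough feel for how many levels are needed, the sketch below counts levels under two simplifying assumptions that are not from the notes: every index block holds the same number of entries (`fanout`), and levels are added until the top level fits in a single block. The function name `index_levels` is hypothetical.

```python
def index_levels(num_entries, fanout):
    """Count index levels needed until the top level fits in one block."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels = 1
    while num_entries > fanout:
        # Each block at this level needs one entry in the level above.
        num_entries = -(-num_entries // fanout)  # ceiling division
        levels += 1
    return levels
```

Under these assumptions, one million index entries with 100 entries per block need three levels, so each extra level multiplies the table size the index can cover.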

Clustering Index
In some cases, the index is created on non-primary key columns, which may not be unique for each
record. In such cases, in order to identify the records faster, we group two or more columns
together to get unique values and create an index out of them. This method is known as a clustering
index. Basically, records with similar characteristics are grouped together and indexes are created for
these groups.
For example, students studying in each semester are grouped together, i.e., 1st semester students,
2nd semester students, 3rd semester students, and so on.

In this example, an index entry is created for each semester in the index file. In the data
blocks, the students of each semester are grouped together to form a cluster. The address in the
index file points to the beginning of each cluster. Within the data blocks, the requested student ID is then
searched for sequentially.
New records are inserted into the clusters based on their group. In the above case, if a new student joins
the 3rd semester, then their record is inserted into the semester 3 cluster in secondary memory. The same
is done for updates and deletes.
If a cluster runs short of space, new data blocks are added to that cluster.

This method of file organization keeps related records together, which makes searching within a group easier and
faster. But each cluster is likely to have unused
space left. Hence it can take more space than other methods.
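
The semester example can be sketched as below. The student records are invented, and a single sorted list stands in for the data blocks; `build_clusters` and `find_student` are hypothetical names.

```python
# Hypothetical student records: (student_id, semester).
students = [(104, 2), (101, 1), (103, 1), (102, 2), (105, 3)]

def build_clusters(records):
    """Group records by semester and index the start of each cluster."""
    ordered = sorted(records, key=lambda r: (r[1], r[0]))  # cluster by semester
    cluster_index = {}
    for position, (_sid, semester) in enumerate(ordered):
        cluster_index.setdefault(semester, position)  # first record of the cluster
    return ordered, cluster_index

def find_student(ordered, cluster_index, semester, student_id):
    """Jump to the start of the cluster, then scan it sequentially."""
    start = cluster_index.get(semester)
    if start is None:
        return None
    for position in range(start, len(ordered)):
        sid, sem = ordered[position]
        if sem != semester:
            break  # reached the next cluster
        if sid == student_id:
            return position
    return None

ordered, cluster_index = build_clusters(students)
```

The index holds one entry per semester rather than one per student, and the sequential scan stays inside a single cluster.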

---------------------------------------------------------------------------------------------------------------------------------------
-
Introduction to B+ Trees

A B+ tree has one root, any number of intermediary nodes (usually one level) and leaf nodes. All leaf
nodes hold the actual records. Intermediary nodes hold only pointers to the leaf
nodes; they do not hold any data. In the simple example here, each node has only two children; in general, a B+ tree
node can have more. This is the basic structure of any B+ tree.
Consider a STUDENT table stored in a B+ tree structure. We can
observe that it divides the records into two and splits them into a left node and a right node. The left node
has all the values less than or equal to the root node, and the right node has the values greater than
the root node. The intermediary nodes at level 2 have only pointers to the leaf nodes. The values
shown in the intermediary nodes are only pointers to the next level. All the leaf nodes hold the
actual records in sorted order.
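
A search through such a tree can be sketched as follows. The sketch follows the convention stated above (values less than or equal to a node's key go left, larger values go right); the `Node` class, the key values and the record strings are hypothetical and do not reproduce the STUDENT table.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys, children=None, records=None):
        self.keys = keys          # separator keys (internal) or record keys (leaf)
        self.children = children  # child nodes; None for a leaf
        self.records = records    # actual records; only leaves hold them

def search(node, key):
    """Walk from the root to a leaf and return the record for key, or None."""
    while node.children is not None:
        # Keys <= separator go to the left child, larger keys to the right.
        node = node.children[bisect_left(node.keys, key)]
    pos = bisect_left(node.keys, key)
    if pos < len(node.keys) and node.keys[pos] == key:
        return node.records[pos]
    return None

# Hypothetical three-level tree: root, two intermediary nodes, four leaves.
leaf1 = Node([10, 20], records=["r10", "r20"])
leaf2 = Node([30, 40], records=["r30", "r40"])
leaf3 = Node([50, 60], records=["r50", "r60"])
leaf4 = Node([70, 80], records=["r70", "r80"])
left = Node([20], children=[leaf1, leaf2])
right = Node([60], children=[leaf3, leaf4])
root = Node([40], children=[left, right])
```

Every lookup walks the same number of levels, which is what keeping the tree balanced guarantees.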

Insertion in B+ tree
Suppose we have to insert a record 60 into the structure. It will go to the 3rd leaf node, after 55. Since it is
a balanced tree and that leaf node is already full, we cannot simply insert the record there. But it should be
inserted there without affecting the fill factor, balance and order. So the only option here is to split the
leaf node. But how do we split the node?

After the insert, the 3rd leaf node would have the values (50, 55, 60, 65, 70), and its current entry in the intermediary node is 50. We split
the leaf node in the middle so that the balance is not altered. So we group (50, 55) and (60, 65,
70) into 2 leaf nodes. For these two to be leaf nodes, the intermediary node cannot branch from 50 alone.
It should have 60 added to it, and then we can have a pointer to the new leaf node.

This is how we insert a new entry when there is overflow. In the normal scenario, we simply find the
node where the entry fits and place it in that leaf node.
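
The split described above can be reproduced with a short sketch. It assumes a leaf capacity of four keys (implied by the example, where a fifth key causes the overflow) and models the leaf and the intermediary node as plain sorted lists; `insert_into_leaf` is a hypothetical helper, not a full B+ tree implementation.

```python
from bisect import insort

def insert_into_leaf(leaf_keys, parent_keys, new_key, capacity):
    """Insert new_key into a leaf; on overflow, split it and copy up a separator.

    Returns (list of resulting leaves, updated parent keys).
    """
    keys = list(leaf_keys)
    insort(keys, new_key)  # keep the leaf sorted
    parents = list(parent_keys)
    if len(keys) <= capacity:
        return [keys], parents  # normal case: the key simply fits
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]  # split the leaf in the middle
    insort(parents, right[0])  # first key of the new leaf goes to the parent
    return [left, right], parents
```

With the values from the example, `insert_into_leaf([50, 55, 65, 70], [50], 60, 4)` yields the leaves (50, 55) and (60, 65, 70) and adds 60 next to 50 in the intermediary node.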

Delete in B+ tree
Suppose we have to delete 60 from the above example. What will happen in this case? We have to
remove 60 from the 4th leaf node as well as from the intermediary node. If we simply remove it from the
intermediary node, the tree will no longer satisfy the B+ tree rules. So we need to re-arrange the nodes to keep the tree
balanced. After deleting 60 and re-arranging the nodes, the tree is balanced again.

Now suppose we have to delete 15 from the tree. We traverse to the 1st leaf node and simply delete
15 from that node. There is no need for any re-arrangement, as the tree is still balanced and 15 does not
appear in the intermediary node.
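
The difference between the two deletions can be shown with a deliberately small sketch. It only removes the key from a leaf and reports whether the key also appears in the intermediary node; the re-arrangement itself is not implemented. The leaf contents and the `delete_key` helper are hypothetical.

```python
def delete_key(leaf_keys, parent_keys, key):
    """Remove key from a leaf.

    Returns (remaining leaf keys, needs_rearranging), where needs_rearranging
    is True when the key is also a separator in the intermediary node.
    """
    if key not in leaf_keys:
        raise KeyError(key)
    remaining = [k for k in leaf_keys if k != key]
    return remaining, key in parent_keys
```

For instance, `delete_key([10, 15, 20], [50, 60], 15)` needs no follow-up, while `delete_key([60, 65, 70], [50, 60], 60)` reports that the intermediary node must also be fixed.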
