Primary Indexing

The document discusses various indexing techniques in databases, including primary, secondary, cluster, and multilevel indexing, each with distinct characteristics and use cases. Primary indexing is built on sorted data files and improves search efficiency based on primary keys, while secondary indexing aids in querying non-primary attributes. Cluster indexing organizes data physically for related records, and multilevel indexing enhances search efficiency for large datasets by creating a hierarchical structure of indexes.

Uploaded by

hp401557


Primary Indexing

A primary index is built on a sorted data file and is associated with the primary key of the table. It
is a single-level index in which each entry points either to an individual record or to a block of the
data file. Primary indexes speed up access to records when the primary key is used for searching.
There are two main types of primary indexes:

1. Dense Index:

o In a dense index, there is an index entry for every record in the data file.

o Each index entry holds the primary key and a pointer to the exact location of the
corresponding record.

o While dense indexes provide faster lookups because every record has a direct entry,
they consume more space.

Example: If we have a table with 1000 records and a dense primary index, the index will have 1000
entries.

2. Sparse Index:

o In a sparse index, index entries exist for only some of the records, typically the first
record in every block of the data file.

o Each entry in the sparse index points to a block rather than an individual record.

o Sparse indexes are more space-efficient compared to dense indexes but require
more time to locate specific records.

Example: For a table of 1000 records stored across 10 blocks, a sparse index might have only 10
entries, one for each block.
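
The two examples above can be sketched in Python. This is a minimal illustration, not how a real DBMS stores its index: the 1000 records, the block size of 100, and the names `blocks`, `dense_index`, `sparse_keys`, `dense_lookup`, and `sparse_lookup` are all assumptions made for the sketch.

```python
from bisect import bisect_right

# Hypothetical data file: 1000 records sorted by primary key, 100 per block.
RECORDS_PER_BLOCK = 100
blocks = [
    [(key, f"record-{key}") for key in range(start, start + RECORDS_PER_BLOCK)]
    for start in range(1, 1001, RECORDS_PER_BLOCK)
]

# Dense index: one entry per record -> (block number, slot within block).
dense_index = {
    key: (b, s)
    for b, block in enumerate(blocks)
    for s, (key, _) in enumerate(block)
}

# Sparse index: one entry per block, holding the first key of that block.
sparse_keys = [block[0][0] for block in blocks]


def dense_lookup(key):
    """Follow the index entry straight to the record."""
    pos = dense_index.get(key)
    if pos is None:
        return None
    b, s = pos
    return blocks[b][s][1]


def sparse_lookup(key):
    """Find the last block whose first key <= key, then scan that block."""
    b = bisect_right(sparse_keys, key) - 1
    if b < 0:
        return None
    for k, value in blocks[b]:
        if k == key:
            return value
    return None
```

The dense index holds 1000 entries and answers a lookup without scanning, while the sparse index holds only 10 entries and pays for its smaller size with a scan inside one block.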

Advantages of Primary Indexing:

• Faster search times when querying by the primary key.

• Index file is much smaller than the original data file.

Disadvantages:

• Only useful when searching based on the primary key.

• Maintaining the index can become cumbersome if the data file is updated frequently, as
every insertion or deletion may require reorganization of the index.

Secondary Indexing

A secondary index is created on non-primary attributes, that is, on fields the data file is not sorted
by. It speeds up queries that do not use the primary key. Because the data file does not have to be
sorted on the secondary key, the index is what provides efficient access to records based on
attributes other than the primary key.

1. Dense Secondary Index:

o Similar to the dense primary index, it has an index entry for every record in the table.

o Each entry contains the value of the secondary key and a pointer to the record in the
data file.

o Use Case: If you frequently query the table using a non-primary key, a dense
secondary index can help speed up access.

Example: If we have a table of employees, and we frequently query based on the "Department"
column (which is not the primary key), creating a dense secondary index on "Department" will
provide faster query results.

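2. Sparse Secondary Index:

o A sparse secondary index contains entries only for some records, and each entry
points to a block or a range of records.

o Like the sparse primary index, it is more space-efficient but less precise in terms of
exact record lookup.

o Because the data file is not sorted on the secondary key, a sparse index alone cannot
locate every record. In practice the lowest level of a secondary index is dense, and
sparse entries are used only in higher levels built on top of that dense level.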
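
The "Department" example can be sketched as follows. The employee rows and the names `employees`, `department_index`, and `find_by_department` are hypothetical, and a Python dictionary stands in for the on-disk index structure.

```python
from collections import defaultdict

# Hypothetical employee file, stored in primary-key (emp_id) order.
employees = [
    (101, "Asha", "Sales"),
    (102, "Ben", "Engineering"),
    (103, "Chen", "Sales"),
    (104, "Dina", "Finance"),
    (105, "Eli", "Engineering"),
]

# Dense secondary index: every record appears under its Department value.
department_index = defaultdict(list)
for position, (_, _, department) in enumerate(employees):
    department_index[department].append(position)


def find_by_department(department):
    """Return all employee records for a department via the index."""
    return [employees[p] for p in department_index.get(department, [])]
```

Since the file is ordered by `emp_id`, records of one department are scattered, which is why the index needs a pointer for every record.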

Advantages of Secondary Indexing:

• Allows fast access to records based on non-primary attributes.

• Useful for tables where searches are frequently performed on multiple fields.

Disadvantages:

• Secondary indexes add additional overhead for maintenance because the index needs to be
updated when records are inserted, updated, or deleted.

• They may also result in slower inserts and updates, as multiple indexes may need to be
updated.

Cluster Indexing in DBMS

Cluster indexing is a type of index that groups rows with the same value for the indexed attribute(s)
together. Unlike primary and secondary indexing, which are typically built on individual columns
(either primary keys or other columns), cluster indexing organizes the physical storage of data in a
way that related records are stored in contiguous blocks.

Cluster indexing is most effective when there are multiple records in the table with the same value
for an attribute or a group of attributes. It helps in improving the performance of queries that
retrieve data based on the clustered attribute(s).

Key Concepts of Cluster Indexing

1. Clustering Key:

o A cluster index is based on a clustering key, which may be a single column or a
combination of columns.

o Records with the same clustering key values are stored physically close to each other
in the data file.

o The clustering key is not necessarily unique (unlike a primary key), which allows
multiple rows to share the same value.

2. Clustered Table:

o When a cluster index is created, the table is referred to as a clustered table because
the rows are physically rearranged to group records with the same values for the
clustering key.

o The physical arrangement of data helps in reducing the number of disk I/O
operations when retrieving records with the same clustering key.

3. Sparse Cluster Index:

o The index does not contain entries for every record but rather for blocks of records
that have the same clustering key values. It points to the first record of each group
(cluster) in the data file.

4. Dense Cluster Index:

o In some cases, a dense index can be used, where there is an entry for every record in
the cluster. However, typically, a sparse index is preferred because the data is already
grouped physically, so fewer index entries are required.

Example of Cluster Indexing

Consider a table of students with the following attributes: Student_ID, Department, Name, Age.

If you create a cluster index on the Department column, students belonging to the same department
will be stored together on the disk. So, all students from the "Computer Science" department will be
stored in one block or adjacent blocks, followed by students from "Mechanical Engineering," and so
on.

Logical Representation of Data After Cluster Indexing:

Student_ID   Department               Name      Age
1            Computer Science         Alice     22
2            Computer Science         Bob       21
3            Mechanical Engineering   Charlie   23
4            Mechanical Engineering   David     24

Here, the cluster index is built on the Department column. Students are stored together based on
their department.
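
A minimal Python sketch of this example, assuming the four rows from the table: the file is reordered by Department and a sparse cluster index records where each group starts. The names `clustered_file`, `cluster_index`, and `students_in` are illustrative only.

```python
# Rows from the example table, deliberately given in unclustered order.
students = [
    (3, "Mechanical Engineering", "Charlie", 23),
    (1, "Computer Science", "Alice", 22),
    (4, "Mechanical Engineering", "David", 24),
    (2, "Computer Science", "Bob", 21),
]

# Clustering: physically order the file by the clustering key (Department).
clustered_file = sorted(students, key=lambda row: (row[1], row[0]))

# Sparse cluster index: one entry per distinct key -> position of first row.
cluster_index = {}
for position, row in enumerate(clustered_file):
    cluster_index.setdefault(row[1], position)


def students_in(department):
    """Jump to the first row of the cluster and read until the key changes."""
    start = cluster_index.get(department)
    if start is None:
        return []
    result = []
    for row in clustered_file[start:]:
        if row[1] != department:
            break
        result.append(row)
    return result
```

The index has one entry per department rather than one per student, and a query reads a single contiguous run of rows.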

Advantages

When querying data that involves a range of values or multiple records with the same key (such as
fetching all students from a particular department), cluster indexing is highly efficient. Since the
records are stored together, fewer disk I/O operations are needed.

Cluster indexes are particularly useful when there are multiple rows with the same value for the
clustering key (non-unique values).

1. Reduces Data Access Time:

2. Efficient for Join Operations

Disadvantages of Cluster Indexing

1. Insert/Update Overhead:

2. Less Flexibility:

3. Inefficient for Small Queries:

Multilevel Indexing in DBMS

Multilevel indexing is a technique used to improve the efficiency of searching large datasets by
creating multiple levels of indexes, similar to a hierarchical structure. When a single-level index
becomes too large to fit into memory, multilevel indexing divides it into smaller, more manageable
parts, reducing the number of disk accesses required to find a record.

How Multilevel Indexing Works

1. First-Level Index:

o The first level consists of a primary index (e.g., sparse or dense), where each entry
points to a block or set of records in the data file.

2. Second-Level Index:

o When the first-level index grows too large, a second-level index is created, which
indexes the first-level index.

o The second-level index contains entries pointing to blocks in the first-level index.

3. Higher-Level Indexes:

o If even the second-level index becomes large, a third-level index is created, and so
on, forming a hierarchical structure.

The idea is to keep adding levels until the topmost level is small enough to fit in a single block that
can stay in memory. A search then reads one block per level, loading only the relevant portion of the
index instead of scanning the whole first-level index.
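
The hierarchy can be sketched in Python under some simplifying assumptions: the first-level index is sparse (one first key per data block), every index block holds a fixed number of entries (`FANOUT`, an assumed value), and the names `build_levels` and `find_data_block` are hypothetical.

```python
from bisect import bisect_right

FANOUT = 4  # assumed number of index entries per index block


def build_levels(first_keys, fanout=FANOUT):
    """Build index levels bottom-up until the top level fits in one block.

    levels[0] is the first-level index (first key of each data block);
    each higher level holds the first key of every block of the level below.
    """
    levels = [list(first_keys)]
    while len(levels[-1]) > fanout:
        below = levels[-1]
        levels.append([below[i] for i in range(0, len(below), fanout)])
    return levels


def find_data_block(levels, key, fanout=FANOUT):
    """Descend from the top level, reading one index block per level."""
    block = 0  # the top level is a single block
    for level in reversed(levels):
        entries = level[block * fanout:(block + 1) * fanout]
        offset = max(bisect_right(entries, key) - 1, 0)
        block = block * fanout + offset
    return block  # number of the data block that may hold the key
```

For example, with 64 data blocks whose first keys are 0, 10, 20, and so on, `build_levels` produces levels of 64, 16, and 4 entries, so a lookup reads three index blocks instead of searching all 64 first-level entries.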

Benefits of Multilevel Indexing

• Reduced Disk I/O

• Efficient for Large Datasets

• Scalable

Disadvantages

• Increased Complexity

• Higher Maintenance Cost

B-Tree Properties:

1. Balanced Structure: B-trees remain balanced, so the height of the tree is logarithmic in
the number of keys, leading to efficient search, insertion, and deletion
operations.

2. Order (M): Each node in a B-tree can have at most M children and at least ⌈M/2⌉
children (except the root).

3. Key Distribution: Keys are distributed across all nodes, both internal and leaf nodes.

4. Minimum Occupancy: Every node except the root must be at least half full.

5. Sorted Keys: Keys in each node are sorted in increasing order.

6. Leaf Level: All leaves are at the same level (balanced structure).

7. Search Efficiency: Searches can be performed in O(log n), where n is the
number of keys.
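
The search described in these properties can be sketched over a small hand-built tree. This is an illustration only: it covers lookup, not insertion or deletion, and the `BTreeNode` class and `btree_search` function are hypothetical names.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class BTreeNode:
    keys: list                      # sorted keys stored in this node
    children: list = field(default_factory=list)  # empty for a leaf


def btree_search(node, key):
    """Return True if key is stored anywhere in the subtree rooted at node."""
    while node is not None:
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True            # keys may be found in internal nodes too
        if not node.children:
            return False           # reached a leaf without finding the key
        node = node.children[i]    # descend into the i-th subtree
    return False


# Hand-built B-tree of order M = 3: every leaf is at the same depth.
root = BTreeNode(
    keys=[20, 40],
    children=[
        BTreeNode(keys=[5, 10]),
        BTreeNode(keys=[25, 30]),
        BTreeNode(keys=[45, 50]),
    ],
)
```

Searching for 40 stops at the root, because a B-tree stores keys in internal nodes as well as leaves, while searching for 30 descends one level.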

B+ Tree Properties:

1. Balanced Tree Structure: Similar to B-trees, B+ trees are balanced, ensuring logarithmic
depth.

2. Internal and Leaf Nodes: Internal nodes contain keys only, while leaf nodes contain the
actual data (or pointers to data).

3. Linked Leaves: Leaf nodes are linked together to form a sorted linked list for efficient
sequential access.

4. Efficient Range Queries: Since all data resides in leaf nodes linked together, range queries
and sequential access can be done efficiently.

5. Order (M): Similar to B-trees, internal nodes in B+ trees have at most M children and at
least ⌈M/2⌉ children (except the root).

6. Full Utilization of Nodes: Non-leaf nodes are used solely for guiding searches, leading to
better space utilization in the leaf nodes.

7. Search Optimization: Searching is done in internal nodes, and once the leaf node is reached,
the exact key can be found.

B+ Tree Indexing:

B+ trees are extensively used in database indexing because of their efficient storage structure and
ability to handle a large number of records. Here's how they help in indexing:

1. Efficient Search: The logarithmic height of the B+ tree ensures fast search operations.

2. Sorted Leaf Nodes: Leaf nodes in B+ trees are linked, making range queries efficient and
sequential access easier.

3. Minimized Disk Access: Because internal nodes hold only keys, each node has many
children and the tree stays shallow, so a lookup reads only a few disk blocks. Records with
adjacent keys also sit in the same or neighbouring leaves.

4. Better Space Utilization: By storing data only in leaf nodes, internal nodes stay small, which
makes indexing more efficient.

5. Support for Range Queries: Since all records are stored at the leaf level in sorted order and
are linked, B+ trees allow efficient range searches (e.g., finding all records between two
keys).
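
A range search over linked leaves can be sketched as below. The tree is built by hand rather than by insertion, the separator convention (keys greater than or equal to a separator go to the right child) is an assumption of this sketch, and the names `Leaf`, `Internal`, `find_leaf`, and `range_query` are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Leaf:
    keys: list
    values: list
    next: Optional["Leaf"] = None   # link to the right sibling leaf


@dataclass
class Internal:
    keys: list                      # separator keys only, no data
    children: list = field(default_factory=list)


def find_leaf(node, key):
    """Descend through internal nodes to the leaf that may contain key."""
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.keys, key)]
    return node


def range_query(root, low, high):
    """Return (key, value) pairs with low <= key <= high, in key order."""
    leaf = find_leaf(root, low)
    result = []
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > high:
                return result
            if k >= low:
                result.append((k, v))
        leaf = leaf.next
    return result


# Hand-built example: three linked leaves under one internal node.
leaf3 = Leaf([40, 50], ["r40", "r50"])
leaf2 = Leaf([20, 30], ["r20", "r30"], next=leaf3)
leaf1 = Leaf([5, 10], ["r5", "r10"], next=leaf2)
root = Internal(keys=[20, 40], children=[leaf1, leaf2, leaf3])
```

The tree is descended only once, to find the leaf for the lower bound. After that the scan follows the `next` links, which is why range queries suit B+ trees.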
