
DBMS Unit-5

Indexing in DBMS enhances database performance by reducing disk access during queries through various index structures like primary, dense, sparse, clustering, and secondary indexes. Primary indexing uses unique keys for efficient searching, while clustering groups records by non-unique keys. Hashing provides a direct address for data records, optimizing search efficiency without relying on index structures.

Uploaded by

priw.exams

Indexing in DBMS:

1. Indexing is used to optimize the performance of a database by minimizing the
number of disk accesses required when a query is processed.
2. The index is a type of data structure. It is used to locate and access the data in a
database table quickly.

Index structure: An index is created on one or more columns of a table. The index itself
has two columns.

1. The first column of the index is the search key, which contains a copy of the
primary key or candidate key of the table. These values are stored in sorted order
so that the corresponding data can be accessed easily.

2. The second column of the index is the data reference. It contains a set of
pointers holding the address of the disk block where the value of the particular key
can be found.

Ordered indices: The indices are usually sorted to make searching faster. Indices which
are sorted are known as ordered indices. Suppose we have an employee table with thousands of
records, each of which is 10 bytes long. The IDs start with 1, 2, 3 and so on, and we have to
search for the employee with ID 543.

1. In the case of a database with no index, we have to scan the disk blocks from the
start until we reach record 543. The DBMS will read the record after reading
543*10 = 5430 bytes.
2. In the case of an index, we search the index instead. Assuming each index entry
is 2 bytes long, the DBMS will read the record after reading 542*2 = 1084 bytes,
which is far less than in the previous case.
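
The arithmetic above can be checked with a short Python sketch. The 10-byte record size comes from the example; the 2-byte index entry size is an assumption read off the 542*2 calculation, and the function names are only illustrative.

```python
RECORD_SIZE = 10       # bytes per record, from the example above
INDEX_ENTRY_SIZE = 2   # bytes per index entry (assumed, implied by 542*2)

def bytes_read_without_index(target_id, record_size=RECORD_SIZE):
    # Sequential scan: records 1..target_id are read in order.
    return target_id * record_size

def bytes_read_with_index(target_id, entry_size=INDEX_ENTRY_SIZE):
    # The index entries before the target's entry are passed over first.
    return (target_id - 1) * entry_size

print(bytes_read_without_index(543))  # 5430
print(bytes_read_with_index(543))     # 1084
```
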
Primary Index:
1. If the index is created on the basis of the primary key of the table, then it is known
as primary indexing. Primary keys are unique to each record, so there is a 1:1
relation between index entries and records.
2. As primary keys are stored in sorted order, the searching operation is quite
efficient.
3. The primary index can be classified into two types: Dense index and Sparse index.
Dense index: The dense index contains an index record for every search key value in the data
file. It makes searching faster.
1. In this, the number of records in the index table is the same as the number of records
in the main table.
2. It needs more space to store the index records themselves. Each index record has the
search key and a pointer to the actual record on the disk.
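
A dense index can be sketched in Python as follows. This is only a minimal in-memory illustration: the sample records are made up, and a list position stands in for the disk address that a real index would store.

```python
import bisect

# Hypothetical data file: (key, payload) records; the list position
# plays the role of the record's address on disk.
data_file = [(10, "A"), (20, "B"), (30, "C"), (40, "D")]

# Dense index: one (search key, pointer) entry per record, sorted by key.
dense_index = sorted((key, pos) for pos, (key, _) in enumerate(data_file))
dense_keys = [key for key, _ in dense_index]

def dense_lookup(key):
    # Binary search over the sorted search keys.
    i = bisect.bisect_left(dense_keys, key)
    if i < len(dense_keys) and dense_keys[i] == key:
        _, pos = dense_index[i]
        return data_file[pos]   # follow the pointer to the record
    return None                 # no index entry means no such record
```

Because every key has its own entry, the index has as many entries as the data file has records, which is where the extra space goes.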

Sparse index:
1. In the data file, an index record appears only for some of the search key values. Each
index record points to a block.
2. In this, instead of pointing to each record in the main table, the index points to
records in the main table at intervals.
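
A sparse index can be sketched in the same way. The block size of 3 and the sample records are assumptions chosen for illustration; the index keeps one entry per block (the first key in the block), and the block is then scanned for the wanted record.

```python
import bisect

BLOCK_SIZE = 3  # records per block (hypothetical)

# Hypothetical data file, sorted by key and split into blocks.
records = [(k, f"row-{k}") for k in range(1, 11)]
blocks = [records[i:i + BLOCK_SIZE] for i in range(0, len(records), BLOCK_SIZE)]

# Sparse index: one (first key in block, block number) entry per block.
sparse_index = [(block[0][0], block_no) for block_no, block in enumerate(blocks)]
sparse_keys = [key for key, _ in sparse_index]

def sparse_lookup(key):
    # Find the last index entry whose key is <= the search key.
    i = bisect.bisect_right(sparse_keys, key) - 1
    if i < 0:
        return None
    _, block_no = sparse_index[i]
    # Scan that one block sequentially for the record.
    for k, payload in blocks[block_no]:
        if k == key:
            return (k, payload)
    return None
```

Here 10 records need only 4 index entries, at the cost of a short scan inside the block.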

Clustering Index:
1. A clustered index can be defined as an ordered data file. Sometimes the index is
created on non-primary key columns, which may not be unique for each record.
2. In this case, to identify the records faster, we group two or more columns to get
a unique value and create an index out of them. This method is called a clustering
index.
3. Records which have similar characteristics are grouped, and indexes are
created for these groups.
Suppose a company has several employees in each department. We use a
clustering index, where all employees who belong to the same Dept_ID are considered
to be within a single cluster, and index pointers point to the cluster as a whole. Here
Dept_Id is a non-unique key.

The previous schema is a little confusing because one disk block is shared by records
which belong to different clusters. Using a separate disk block for each cluster is
considered a better technique.
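
The department example can be sketched in Python as below. The employee rows are made up, and each cluster is modelled as its own list, which corresponds to the "separate disk block for separate clusters" variant described above.

```python
from collections import defaultdict

# Hypothetical employee rows: (dept_id, emp_id, name).
employees = [
    (1, 101, "Asha"),
    (2, 102, "Ravi"),
    (1, 103, "Meena"),
    (3, 104, "John"),
    (2, 105, "Sara"),
]

# Clustering index: one entry per Dept_Id value, pointing to the whole
# cluster of records that share it. Dept_Id is not unique per record.
clusters = defaultdict(list)
for dept_id, emp_id, name in employees:
    clusters[dept_id].append((emp_id, name))

def employees_in_dept(dept_id):
    # One index lookup returns the whole cluster.
    return clusters.get(dept_id, [])

print(employees_in_dept(1))  # [(101, 'Asha'), (103, 'Meena')]
```
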
Secondary Index: In sparse indexing, as the size of the table grows, the size of the
mapping also grows. These mappings are usually kept in primary memory so that address
fetches are faster. The actual data is then looked up in secondary memory using the address
obtained from the mapping. If the mapping grows large, fetching the address itself becomes
slower, and the sparse index is no longer efficient. To overcome this problem, secondary
indexing is introduced.

In secondary indexing, another level of indexing is introduced to reduce the size of the
mapping. In this method, large ranges of the column values are selected initially so that the
mapping size of the first level stays small. Then each range is further divided into smaller
ranges. The mapping of the first level is stored in primary memory, so that address fetches are
faster. The mapping of the second level and the actual data are stored in secondary memory (hard
disk).

1. If you want to find the record of roll 111 in the diagram, the search looks for the
highest entry which is smaller than or equal to 111 in the first level index. It gets
100 at this level.
2. Then, in the second index level, it again looks for the highest entry which is smaller
than or equal to 111 and gets 110. Using the address for 110, it goes to the data block
and searches each record until it gets 111.
3. This is how a search is performed in this method. Inserting, updating or deleting is
also done in the same manner.
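
The search steps above can be sketched in Python. The layout is an assumption made for illustration (roll numbers 100 to 299, data blocks of 10 records, first-level ranges of 100), since the diagram is not reproduced here; only the search for roll 111 follows the text.

```python
import bisect

# Hypothetical layout: each data block holds 10 consecutive roll numbers
# and is addressed by its first roll number.
data_blocks = {start: list(range(start, start + 10))
               for start in range(100, 300, 10)}

# Second level: for each large range, the sorted block start keys.
second_level = {
    100: list(range(100, 200, 10)),
    200: list(range(200, 300, 10)),
}

# First level: small mapping over the large ranges (kept in primary memory).
first_level = [100, 200]

def floor_entry(sorted_keys, key):
    # Highest entry that is smaller than or equal to key, or None.
    i = bisect.bisect_right(sorted_keys, key) - 1
    return sorted_keys[i] if i >= 0 else None

def find_roll(roll):
    top = floor_entry(first_level, roll)                 # e.g. 100 for 111
    if top is None:
        return None
    block_start = floor_entry(second_level[top], roll)   # e.g. 110 for 111
    if block_start is None:
        return None
    for r in data_blocks[block_start]:                   # scan the data block
        if r == roll:
            return (top, block_start, r)
    return None

print(find_roll(111))  # (100, 110, 111)
```
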
Hashing: In a huge database structure, it is very inefficient to search all the index values
to reach the desired data. The hashing technique is used to calculate the direct location of a
data record on the disk without using an index structure. In this technique, data is stored in
the data blocks whose addresses are generated by a hash function. The memory locations where
these records are stored are known as data buckets or data blocks.

A hash function can use any column value to generate the address. Most of the time, the
hash function uses the primary key to generate the address of the data block. A hash function
can be anything from a simple mathematical function to a complex one. We can even consider
the primary key itself as the address of the data block. That means each row is stored in the
data block whose address is the same as its primary key.

The above diagram shows data block addresses that are the same as the primary key values.
The hash function can also be a simple mathematical function like exponential, mod, cos, sin,
etc. Suppose we have a mod (5) hash function to determine the address of the data block. In
this case, it applies the mod (5) hash function to the primary keys and generates 3, 3, 1, 4
and 2 respectively, and the records are stored at those data block addresses.
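
The mod (5) example can be sketched in Python. The primary key values below are hypothetical, chosen only because they hash to 3, 3, 1, 4 and 2; the text does not list the actual keys.

```python
NUM_BUCKETS = 5

def hash_address(primary_key, num_buckets=NUM_BUCKETS):
    # mod (5) hash function: the remainder is the data block address.
    return primary_key % num_buckets

# Hypothetical primary keys that produce the addresses 3, 3, 1, 4, 2.
keys = [98, 103, 101, 104, 102]

# Store each key in the data bucket given by its hash address.
buckets = {}
for key in keys:
    buckets.setdefault(hash_address(key), []).append(key)

print([hash_address(k) for k in keys])  # [3, 3, 1, 4, 2]
print(buckets[3])                       # [98, 103]
```

Two of these keys land on address 3, so this sketch simply keeps a list of records per bucket.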
