Indexing

The document discusses indexing and hashing techniques for efficient data representation and retrieval. It covers various types of indices, including ordered and hash indices, as well as concepts like dense and sparse indices, multi-level indices, and B+ tree structures. Additionally, it addresses hashing methods, including static and dynamic hashing, to optimize data access and manage database growth.

Uploaded by

Muhammad Aamir Khan

Indexing and Hashing

B.Ramamurthy
Chapter 11

01/29/25 B.Ramamurthy 1
Representing Data
Attributes are represented in fixed- or variable-length collections called “fields”.
Fields in turn are grouped into fixed- or variable-length collections called records.
Records are stored in physical blocks.
A collection of records that forms a relation is stored as a collection of blocks called a file.
  - This file is different from an OS file. How?
  - The organization is different.
  - Extra indices are kept to make search and access easy.

Basic Concepts (indexing)
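Indexing works the same way as a catalog for books in a library.
Indexing needs to be efficient to allow fast access to records.
Two types of indices:
  - ordered indices, based on a sorted ordering of the search-key values, and
  - hash indices, based on distributing the values across buckets with a hash function.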

Techniques and Evaluation
Access types: the types of access that are supported efficiently, such as search by a specific value or by a range of values.
Access time: the time it takes to find a particular data item or set of items.
Insertion time: the time it takes to insert a new item.
Deletion time: the time it takes to delete an item.
Space overhead: the additional space occupied by the index structure.

Ordered Indices
To gain fast access to records in a file we can use an index structure.
If the file containing the records is sequentially ordered, the index whose search key specifies the sequential order of the file is the primary index.
Primary indices are also called clustering indices. The search key of a primary index is usually, but not necessarily, the primary key.

Primary Index
Assume that all files are ordered
sequentially on some search key.
Such files, with a primary index on the search key, are called index-sequential files.
These files accommodate both
sequential and random access to
individual records.

Dense and Sparse Index
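Dense index:
  - An index record appears for every search-key value in the file.
  - The index record contains the search-key value and a pointer to the first data record with that search-key value.
Sparse index:
  - Index records appear for only some of the search-key values.
  - Each index record contains a search-key value and a pointer to the first data record with that value.
  - To locate a record, find the index record with the largest search-key value less than or equal to the desired value, then scan the file sequentially from the record it points to.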
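The two lookup procedures can be sketched in a few lines of Python. This is only an illustration: the sample rows are the ones shown on the dense and sparse index slides, list positions stand in for disk addresses, and the names `dense_lookup`, `sparse_lookup` and `scan_from` are made up for this sketch.

```python
from bisect import bisect_left, bisect_right

# Data file, sorted on the search key (first field). A list position
# plays the role of a record address on disk (an assumption of this sketch).
records = [
    ("Brighton", "A-217", 750), ("Downtown", "A-101", 500),
    ("Downtown", "A-110", 600), ("Mianus", "A-215", 700),
    ("Perryridge", "A-102", 400), ("Perryridge", "A-201", 900),
    ("Perryridge", "A-218", 700), ("Redwood", "A-222", 700),
    ("Round Hill", "A-305", 350),
]

# Dense index: one (key, position of first record) entry per distinct key.
dense = []
for pos, rec in enumerate(records):
    if not dense or dense[-1][0] != rec[0]:
        dense.append((rec[0], pos))
dense_keys = [k for k, _ in dense]

# Sparse index: entries for only some of the keys.
sparse = [e for e in dense if e[0] in {"Brighton", "Mianus", "Redwood"}]
sparse_keys = [k for k, _ in sparse]

def scan_from(pos, key):
    """Scan the sorted file from pos and collect the records with this key."""
    out = []
    while pos < len(records) and records[pos][0] <= key:
        if records[pos][0] == key:
            out.append(records[pos])
        pos += 1
    return out

def dense_lookup(key):
    i = bisect_left(dense_keys, key)
    if i == len(dense_keys) or dense_keys[i] != key:
        return []                       # no index entry means no such record
    return scan_from(dense[i][1], key)

def sparse_lookup(key):
    i = bisect_right(sparse_keys, key) - 1   # largest index key <= key
    if i < 0:
        return []
    return scan_from(sparse[i][1], key)
```

Both functions return the same records. The sparse version may scan past a few non-matching records first, which is the access-time cost paid for the smaller index.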

Dense Index
Index entry        Data records
Brighton     --->  Brighton    A-217  750
Downtown     --->  Downtown    A-101  500
                   Downtown    A-110  600
Mianus       --->  Mianus      A-215  700
Perryridge   --->  Perryridge  A-102  400
                   Perryridge  A-201  900
                   Perryridge  A-218  700
Redwood      --->  Redwood     A-222  700
Round Hill   --->  Round Hill  A-305  350

Sparse Index
Index entry        Data records
Brighton     --->  Brighton    A-217  750
                   Downtown    A-101  500
                   Downtown    A-110  600
Mianus       --->  Mianus      A-215  700
                   Perryridge  A-102  400
                   Perryridge  A-201  900
                   Perryridge  A-218  700
Redwood      --->  Redwood     A-222  700
                   Round Hill  A-305  350

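Which one is better, dense or sparse? It is a trade-off between access time and space overhead: a dense index generally locates a record faster, while a sparse index takes less space and needs less maintenance on insertions and deletions.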

Multi-level Indices
Indices themselves may become too large for efficient processing.
Example:
  - Consider a file with 100,000 records and 10 records per block, giving 10,000 blocks.
  - With a sparse index and one index entry per block, we have about 10,000 index entries.
  - Assuming 100 index entries fit into a block, the index needs about 100 blocks.
  - It is desirable to keep the index file in main memory.
  - Problem: searching a large index file becomes expensive.
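The arithmetic in this example can be checked with a short Python calculation. The variable names are made up for this sketch, and the last step uses the usual estimate that binary search over b sorted blocks reads about ceil(log2(b)) of them.

```python
import math

records = 100_000
records_per_block = 10
entries_per_index_block = 100

# Number of data blocks in the file.
data_blocks = math.ceil(records / records_per_block)

# Sparse index with one entry per data block.
index_entries = data_blocks

# Blocks needed to store the index itself.
index_blocks = math.ceil(index_entries / entries_per_index_block)

# Rough number of index blocks read by a binary search over the index file.
binary_search_reads = math.ceil(math.log2(index_blocks))

print(data_blocks, index_entries, index_blocks, binary_search_reads)
```

With these figures the index occupies 100 blocks, and a binary search over it reads about 7 of them before the data block is even touched. That cost is what makes a large single-level index expensive when it does not fit in memory.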

Multi-level Index
Solution: index the index file. We treat the index as we would treat any other sequential file and construct a sparse index on the primary index.
We binary-search the outer-level index to find the largest search key less than or equal to the one we desire. Its pointer leads to a block of the inner index, which we search in the same way to reach the data block that may hold the record.
Two-level sparse index: see Figure 11.4.
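A two-level lookup can be sketched as follows. This is a simplified model, not the layout of Figure 11.4: it assumes unique search keys, uses Python lists as blocks, and the names `build_levels`, `floor_entry` and `lookup` are invented for the illustration.

```python
from bisect import bisect_right

def build_levels(sorted_keys, per_block):
    """Split keys into data blocks, then build inner and outer sparse indices."""
    data_blocks = [sorted_keys[i:i + per_block]
                   for i in range(0, len(sorted_keys), per_block)]
    # Inner index: one (first key in block, block number) entry per data block,
    # itself stored in blocks of per_block entries.
    inner = [(blk[0], n) for n, blk in enumerate(data_blocks)]
    inner_blocks = [inner[i:i + per_block]
                    for i in range(0, len(inner), per_block)]
    # Outer index: one entry per inner index block.
    outer = [(blk[0][0], n) for n, blk in enumerate(inner_blocks)]
    return data_blocks, inner_blocks, outer

def floor_entry(entries, key):
    """Return the entry with the largest key <= key, or None if there is none."""
    i = bisect_right([k for k, _ in entries], key) - 1
    return entries[i] if i >= 0 else None

def lookup(key, data_blocks, inner_blocks, outer):
    hit = floor_entry(outer, key)            # search the outer index
    if hit is None:
        return False                         # key is smaller than every key stored
    hit = floor_entry(inner_blocks[hit[1]], key)   # search one inner index block
    return key in data_blocks[hit[1]]        # scan one data block
```

Each lookup touches the outer index, one inner index block and one data block, instead of searching the whole inner index.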

Secondary Index
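A secondary index is an index whose search key does not determine the sequential order of the file.
A secondary index must be dense, because records with the same search-key value may be scattered throughout the file.
If the search key of a secondary index is not a candidate key, we can use an extra level of indirection: each index entry points to a bucket that holds pointers to all records with that value.
See Fig. 11.5.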

Secondary Index
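Index entry   Bucket (pointers to records)
350    --->   A-305
400    --->   A-102
500    --->   A-101
600    --->   A-110
700    --->   A-215, A-218, A-222
750    --->   A-217
900    --->   A-201

Data records
Brighton    A-217  750
Downtown    A-101  500
Downtown    A-110  600
Mianus      A-215  700
Perryridge  A-102  400
Perryridge  A-201  900
Perryridge  A-218  700
Redwood     A-222  700
Round Hill  A-305  350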
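A secondary index with buckets can be sketched in Python using the same sample rows. List positions again stand in for record pointers, and the names `buckets`, `lookup` and `range_lookup` are assumptions of this sketch rather than anything defined in the slides.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

# File ordered on the first field; the secondary index is on the third field.
records = [
    ("Brighton", "A-217", 750), ("Downtown", "A-101", 500),
    ("Downtown", "A-110", 600), ("Mianus", "A-215", 700),
    ("Perryridge", "A-102", 400), ("Perryridge", "A-201", 900),
    ("Perryridge", "A-218", 700), ("Redwood", "A-222", 700),
    ("Round Hill", "A-305", 350),
]

# Each distinct value maps to a bucket holding the positions of ALL records
# with that value, since they are not stored next to each other.
buckets = defaultdict(list)
for pos, rec in enumerate(records):
    buckets[rec[2]].append(pos)

# Dense, ordered list of search-key values (the index entries).
index = sorted(buckets)

def lookup(value):
    """Return every record whose third field equals value."""
    return [records[p] for p in buckets.get(value, [])]

def range_lookup(low, high):
    """Return records with low <= third field <= high, in index order."""
    out = []
    for value in index[bisect_left(index, low):bisect_right(index, high)]:
        out.extend(lookup(value))
    return out
```

A lookup for 700 follows one index entry to a bucket with three pointers, and each pointer may lead to a different block. That is why scanning in secondary-key order tends to be expensive.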

B+ Tree Index Files
The main disadvantage of the index-sequential file organization is that performance degrades as the file grows, both for index lookups and for sequential scans.
The B+ tree index structure is the most widely used of several index structures that maintain their efficiency despite insertion and deletion of data.

B+ Tree Index files
A B+ tree index is a balanced tree in which every path from the root to a leaf has the same length and each non-leaf node other than the root has between ceiling(n/2) and n children, where n is fixed for a particular tree.
A typical node of a B+ tree contains:
n-1 search keys K1, K2, …, Kn-1
n pointers P1, P2, …, Pn

B+ Tree Node

P1 K1 P2 K2 …… Pn-1 Kn-1 Pn
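A lookup over nodes of this shape can be sketched as below. This is a minimal illustration, not a full B+ tree: it handles search only, omits the leaf-to-leaf pointer Pn, and assumes that in a non-leaf node pointer Pi leads to keys less than Ki and pointer Pi+1 to keys greater than or equal to Ki. The `Node` class and `find` function are hypothetical names.

```python
from bisect import bisect_right
from dataclasses import dataclass

@dataclass
class Node:
    keys: list              # search keys in sorted order
    pointers: list          # child nodes (non-leaf) or record positions (leaf)
    is_leaf: bool = False

def find(node, key):
    """Walk from the root down to a leaf, then look for the key in that leaf."""
    while not node.is_leaf:
        # Count the keys <= key; that count is the index of the child to follow.
        node = node.pointers[bisect_right(node.keys, key)]
    for k, record in zip(node.keys, node.pointers):
        if k == key:
            return record
    return None

# A small hand-built tree with n = 3; leaf pointers are made-up record positions.
leaf0 = Node(["Brighton", "Downtown"], [0, 1], is_leaf=True)
leaf1 = Node(["Mianus", "Perryridge"], [3, 4], is_leaf=True)
leaf2 = Node(["Redwood", "Round Hill"], [7, 8], is_leaf=True)
root = Node(["Mianus", "Redwood"], [leaf0, leaf1, leaf2])
```

Because every root-to-leaf path has the same length, each lookup visits the same number of nodes, here two.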

B+ Tree (contd.)
Structure of a B+ tree
Queries on B+ trees
Updates on B+ trees (insertion, deletion)
B+ tree file organization
B-tree, a variation of the B+ tree that avoids redundant storage of search-key values

Hashing
Can we avoid the I/O operations that result from accessing the index file?
Hashing offers a way.
It also provides a way of constructing indices (which need not be sequential).
We will study static and dynamic hashing.

Hash File Organization
The address of the disk block containing a desired record is computed by applying a function (the hash function) to the search key.
Let K denote the set of all search-key values and B the set of all bucket addresses. A hash function h is a function from K to B.
A bucket is typically a disk block.

Operations
To insert a record with key Ki, compute h(Ki), which gives the address of the bucket for the record. If there is space in the bucket, the record is stored in that bucket; otherwise the overflow must be handled, for example by chaining an overflow bucket.
To look up a record with key Ki, compute h(Ki) and check every record in the bucket, since different keys can hash to the same bucket.
To delete a record with key Ki, compute h(Ki), find the record in the bucket, and delete it.
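The three operations can be sketched as a small static hash file with overflow chaining. Everything here is illustrative: the `Bucket` and `HashFile` classes, the bucket capacity, and the stand-in hash function (sum of character codes modulo the number of buckets) are assumptions chosen to keep the example deterministic.

```python
class Bucket:
    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []        # (key, value) pairs stored in this bucket
        self.overflow = None     # next bucket in the overflow chain, if any

class HashFile:
    def __init__(self, n_buckets=4, capacity=2):
        self.capacity = capacity
        self.buckets = [Bucket(capacity) for _ in range(n_buckets)]

    def _h(self, key):
        # Stand-in hash function h: maps a key to a bucket address.
        return sum(map(ord, str(key))) % len(self.buckets)

    def insert(self, key, value):
        b = self.buckets[self._h(key)]
        # Walk the chain until a bucket with free space is found.
        while len(b.records) >= b.capacity:
            if b.overflow is None:
                b.overflow = Bucket(self.capacity)
            b = b.overflow
        b.records.append((key, value))

    def lookup(self, key):
        out = []
        b = self.buckets[self._h(key)]
        while b is not None:
            # Every record must be checked: other keys may share the bucket.
            out += [v for k, v in b.records if k == key]
            b = b.overflow
        return out

    def delete(self, key):
        removed = 0
        b = self.buckets[self._h(key)]
        while b is not None:
            kept = [(k, v) for k, v in b.records if k != key]
            removed += len(b.records) - len(kept)
            b.records = kept
            b = b.overflow
        return removed
```

The number of buckets never changes in this sketch, which is what makes it static: as more records arrive, the overflow chains simply get longer.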
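
Hash Functions
A hash function should be chosen so that
  - the distribution is uniform: each bucket is assigned the same number of search-key values from the set of all possible values;
  - the distribution is random: on average each bucket holds nearly the same number of values, whatever the actual distribution of search-key values in the file.
Handling bucket overflows:
  - Overflow may occur because there are too few buckets.
  - Overflow may occur because of bucket skew, where some buckets receive more records than others.
  - Solutions: overflow buckets, chaining, double hashing, linear probing, quadratic probing.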

Hash Indices
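Hashing can be used for organizing indices as well as files. A hash index organizes search keys, with their associated pointers, into a hash file structure.
See Fig. 11.22.
Strictly speaking, hash indices are only secondary index structures: a file that is itself organized by hashing needs no separate primary hash index on the same search key.

Dynamic Hashing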
Many of today’s databases grow very large in a short time.
With a static hash function we have three options:
  - Choose the hash function based on the current file size. Performance degrades as the database grows.
  - Choose the hash function based on the anticipated size. Space is wasted initially.
  - Periodically restructure the hash file in response to growth. Reorganization is expensive and disrupts access.
Another solution: dynamic hashing.

Dynamic Hash Techniques
Dynamic hash techniques allow the hash function to be modified dynamically to accommodate the growth and shrinkage of the database.
One form of dynamic hashing is extendable hashing.
Extendable hashing copes with changes in database size by splitting buckets as the database grows and coalescing them as it shrinks.
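Bucket splitting can be sketched as follows. This is a simplified variant under stated assumptions: it indexes the directory with the low-order bits of Python's built-in `hash` (textbook presentations often use a high-order prefix instead), it implements growth only and leaves out coalescing, and it raises an error rather than chaining overflow buckets if too many keys share the same hash bits. The class and method names are invented for the illustration.

```python
class Bucket:
    def __init__(self, depth, capacity):
        self.depth = depth        # local depth: hash bits this bucket depends on
        self.capacity = capacity
        self.items = {}           # key -> value

class ExtendableHash:
    MAX_DEPTH = 32                # guard against endless splitting

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.global_depth = 0
        self.directory = [Bucket(0, capacity)]   # always 2**global_depth entries

    def _index(self, key):
        # Directory index = low-order global_depth bits of the hash value.
        return hash(key) & ((1 << self.global_depth) - 1)

    def lookup(self, key):
        return self.directory[self._index(key)].items.get(key)

    def insert(self, key, value):
        while True:
            b = self.directory[self._index(key)]
            if key in b.items or len(b.items) < b.capacity:
                b.items[key] = value
                return
            self._split(b)        # bucket is full: split it and retry

    def _split(self, b):
        if b.depth >= self.MAX_DEPTH:
            raise OverflowError("too many keys share the same hash bits")
        if b.depth == self.global_depth:
            # Double the directory; the new half points to the same buckets.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        b.depth += 1
        new = Bucket(b.depth, self.capacity)
        bit = 1 << (b.depth - 1)  # the hash bit that now separates the two buckets
        for i, entry in enumerate(self.directory):
            if entry is b and i & bit:
                self.directory[i] = new
        for k in list(b.items):
            if hash(k) & bit:
                new.items[k] = b.items.pop(k)
```

Only the overflowing bucket is rewritten on a split. The directory doubles only when that bucket already uses every bit the directory distinguishes, so the structure grows incrementally instead of being rebuilt.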
