INDEXING AND QUERY OPTIMIZATION
Concept of indexing and performance improvement
CONCEPT OF INDEXING
What is Indexing?
Indexing is a technique used in database systems to speed up data retrieval
by avoiding a scan of every row in a table. It is similar to the index of a book, which lets
a reader jump straight to the page that holds the information.
How Indexing Works
An index is a separate data structure (typically a B-tree or a hash table) that stores key
values from one or more columns together with pointers (references) to the corresponding rows in the table.
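The idea above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a sorted list stands in for the B-tree, list positions stand in for row pointers, and the rows and the names `dept_index` and `lookup` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical table: rows are stored in insertion order, not sorted by department.
rows = [
    (3, "Chen", "Sales"),
    (1, "Ada", "Engineering"),
    (4, "Dev", "Finance"),
    (2, "Bo", "Sales"),
]

# The index: sorted (key, row position) pairs for the department column.
dept_index = sorted((row[2], pos) for pos, row in enumerate(rows))
keys = [key for key, _ in dept_index]

def lookup(department):
    """Find matching rows by binary search on the index instead of scanning rows."""
    i = bisect_left(keys, department)
    matches = []
    while i < len(keys) and keys[i] == department:
        matches.append(rows[dept_index[i][1]])  # follow the pointer to the row
        i += 1
    return matches

print(lookup("Sales"))
```

The table itself is never reordered; only the small index structure is kept sorted, which is what makes the binary search possible.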
Purpose of Indexing
Speed up searching and retrieval
Optimize WHERE, JOIN, ORDER BY, and GROUP BY clauses
Reduce disk I/O operations
PERFORMANCE IMPROVEMENT THROUGH INDEXING
Key Benefits
Fast Lookups - Avoids scanning the entire table by searching the sorted index, much like a binary search
Efficient Sorting - Indexes help efficiently execute ORDER BY
Improved Join Performance - Indexes on foreign keys help speed up join queries
Faster Filtering - Enhances query response time when filtering with WHERE
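As a rough illustration of the sorting benefit, the sketch below uses Python's built-in sqlite3 module to compare the query plan for an ORDER BY before and after an index exists. SQLite is an assumption made here for convenience (the notes do not name a DBMS), the index name `idx_emp_name` and the `plan` helper are illustrative, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee ("
    "emp_id INT PRIMARY KEY, name VARCHAR(50), department VARCHAR(20))"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1, "Ada", "Engineering"), (2, "Bo", "Sales"), (3, "Chen", "Sales")],
)

def plan(sql):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text

query = "SELECT name FROM Employee ORDER BY name"
before = plan(query)  # no index on name: an explicit sort step is needed
conn.execute("CREATE INDEX idx_emp_name ON Employee(name)")
after = plan(query)   # index on name: rows can be read in index order

print("before:", before)
print("after: ", after)
```

Without the index, the plan includes a temporary B-tree built just for sorting; with it, the planner can walk the index in order and skip the sort.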
TYPES OF INDEXES
There are several types of indexes: primary, secondary, clustered, and non-clustered.
Let's look at each of them.
PRIMARY INDEX
A primary index is created automatically when a table is defined with a primary key.
It is unique and, in many systems (for example SQL Server and MySQL InnoDB), clustered by default.
Characteristics:
One per table.
Based on the primary key field.
Where the primary index is clustered, entries in the data file are physically ordered by the primary key.
Example:
CREATE TABLE Employee (
emp_id INT PRIMARY KEY,
name VARCHAR(50),
department VARCHAR(20)
);
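A quick way to see the automatic index is to run the same table definition in SQLite through Python's sqlite3 module. This is a sketch under the assumption that SQLite is an acceptable stand-in. SQLite does not cluster this index, so it illustrates only automatic creation and uniqueness, and the inserted rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Employee (
    emp_id INT PRIMARY KEY,
    name VARCHAR(50),
    department VARCHAR(20)
)""")

# PRAGMA index_list reports one row per index: seq, name, unique flag, ...
indexes = conn.execute("PRAGMA index_list('Employee')").fetchall()
auto_name, is_unique = indexes[0][1], indexes[0][2]
print(auto_name, is_unique)

# The unique index rejects a second row with the same primary key.
conn.execute("INSERT INTO Employee VALUES (1, 'Ada', 'Engineering')")
try:
    conn.execute("INSERT INTO Employee VALUES (1, 'Bo', 'Sales')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print("duplicate rejected:", duplicate_rejected)
```

No CREATE INDEX statement was issued, yet the table already has a unique index backing the primary key.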
SECONDARY INDEX
A secondary index is created on non-primary-key attributes.
It speeds up queries that filter or sort on non-primary columns.
Characteristics:
Can be multiple per table.
Does not affect physical order of data.
Requires additional pointers to rows.
Example:
CREATE INDEX dept_index ON Employee(department);
Note: Secondary indexes are best for queries on frequently searched columns that are not primary
keys.
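The effect of the secondary index can be checked with a query plan. The sketch below assumes SQLite via Python's sqlite3 module. The `plan` helper and the sample query are illustrative, and the plan text differs slightly between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee ("
    "emp_id INT PRIMARY KEY, name VARCHAR(50), department VARCHAR(20))"
)

def plan(sql):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

query = "SELECT name FROM Employee WHERE department = 'Sales'"
before = plan(query)  # full table scan: department is not indexed yet
conn.execute("CREATE INDEX dept_index ON Employee(department)")
after = plan(query)   # index search on department

print("before:", before)
print("after: ", after)
```

The plan changes from a SCAN of the whole table to a SEARCH that uses dept_index, which is the behaviour the note above describes.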
CLUSTERED INDEX
A clustered index sorts and stores the table's data rows physically according to the index key values.
Characteristics:
Only one clustered index per table.
Table is restructured to follow the index order.
The data rows themselves form the leaf level of the index.
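Example (SQL Server syntax; this fails if the table already has a clustered index, which a PRIMARY KEY creates by default in SQL Server):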
CREATE CLUSTERED INDEX idx_emp_name ON Employee(name);
Note: Clustered indexes are best for:
Range-based queries.
Queries that benefit from sequential access (e.g., date ranges, alphabetical names).
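Why clustering helps range queries can be sketched in plain Python. The example assumes a hypothetical set of Employee rows kept physically sorted by name. A sorted list stands in for the clustered table, and `range_by_name` is an illustrative helper, not part of any DBMS.

```python
from bisect import bisect_left, bisect_right

# Hypothetical rows (emp_id, name, department), stored sorted by the clustering key: name.
rows = sorted(
    [
        (3, "Chen", "Sales"),
        (1, "Ada", "Engineering"),
        (4, "Dev", "Finance"),
        (2, "Bo", "Sales"),
        (5, "Eve", "Finance"),
    ],
    key=lambda row: row[1],
)
names = [row[1] for row in rows]

def range_by_name(low, high):
    """Return rows with low <= name <= high as one contiguous slice."""
    start = bisect_left(names, low)    # first row with name >= low
    end = bisect_right(names, high)    # one past the last row with name <= high
    return rows[start:end]             # sequential read, no per-row pointer chasing

print(range_by_name("B", "D"))
```

Because the rows are stored in key order, the matching range is a single contiguous block: two binary searches find its ends and everything in between is read sequentially.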
NON-CLUSTERED INDEX
A non-clustered index is a structure separate from the table data that holds pointers to the actual data rows.
Characteristics:
Can have multiple non-clustered indexes per table.
The index order is independent of the physical order of the data rows.
Uses an index key and a row locator.
Example (SQL Server syntax):
CREATE NONCLUSTERED INDEX idx_dept ON Employee(department);
Note: Non-clustered indexes are best for:
Tables that are searched on several different columns.
Queries that don’t rely on the physical order of data.
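To illustrate several non-clustered indexes coexisting on one table, the sketch below again assumes SQLite through Python's sqlite3 module. SQLite has no NONCLUSTERED keyword, but its ordinary CREATE INDEX builds a separate structure that points back to the rows, so it serves as a stand-in here. The index name `idx_name` and the `plan` helper are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee ("
    "emp_id INT PRIMARY KEY, name VARCHAR(50), department VARCHAR(20))"
)
# Two independent indexes on the same table; neither changes how rows are stored.
conn.execute("CREATE INDEX idx_dept ON Employee(department)")
conn.execute("CREATE INDEX idx_name ON Employee(name)")

def plan(sql):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

by_dept = plan("SELECT * FROM Employee WHERE department = 'Sales'")
by_name = plan("SELECT * FROM Employee WHERE name = 'Ada'")

print(by_dept)
print(by_name)
```

Each query is served by the index that matches its WHERE column, which is why a table searched on several columns can carry one non-clustered index per search pattern.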
NB
Use a clustered index for range-based retrievals.
Use non-clustered indexes for frequently queried columns.
Avoid excessive indexing: every index speeds up reads but slows down INSERT, UPDATE, and DELETE, so balance read and write performance.
Always analyze your query execution plans using tools like:
EXPLAIN (MySQL, PostgreSQL)
Execution plans in SQL Server Management Studio, or SET SHOWPLAN_ALL (SQL Server)
EXPLAIN PLAN (Oracle)