Indexing Lecture Nov 2023 Summary
Dr David Hamill
Physical storage of data
1. Indexes are used to retrieve data from the database quickly, without scanning every row.
2. Indexes work just like an index in a book.
3. Users cannot see the indexes; they are only used to speed up
searches/queries.
4. A primary key is indexed automatically.
Note: Updating a table with indexes takes more time than updating a table
without (because the indexes also need an update). So, only create indexes on
fields that will be frequently searched against.
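A minimal sketch of the points above, using Python's built-in sqlite3 module rather than any particular server product. The staff table, its columns and the index name idx_staff_surname are hypothetical. It lists the index created automatically for the primary key and compares the query plan before and after an index is added on a frequently searched field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; a TEXT primary key makes SQLite create an index automatically.
conn.execute(
    "CREATE TABLE staff (staff_id TEXT PRIMARY KEY, surname TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("12345", "Stone", 52), ("12346", "Adams", 34), ("12347", "Stewart", 61)],
)


def index_names(connection, table):
    """Return the names of all indexes SQLite holds for a table."""
    return [row[1] for row in connection.execute(f"PRAGMA index_list('{table}')")]


def plan(connection, sql, params=()):
    """Return the query plan text SQLite reports for a statement."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT * FROM staff WHERE surname = ?"
before = plan(conn, query, ("Stone",))   # no index on surname yet: full scan
conn.execute("CREATE INDEX idx_staff_surname ON staff (surname)")
after = plan(conn, query, ("Stone",))    # the new index can now be used

print(index_names(conn, "staff"))
print(before)
print(after)
```

The exact plan wording varies between SQLite versions, so the sketch only prints it. Every extra index created this way must also be maintained on each insert, update and delete, which is the cost described in the note.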
Overview
Introduction
Multi-Level Indexes
Distinct values
The index file requires significantly fewer blocks
than the data file
• Sparse index
• Index file record typically smaller in size than data file record
• On insertion and deletion, not only do we have to move records in the data file, we also have to
change some index entries
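One way to picture why a sparse index needs far fewer blocks than the data file is the sketch below. It assumes a data file sorted by key and held in fixed-size blocks, with one index entry per block; the block contents and the lookup function are hypothetical and only illustrate the idea.

```python
from bisect import bisect_right

# Hypothetical data file: each block holds a few records sorted by key.
data_blocks = [
    [(5, "Adams"), (12, "Baker"), (19, "Clark")],
    [(23, "Doyle"), (31, "Evans"), (38, "Ford")],
    [(42, "Grant"), (57, "Hughes"), (64, "Irwin")],
]

# Sparse index: one (first key in block, block number) entry per data block.
sparse_index = [(block[0][0], block_no) for block_no, block in enumerate(data_blocks)]
index_keys = [key for key, _ in sparse_index]


def lookup(key):
    """Find a record via the sparse index, then scan a single data block."""
    pos = bisect_right(index_keys, key) - 1  # last index entry with key <= target
    if pos < 0:
        return None  # target is smaller than every key in the file
    block_no = sparse_index[pos][1]
    for record_key, value in data_blocks[block_no]:
        if record_key == key:
            return value
    return None


print(len(sparse_index), "index entries for", sum(map(len, data_blocks)), "records")
print(lookup(31))
print(lookup(40))
```

Because each index entry is only a key and a block number, the index stays small, but inserting a record that changes the first key of a block would also mean changing the matching index entry.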
Clustering Indexes
• A clustered index is faster; a non-clustered index is slower.
• The clustered index requires less memory for operations; a non-clustered index requires more memory for operations.
Indexes
• Index records are fixed length, consisting of two fields.
• The second field is a pointer to a disk block.
• Storage overhead is not typically a serious problem.
Secondary (Non-Clustered) Indexes
1. Difference 1: Only one clustered index per table. You can create
multiple non-clustered indexes in a single table
2. Difference 2: Clustered indexes only sort tables. Therefore, they do
not consume extra storage. Non-clustered indexes are stored in a
separate place from the actual table claiming more storage space.
3. Difference 3: Clustered indexes are faster than Non-clustered indexes
since they don’t involve any extra lookup step.
Multi-Level Indexes
When an index file becomes large and extends over many pages, the search time for the required index entry increases.
A multi-level index attempts to overcome this problem by reducing the search range:
• Treat the index like any other file
• Split the index into a number of smaller indexes
• Maintain an index to the indexes
• Multilevel indexes refer to a hierarchical structure of indexes.
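The idea of splitting an index and keeping an index to the indexes can be sketched in a few lines. This is an illustration under stated assumptions, not a real storage layout: the fan-out of three entries per index block, the keys and the function name find_data_block are all hypothetical.

```python
from bisect import bisect_right

ENTRIES_PER_INDEX_BLOCK = 3  # assumed fan-out, kept small for illustration

# First-level index: sorted (key, data block number) entries.
first_level = [(5, 0), (23, 1), (42, 2), (70, 3), (88, 4), (95, 5), (110, 6)]

# Split the first-level index into a number of smaller indexes ...
index_blocks = [
    first_level[i:i + ENTRIES_PER_INDEX_BLOCK]
    for i in range(0, len(first_level), ENTRIES_PER_INDEX_BLOCK)
]
# ... and maintain an index to the indexes (first key of each index block).
second_level = [block[0][0] for block in index_blocks]


def find_data_block(key):
    """Descend from the top-level index to the data block that may hold key."""
    top = bisect_right(second_level, key) - 1
    if top < 0:
        return None  # key is smaller than every indexed key
    block = index_blocks[top]
    inner = bisect_right([k for k, _ in block], key) - 1
    return block[inner][1]


print(second_level)
print(find_data_block(90))
```

Each search now touches one small top-level block and one first-level block instead of scanning the whole first-level index, which is the reduction in search range described above.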
Clustered
NonClustered
Clustered Index
• A clustered index defines the order in which data is physically stored in a table. Table data
can be sorted in only one way; therefore, there can be only one clustered index per table. In SQL
Server, the primary key constraint by default creates a clustered index on that particular
column.
• The only time the data rows in a table are stored in sorted order is when the table
contains a clustered index. When a table has a clustered index, the table is called a
clustered table. If a table has no clustered index, its data rows are stored in an
unordered structure called a heap.
• A non-clustered index doesn’t sort the physical data inside the table. In fact, a non-clustered index is
stored at one place and table data is stored in another place. This is similar to a textbook where the
book content is located in one place and the index is located in another. This allows for more than one
non-clustered index per table.
• The pointer from an index row in a nonclustered index to a data row is called a row locator. The
structure of the row locator depends on whether the data pages are stored in a heap or a clustered
table. For a heap, a row locator is a pointer to the row. For a clustered table, the row locator is the
clustered index key.
• You can add nonkey columns to the leaf level of the nonclustered index to bypass existing index key
limits and execute fully covered, indexed queries. Both clustered and nonclustered indexes can be
unique.
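The row locator idea can be pictured with a small in-memory model. This is a sketch under assumptions, not how SQL Server stores pages: the rows, the surname index and both function names are hypothetical. It shows a non-clustered index on a clustered table holding the clustered key as its row locator, so that a lookup takes one extra step.

```python
from bisect import bisect_left

# Clustered table: rows kept sorted by the clustered key (staff_id).
# All names and values here are hypothetical.
clustered_rows = [
    (101, "Stone", "Finance"),
    (102, "Adams", "Human Resources"),
    (103, "Stewart", "Finance"),
    (104, "Brown", "Human Resources"),
]
clustered_keys = [row[0] for row in clustered_rows]


def clustered_seek(staff_id):
    """Seek directly on the clustered key; the row itself is found at once."""
    pos = bisect_left(clustered_keys, staff_id)
    if pos < len(clustered_keys) and clustered_keys[pos] == staff_id:
        return clustered_rows[pos]
    return None


# Non-clustered index on surname, stored separately from the table.
# Each entry is (index key, row locator); on a clustered table the
# row locator is the clustered index key.
surname_index = sorted((row[1], row[0]) for row in clustered_rows)


def nonclustered_seek(surname):
    """Two steps: search the separate index, then follow each row locator."""
    pos = bisect_left(surname_index, (surname,))
    rows = []
    while pos < len(surname_index) and surname_index[pos][0] == surname:
        rows.append(clustered_seek(surname_index[pos][1]))  # the extra lookup
        pos += 1
    return rows


print(clustered_seek(103))
print(nonclustered_seek("Stone"))
```

If the surname index also carried the department as a nonkey column, a query asking only for surname and department could be answered from the index alone, which is what a fully covered query means.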
[Figure: throughput (queries/sec), B-Tree vs. hash index]
16/11/2023
B+-Trees
Special type of tree structure used for search purposes
[Figure: tree diagram with Level 0 (A), Level 1 (B, C) and Level 2 (D, E, F), labelled Root Node, Child Node, Internal Node and Leaf Node]
B-Trees
◦ Introduced in 1970, B-trees are still the prevailing data
structure for indexes in relational databases
◦ A search tree with some additional constraints on it.
◦ These constraints ensure that the tree is always
balanced and that the space wasted by deletion (if any)
never becomes excessive.
B+-Trees
◦ Most implementations of dynamic multilevel index use
a variation of the B-tree data structure called a B+-Tree
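A minimal sketch of searching a B+-tree follows. It assumes the usual textbook layout (separator keys in internal nodes, all records in chained leaves) and uses a small hand-built tree with hypothetical keys; insertion, deletion and rebalancing are deliberately left out.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Leaf:
    keys: List[int]
    values: List[str]
    next: Optional["Leaf"] = None  # leaves are chained for range scans


@dataclass
class Internal:
    keys: List[int]  # separator keys
    children: list = field(default_factory=list)  # one more child than keys


def find_leaf(node, key):
    """Descend from the root to the leaf whose key range covers key."""
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.keys, key)]
    return node


def search(root, key):
    """Return the value stored under key, or None if it is absent."""
    leaf = find_leaf(root, key)
    if key in leaf.keys:
        return leaf.values[leaf.keys.index(key)]
    return None


def range_scan(root, low, high):
    """Return values with low <= key < high by walking the leaf chain."""
    leaf, out = find_leaf(root, low), []
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k >= high:
                return out
            if k >= low:
                out.append(v)
        leaf = leaf.next
    return out


# A small hand-built tree (hypothetical keys), not produced by insertions.
l1 = Leaf([5, 12], ["a", "b"])
l2 = Leaf([20, 27], ["c", "d"])
l3 = Leaf([40, 51], ["e", "f"])
l1.next, l2.next = l2, l3
root = Internal(keys=[20, 40], children=[l1, l2, l3])

print(search(root, 27))
print(range_scan(root, 12, 41))
```

Every search follows one path from the root to a leaf, so the cost depends on the height of the tree, and the chained leaves let a range scan continue without going back up to the root.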
Troubleshooting Techniques
Table tuning – Indexing
B-Tree example (YouTube)
• http://www.youtube.com/watch?v=coRJrcIYbF4
Query type dictates the best index
Point Query
This is a query that returns at most one record, because the WHERE
condition is an equality test on a unique attribute
Eg:
SELECT * FROM staff WHERE StaffID = '12345'
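Multi-Point Query
This is a query that may return several records, because the WHERE
condition is an equality test on a non-unique attribute
Eg:
SELECT * FROM EMPLOYEES
WHERE DEPARTMENT = 'Human Resources'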
A Range Query
This type of query will return a set of values within an interval or half-interval
Eg:
SELECT * FROM EMPLOYEE
WHERE AGE >= 50 AND AGE < 70
A Prefix Match Query
This is a query where only the first part of the attribute or sequence of attributes is
specified.
Eg:
SELECT FIRSTNAME, SURNAME FROM EMPLOYEES
WHERE SURNAME LIKE 'ST%'
This query will return all records with surnames starting with the letters 'ST'.
In this example it would be sensible to index the Surname field if this type of
query is to be run repeatedly.
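To see why the query type dictates the best index, the sketch below models two kinds of index in plain Python: a dictionary as a stand-in for a hash index and sorted lists as a stand-in for a B-tree. The rows, names and functions are hypothetical, and the prefix search assumes surnames contain no characters above U+FFFF.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical EMPLOYEES rows: (surname, department, age)
employees = [
    ("Stone", "Human Resources", 52),
    ("Adams", "Finance", 34),
    ("Stewart", "Human Resources", 69),
    ("Smith", "Finance", 70),
    ("Sturgeon", "Sales", 50),
]

# Hash-style index on department: suits equality (point and multi-point) queries.
by_department = defaultdict(list)
for row in employees:
    by_department[row[1]].append(row)

# Ordered indexes (stand-ins for a B-tree) on age and surname.
by_age = sorted(employees, key=lambda r: r[2])
ages = [r[2] for r in by_age]
by_surname = sorted(employees, key=lambda r: r[0])
surnames = [r[0] for r in by_surname]


def multipoint(department):
    """Equality lookup: one hash probe returns every matching row."""
    return by_department.get(department, [])


def age_range(low, high):
    """Rows with low <= age < high, found by two binary searches."""
    return by_age[bisect_left(ages, low):bisect_left(ages, high)]


def surname_prefix(prefix):
    """Rows whose surname starts with prefix (case-sensitive)."""
    start = bisect_left(surnames, prefix)
    end = bisect_left(surnames, prefix + "\uffff")  # just past the last match
    return by_surname[start:end]


print(multipoint("Human Resources"))
print(age_range(50, 70))
print(surname_prefix("St"))
```

The dictionary answers equality lookups in one probe but holds its keys in no useful order, so the range and prefix queries rely on the sorted structure. That is the usual reason an ordered index such as a B-tree is chosen for those query types.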
The End