093.indexes Part3

The document provides an overview of SQL Server's data storage architecture, detailing the types of indexes (clustered and non-clustered), the structure of pages and extents, and the organization of tables and partitions. It explains the role of Index Allocation Mapping (IAM) in managing storage and the differences between heap and clustered tables. Additionally, it describes the structure and function of clustered and non-clustered indexes, including how data retrieval works in each case.

Uploaded by

prapulkumard
Copyright
© All Rights Reserved
Follow us on :

Indexes

• Indexes are broadly classified into two types: clustered and non-clustered.
• A table with a clustered index is called a clustered table.
• A table that does not have a clustered index is called a heap.

Page
• In SQL Server, the fundamental unit of data storage is the page.
• The disk space allocated to a data file (.mdf or .ndf) is logically divided into consecutively numbered pages, numbered from 0 to n.
• All disk I/O operations occur at the page level, meaning SQL Server reads and writes entire pages at a time.
• To manage storage efficiently, pages are grouped into extents, which consist of eight physically contiguous pages. Every page in the database is stored within an extent.
• Each page has a fixed size of 8 KB, which gives 128 pages per megabyte of database storage.
• Every page begins with a 96-byte header that holds system information such as the page number, page type, amount of free space, and the allocation unit ID of the object that owns the page.
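The page arithmetic above can be checked with a short Python sketch. The constant and function names are illustrative, not SQL Server identifiers, and the byte-offset calculation simply assumes that consecutively numbered 8 KB pages are laid out back to back in the data file.

```python
PAGE_SIZE = 8 * 1024        # each page is 8 KB
PAGE_HEADER_SIZE = 96       # bytes taken by the page header
MEGABYTE = 1024 * 1024


def pages_per_megabyte() -> int:
    """Number of 8 KB pages that fit in one megabyte."""
    return MEGABYTE // PAGE_SIZE


def page_byte_offset(page_number: int) -> int:
    """Byte offset of a page in its file, assuming pages are numbered from 0."""
    return page_number * PAGE_SIZE


def bytes_after_header() -> int:
    """Bytes left in a page once the 96-byte header is subtracted."""
    return PAGE_SIZE - PAGE_HEADER_SIZE


print(pages_per_megabyte())   # 128
print(page_byte_offset(3))    # 24576
print(bytes_after_header())   # 8096
```

Not all of the 8,096 bytes after the header are available for row data, since a page also stores bookkeeping for its rows; the sketch only shows the subtraction.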

Extent

• Extents are the basic unit in which space is managed. Each extent consists of eight physically contiguous pages, totaling 64 KB, so there are 16 extents per megabyte in a database.
• To use space efficiently, SQL Server does not assign entire extents to tables with small amounts of data.

• Instead, it uses two types of extents:
1. Uniform Extents – fully owned by a single object; all eight pages in the extent are dedicated to that object.
2. Mixed Extents – shared among up to eight objects; each page in the extent can be assigned to a different object.
• A new table or index is generally allocated pages from mixed extents. This is the default up to SQL Server 2014; from SQL Server 2016 onward, most allocations in user databases and tempdb use uniform extents by default.
• When the table or index grows to eight pages, it switches to uniform extents for subsequent allocations.
• If you create an index on an existing table that has enough rows to generate eight pages in the index, all allocations to the index are in uniform extents.
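The extent rules above can be sketched in Python. This is a toy model of the behavior as described on these slides (first eight pages from mixed extents, later pages from uniform extents), not SQL Server's actual allocator; all names are illustrative.

```python
PAGES_PER_EXTENT = 8
EXTENT_SIZE_KB = 64


def extents_per_megabyte() -> int:
    """Number of 64 KB extents in one megabyte."""
    return 1024 // EXTENT_SIZE_KB


def extent_of_page(page_number: int) -> int:
    """Extent number that contains a given page (both numbered from 0)."""
    return page_number // PAGES_PER_EXTENT


def allocation_kinds(total_pages: int) -> list:
    """Label each page of a growing object as 'mixed' or 'uniform'.

    Assumes the rule from the text: the first eight pages come from
    mixed extents, every later page from uniform extents.
    """
    return [
        "mixed" if page_index < PAGES_PER_EXTENT else "uniform"
        for page_index in range(total_pages)
    ]


print(extents_per_megabyte())          # 16
print(extent_of_page(8))               # 1
print(allocation_kinds(10).count("mixed"))    # 8
print(allocation_kinds(10).count("uniform"))  # 2
```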
Table Organization
• A table consists of one or more partitions, with each partition storing data rows in either a heap or a clustered index structure.
• The pages within a heap or clustered index are organized into one or more allocation units, based on the data types of the columns in the table.
Partitions

• Table and index pages are contained in one or more partitions.
• By default, a table or index has only one partition that contains all the table or index pages. The partition resides in a single filegroup.

Database Files
SQL Server databases have three types of files:

1. Primary data files
The primary data file is the starting point of the database and points to the other files in the database. Every database has one primary data file. The recommended file name extension for primary data files is .mdf.

2. Secondary data files
Secondary data files make up all the data files, other than the
primary data file. Some databases may not have any secondary data
files, while others have several secondary data files. The
recommended file name extension for secondary data files is .ndf.
3. Log files
Log files hold all the log information that is used to recover the
database. There must be at least one log file for each database,
although there can be more than one. The recommended file name
extension for log files is .ldf.
Index Allocation Map (IAM)
• An Index Allocation Map (IAM) page is an allocation page in a SQL Server data file that tracks which extents belong to a specific table or index.
• IAM pages help SQL Server manage and allocate storage for database objects.
• SQL Server stores data in 8 KB pages, grouped into extents (64 KB, 8 pages each). To track which extents belong to a table or index, it uses IAM pages, which act as a map of allocated extents.

• Each IAM page covers a 4 GB range of a database file.
• It records which extents in that range are allocated to a table or index.
• It speeds up data retrieval and allocation decisions.
• IAM pages are stored in the data file itself. The first IAM page of each allocation unit is recorded in the first_iam_page column of the sys.system_internals_allocation_units system view.
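To make the "map of allocated extents" idea concrete, here is a minimal Python sketch that models an IAM page as a bitmap with one bit per extent. The class and method names are hypothetical, the extent count uses the round 4 GB figure from the text, and the real on-disk layout of an IAM page is not modelled.

```python
EXTENT_SIZE = 64 * 1024                 # 64 KB per extent
IAM_RANGE_BYTES = 4 * 1024 ** 3         # 4 GB covered by one IAM page (per the text)
EXTENTS_PER_IAM_PAGE = IAM_RANGE_BYTES // EXTENT_SIZE


class IamPage:
    """Toy IAM page: one bit per extent in the range it covers."""

    def __init__(self, first_extent: int):
        self.first_extent = first_extent
        self.bits = 0                   # bit i set => extent first_extent + i is allocated

    def _index(self, extent: int) -> int:
        index = extent - self.first_extent
        if not 0 <= index < EXTENTS_PER_IAM_PAGE:
            raise ValueError("extent is outside the range covered by this IAM page")
        return index

    def allocate(self, extent: int) -> None:
        """Mark an extent as belonging to the owning table or index."""
        self.bits |= 1 << self._index(extent)

    def is_allocated(self, extent: int) -> bool:
        return bool((self.bits >> self._index(extent)) & 1)

    def allocated_extents(self) -> list:
        """Extent numbers owned by the object, in ascending order."""
        return [
            self.first_extent + i
            for i in range(EXTENTS_PER_IAM_PAGE)
            if (self.bits >> i) & 1
        ]


iam = IamPage(first_extent=0)
for extent in (5, 2, 40):
    iam.allocate(extent)

print(EXTENTS_PER_IAM_PAGE)        # 65536
print(iam.allocated_extents())     # [2, 5, 40]
```

Scanning the bitmap in order is what lets a reader of the map visit an object's extents without touching extents owned by other objects.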

Heap Structures
• A heap is a table without a clustered index.
• Heaps have one row in sys.partitions, with index_id = 0, for each partition used by the heap.
• By default, a heap has a single partition.
• When a heap has multiple partitions, each partition has a heap structure that contains the data for that specific partition. For example, if a heap has four partitions, there are four heap structures, one in each partition.
• A heap stores data rows in no particular order across its pages.
• The data pages are not linked to one another; SQL Server locates them through the IAM (Index Allocation Map) pages, which track allocated space.
• Without a suitable index, SQL Server scans all the pages of a heap to find matching rows. This table scan degrades query performance on large tables.
• A table scan reads every row, even if only a few are needed.
• Heaps work best for small tables with few rows.
• They suit temporary or staging tables during ETL processing.
• They also suit bulk inserts where row order does not matter.

• A RID (Row Identifier) lookup occurs when SQL Server retrieves data from a heap (a table without a clustered index) using a non-clustered index but needs additional columns that are not included in the index.
• The non-clustered index stores only the indexed columns plus the RID.
• If the query requests columns not present in the index, SQL Server must go back to the heap to fetch the missing data.
• This extra lookup slows performance because it requires additional reads.
• RID in action:
1. SQL Server first searches the non-clustered index (Index Seek).
2. It retrieves the RID (Row Identifier: FileID:PageID:SlotNumber).
3. A RID Lookup fetches the missing columns from the heap.
4. The result is returned.
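The four steps above can be sketched in Python. This is an in-memory analogy rather than SQL Server code: the heap is a dictionary of pages, the non-clustered index is a sorted list of (key, RID) pairs, and the sample rows, page numbers, and function names are all hypothetical.

```python
from bisect import bisect_left
from collections import namedtuple

# RID = FileID:PageID:SlotNumber, as described in the text.
Rid = namedtuple("Rid", "file_id page_id slot")

# Hypothetical heap: (file_id, page_id) -> rows in slot order, not sorted by any key.
heap = {
    (1, 100): [
        {"EmployeeID": 30, "Name": "Charlie", "City": "Pune"},
        {"EmployeeID": 10, "Name": "Alice", "City": "Delhi"},
    ],
    (1, 101): [
        {"EmployeeID": 20, "Name": "Bob", "City": "Chennai"},
    ],
}

# Non-clustered index on Name: indexed column plus RID, kept in key order.
name_index = sorted(
    (row["Name"], Rid(file_id, page_id, slot))
    for (file_id, page_id), rows in heap.items()
    for slot, row in enumerate(rows)
)


def index_seek(name):
    """Steps 1 and 2: search the index and return the matching RIDs."""
    keys = [key for key, _ in name_index]
    i = bisect_left(keys, name)
    rids = []
    while i < len(keys) and keys[i] == name:
        rids.append(name_index[i][1])
        i += 1
    return rids


def rid_lookup(rid):
    """Step 3: go back to the heap page and slot to fetch the full row."""
    return heap[(rid.file_id, rid.page_id)][rid.slot]


def find_by_name(name):
    """Step 4: return the full rows, including columns not in the index."""
    return [rid_lookup(rid) for rid in index_seek(name)]


print(index_seek("Charlie"))    # [Rid(file_id=1, page_id=100, slot=0)]
print(find_by_name("Charlie"))  # [{'EmployeeID': 30, 'Name': 'Charlie', 'City': 'Pune'}]
```

Each call to `rid_lookup` stands for an extra page read, which is why a query that needs only indexed columns avoids the cost entirely.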

Clustered Table Organization
• Clustered tables are tables that have a clustered index.
• The data rows are stored in order based on the clustered index key.
• The clustered index is implemented as a B-tree index structure that supports fast retrieval of the rows, based on their clustered index key values.

Clustered Index structures
• In SQL Server, indexes are organized as B-trees.
• Each page in an index B-tree is called an index node.
• The top node of the B-tree is called the root node.
• The bottom level of nodes in the index is called the leaf nodes. Any index levels between the root and the leaf nodes are collectively known as intermediate levels.
• In a clustered index, the leaf nodes contain the data pages of the underlying table.
• The root and intermediate level nodes contain index pages holding index rows.
• Each index row contains a key value and a pointer to either an intermediate level page in the B-tree, or a data row in the leaf level of the index. The pages in each level of the index are linked in a doubly-linked list.
• Clustered indexes have one row in sys.partitions, with index_id = 1, for each partition used by the index.
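The structure described above can be illustrated with a small Python sketch: index pages at the root and intermediate levels, data pages at the leaf level, and leaf pages linked to their neighbors. The class names, the two-level tree, and the sample rows are assumptions made for illustration; a real index has a different page format and usually more levels.

```python
from bisect import bisect_right


class IndexPage:
    """Root or intermediate node: keys[i] is the lowest key under children[i]."""

    def __init__(self, keys, children):
        self.keys = keys
        self.children = children


class DataPage:
    """Leaf node of a clustered index: holds the data rows, sorted by key."""

    def __init__(self, rows):
        self.rows = rows      # list of (key, row) pairs in key order
        self.prev = None      # pages on one level form a doubly-linked list
        self.next = None


def descend(node, key):
    """Walk from the root to the leaf page that could hold key."""
    pages_read = 1
    while isinstance(node, IndexPage):
        i = max(bisect_right(node.keys, key) - 1, 0)
        node = node.children[i]
        pages_read += 1
    return node, pages_read


def seek(root, key):
    """Return (row, pages_read); row is None when the key is absent."""
    leaf, pages_read = descend(root, key)
    for k, row in leaf.rows:
        if k == key:
            return row, pages_read
    return None, pages_read


def range_scan(root, low, high):
    """Find the first leaf, then follow the next-page links in key order."""
    leaf, _ = descend(root, low)
    found = []
    while leaf is not None:
        for k, row in leaf.rows:
            if k > high:
                return found
            if k >= low:
                found.append(row)
        leaf = leaf.next
    return found


# Hypothetical table clustered on EmployeeID.
leaf_a = DataPage([(10, {"EmployeeID": 10, "Name": "Alice"}),
                   (20, {"EmployeeID": 20, "Name": "Bob"})])
leaf_b = DataPage([(30, {"EmployeeID": 30, "Name": "Charlie"}),
                   (40, {"EmployeeID": 40, "Name": "Dave"})])
leaf_a.next, leaf_b.prev = leaf_b, leaf_a
root = IndexPage([10, 30], [leaf_a, leaf_b])

row, pages_read = seek(root, 30)
print(row["Name"], pages_read)                           # Charlie 2
print([r["Name"] for r in range_scan(root, 20, 30)])     # ['Bob', 'Charlie']
```

Because the leaf level is the data itself, `seek` returns the full row as soon as it reaches the leaf, with no further lookup.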
Non-Clustered Index structures
• Nonclustered indexes have the same B-tree structure as clustered indexes, except for the following significant differences:
1. The data rows of the underlying table are not sorted and stored in order based on their nonclustered keys.
2. The leaf level of a nonclustered index is made up of index pages instead of data pages.
• Nonclustered indexes can be defined on a table or view with a clustered index or a heap.
• Each index row in the nonclustered index contains the nonclustered key value and a row locator.
• This locator points to the data row in the clustered index or heap having the key value.


• Each entry in the nonclustered index holds the key value (Name in this example) and a row locator: the RID for heap tables, or the clustered index key (here EmployeeID) for clustered tables.
• Searching for Name = 'Charlie' works like this:
1. Start at the root node [Charlie].
2. Navigate to the correct leaf node [Charlie].
3. Find the corresponding EmployeeID = 30.
4. Use the EmployeeID to fetch the full row from the clustered index.
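The Charlie search can be sketched in Python as follows. The clustered index is stood in for by a dictionary keyed on EmployeeID and the nonclustered index by a sorted list of (Name, EmployeeID) pairs; only Charlie with EmployeeID 30 comes from the example above, and the other rows, the Department column, and the function name are hypothetical.

```python
from bisect import bisect_left

# Stand-in for the clustered index: full rows keyed by EmployeeID.
clustered = {
    10: {"EmployeeID": 10, "Name": "Alice", "Department": "HR"},
    20: {"EmployeeID": 20, "Name": "Bob", "Department": "IT"},
    30: {"EmployeeID": 30, "Name": "Charlie", "Department": "Sales"},
}

# Nonclustered index on Name: key value plus row locator (the clustered key).
name_index = sorted((row["Name"], emp_id) for emp_id, row in clustered.items())


def key_lookup_by_name(name):
    """Seek the nonclustered index, then fetch each full row by clustered key."""
    keys = [key for key, _ in name_index]
    i = bisect_left(keys, name)          # locate the first matching index entry
    rows = []
    while i < len(keys) and keys[i] == name:
        employee_id = name_index[i][1]   # row locator stored in the index entry
        rows.append(clustered[employee_id])  # lookup in the clustered index
        i += 1
    return rows


print(key_lookup_by_name("Charlie"))
# [{'EmployeeID': 30, 'Name': 'Charlie', 'Department': 'Sales'}]
```

The second access, `clustered[employee_id]`, is the step that a heap would instead perform with a RID.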