SQL Server - Pages & Extents - Architecture
Pages
The fundamental unit of data storage in SQL Server is the page. The disk space allocated to a data file (.mdf or .ndf) in a database is logically divided into pages numbered contiguously from 0 to n. Extents are a collection of eight physically contiguous pages and are used to efficiently manage the pages. All pages are stored in extents. In SQL Server, the page size is 8 KB. Each page begins with a 96-byte header that is used to store system information about the page. This information includes the page number, page type, the amount of free space on the page, and the allocation unit ID of the object that owns the page. Data rows are put on the page serially, starting immediately after the header. A row offset table starts at the end of the page and contains one entry for each row on the page. Each entry records how far the first byte of the row is from the start of the page. The entries in the row offset table are in reverse sequence from the sequence of the rows on the page.
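The page layout described above can be illustrated with a minimal Python sketch. It is not how SQL Server is implemented; the function name layout_page is hypothetical, and the 2-byte size of each row offset entry is an assumption that the text does not state. The sketch places rows serially after the 96-byte header and shows that the offset table, read in ascending address order, lists the rows in reverse.

```python
PAGE_SIZE = 8192    # 8-KB page, per the text
HEADER_SIZE = 96    # page header, per the text
SLOT_SIZE = 2       # assumption: each row offset entry occupies 2 bytes


def layout_page(row_lengths):
    """Place rows serially after the header.

    Returns (row_offsets, offset_table), where offset_table holds the same
    offsets in the order they would appear when reading the table from its
    lowest address to the end of the page.
    """
    offsets = []
    next_free = HEADER_SIZE
    for length in row_lengths:
        # The offset table grows backward from the end of the page,
        # so each new row also needs room for one more table entry.
        table_start = PAGE_SIZE - SLOT_SIZE * (len(offsets) + 1)
        if next_free + length > table_start:
            raise ValueError("row does not fit on the page")
        offsets.append(next_free)
        next_free += length
    # The entry for the first row sits at the very end of the page,
    # so the table is in reverse sequence relative to the rows.
    return offsets, list(reversed(offsets))
```

For example, three rows of 100, 200, and 50 bytes start at offsets 96, 196, and 396, and the offset table reads 396, 196, 96. The sketch models only the physical layout; it does not enforce the separate per-row size limit.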
The maximum amount of data and overhead that is contained in a single row on a page is 8,060 bytes, slightly less than the 8-KB (8,192-byte) page size.
Extents
Extents are the basic unit in which space is managed. An extent is eight physically contiguous pages, or 64 KB. This means SQL Server databases have 16 extents per megabyte. SQL Server has two types of extents:
Uniform extents are owned by a single object; all eight pages in the extent can only be used by the owning object.
Mixed extents are shared by up to eight objects. Each of the eight pages in the extent can be owned by a different object.
A new table or index is generally allocated pages from mixed extents. When the table or index grows to the point that it has eight pages, it then switches to use uniform extents for subsequent allocations. If you create an index on an existing table that has enough rows to generate eight pages in the index, all allocations to the index are in uniform extents.
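The extent arithmetic and the mixed-to-uniform switch can be checked with a short Python sketch. The function names extent_of and allocation_source are hypothetical. Mapping a page number to an extent by integer division is an assumption that follows from pages being numbered contiguously from 0 and grouped eight to an extent.

```python
PAGE_SIZE_KB = 8        # page size, per the text
PAGES_PER_EXTENT = 8    # pages in one extent, per the text

EXTENT_SIZE_KB = PAGE_SIZE_KB * PAGES_PER_EXTENT   # 64 KB
EXTENTS_PER_MB = 1024 // EXTENT_SIZE_KB            # 16 extents per megabyte


def extent_of(page_number):
    """Index of the extent that contains the given page (assumed mapping)."""
    return page_number // PAGES_PER_EXTENT


def allocation_source(pages_already_owned):
    """Kind of extent the next page comes from, following the rule in the text.

    An object with fewer than eight pages draws from mixed extents;
    once it has eight pages it switches to uniform extents.
    """
    return "mixed" if pages_already_owned < PAGES_PER_EXTENT else "uniform"
```

Under these assumptions, pages 8 through 15 all fall in extent 1, and the ninth page of a growing table is the first one taken from a uniform extent.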
Managing Extent Allocations
SQL Server uses two types of allocation maps to record the allocation of extents:
Global Allocation Map (GAM)
GAM pages record what extents have been allocated. Each GAM covers 64,000 extents, or almost 4 GB of data. The GAM has one bit for each extent in the interval it covers. If the bit is 1, the extent is free; if the bit is 0, the extent is allocated.
Shared Global Allocation Map (SGAM)
SGAM pages record which extents are currently being used as mixed extents and also have at least one unused page. Each SGAM covers 64,000 extents, or almost 4 GB of data. The SGAM has one bit for each extent in the interval it covers. If the bit is 1, the extent is being used as a mixed extent and has a free page. If the bit is 0, the extent is not used as a mixed extent, or it is a mixed extent and all its pages are being used.
Each extent has the following bit patterns set in the GAM and SGAM, based on its current use.
Current use of extent                   GAM bit setting   SGAM bit setting
Free, not being used                    1                 0
Uniform extent, or full mixed extent    0                 0
Mixed extent with free pages            0                 1
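The GAM and SGAM bit patterns described above amount to a small lookup, sketched here in Python. The function name extent_state is hypothetical. The coverage figure is computed from the numbers in the text (64,000 extents of 64 KB each) to show where "almost 4 GB" comes from.

```python
EXTENT_SIZE_KB = 64        # eight 8-KB pages, per the text
EXTENTS_PER_MAP = 64_000   # extents covered by one GAM or SGAM, per the text

# (GAM bit, SGAM bit) -> current use of the extent
_STATES = {
    (1, 0): "free, not being used",
    (0, 0): "uniform extent, or full mixed extent",
    (0, 1): "mixed extent with free pages",
}


def extent_state(gam_bit, sgam_bit):
    """Decode the current use of an extent from its GAM and SGAM bits."""
    try:
        return _STATES[(gam_bit, sgam_bit)]
    except KeyError:
        # The text lists no meaning for GAM = 1 together with SGAM = 1.
        raise ValueError("bit pattern not listed in the text") from None


# Data covered by one allocation map, in GB: 64,000 * 64 KB = 3.90625 GB.
COVERAGE_GB = EXTENTS_PER_MAP * EXTENT_SIZE_KB / (1024 * 1024)
```

The combination GAM = 1, SGAM = 1 is rejected because the text defines only three patterns.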
Tracking Free Space
Page Free Space (PFS) pages record the allocation status of each page and the amount of free space on it. The PFS has one byte for each page, recording whether the page is allocated, and if so, whether it is empty, 1 to 50 percent full, 51 to 80 percent full, 81 to 95 percent full, or 96 to 100 percent full.
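The fullness ranges that a PFS byte distinguishes can be expressed as a simple classifier. This Python sketch uses a hypothetical function name, pfs_fullness, and assumes whole-number percentages; it does not model how the ranges are actually encoded in the PFS byte, which the text does not describe.

```python
def pfs_fullness(percent_full):
    """Map a page's fullness (whole-number percent) to the PFS range in the text."""
    if not 0 <= percent_full <= 100:
        raise ValueError("percent_full must be between 0 and 100")
    if percent_full == 0:
        return "empty"
    if percent_full <= 50:
        return "1 to 50 percent full"
    if percent_full <= 80:
        return "51 to 80 percent full"
    if percent_full <= 95:
        return "81 to 95 percent full"
    return "96 to 100 percent full"
```

Because only the range is kept, a page that is 60 percent full and one that is 75 percent full are indistinguishable in this model.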
Primary data files
The primary data file is the starting point of the database and points to the other files in the database. Every database has one primary data file. The recommended file name extension for primary data files is .mdf.
Secondary data files
Secondary data files make up all the data files, other than the primary data file. Some databases may not have any secondary data files, while others have several secondary data files. The recommended file name extension for secondary data files is .ndf.
Log files
Log files hold all the log information that is used to recover the database. There must be at least one log file for each database, although there can be more than one. The recommended file name extension for log files is .ldf.
SQL Server does not enforce the .mdf, .ndf, and .ldf file name extensions, but these extensions help you identify the different kinds of files and their use.
Database Filegroups
Database objects and files can be grouped together in filegroups for allocation and administration purposes. There are two types of filegroups:
Primary
The primary filegroup contains the primary data file and any other files not specifically assigned to another filegroup. All pages for the system tables are allocated in the primary filegroup.
User-defined
User-defined filegroups are any filegroups that are specified by using the FILEGROUP keyword in a CREATE DATABASE or ALTER DATABASE statement.
Log files are never part of a filegroup. Log space is managed separately from data space.
Memory Architecture
All 32-bit applications have a 4-gigabyte (GB) process address space (32-bit addresses can map a maximum of 4 GB of memory). Microsoft Windows operating systems provide applications with access to 2 GB of process address space, specifically known as user mode virtual address space. All threads owned by an application share the same user mode virtual address space. The remaining 2 GB are reserved for the operating system (also known as kernel mode address space). All operating system editions starting with Windows 2000 Server, including Windows Server 2003, have a boot.ini switch that can provide applications with access to 3 GB of process address space, limiting the kernel mode address space to 1 GB.
Address Windowing Extensions (AWE) extend the capabilities of 32-bit applications by allowing access to as much physical memory as the operating system supports. AWE accomplishes this by mapping a subset of up to 64 GB into the user address space. Mapping between the application buffer pool and AWE-mapped memory is handled through manipulation of the Windows virtual memory tables. To enable support for 3 GB of user mode process space, you must add the /3gb parameter to the boot.ini file and reboot the computer, allowing the /3gb parameter to take effect. Setting this parameter allows user application threads to address 3 GB of process address space, and reserves 1 GB of process address space for the operating system. If there is more than 16 GB of physical memory available on a computer, the operating system needs 2 GB of process address space for system purposes and therefore can support only a 2 GB user mode address space. In order for AWE to use the memory range above 16 GB, be sure that the /3gb parameter is not in the boot.ini file. If it is, the operating system cannot address any memory above 16 GB.
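The interaction between the /3gb switch and large physical memory can be summarized in a small Python sketch. The function name address_space_plan is hypothetical, and the model covers only the rules stated above: a 4 GB process address space, a 2 GB or 3 GB user mode share, and the 16 GB ceiling on addressable physical memory when /3gb is set. It ignores every other setting that affects memory on a real system.

```python
PROCESS_ADDRESS_SPACE_GB = 4   # 32-bit process address space, per the text
LIMIT_WITH_3GB_SWITCH_GB = 16  # physical memory ceiling when /3gb is set


def address_space_plan(physical_memory_gb, use_3gb_switch):
    """Return (user_gb, kernel_gb, addressable_physical_gb) for a 32-bit system."""
    user_gb = 3 if use_3gb_switch else 2
    kernel_gb = PROCESS_ADDRESS_SPACE_GB - user_gb
    if use_3gb_switch:
        # With only 1 GB of kernel address space, the operating system
        # cannot address physical memory above 16 GB.
        addressable_gb = min(physical_memory_gb, LIMIT_WITH_3GB_SWITCH_GB)
    else:
        addressable_gb = physical_memory_gb
    return user_gb, kernel_gb, addressable_gb
```

On a 32 GB machine, for instance, the sketch returns (3, 1, 16) with /3gb and (2, 2, 32) without it, which is why the switch should be removed when AWE needs to reach memory above 16 GB.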
As other applications are started on a computer running an instance of SQL Server, they consume memory, and the amount of free physical memory can drop below the SQL Server target. When that happens, the instance of SQL Server reduces its memory consumption. If another application is stopped and more memory becomes available, the instance of SQL Server increases the size of its memory allocation. SQL Server can free and acquire several megabytes of memory each second, allowing it to quickly adjust to memory allocation changes.
Buffer Management
A buffer is an 8-KB page in memory, the same size as a data or index page. Thus, the buffer cache is divided into 8-KB pages. The buffer manager manages the functions for reading data or index pages from the database disk files into the buffer cache and writing modified pages back to disk. A page remains in the buffer cache until the buffer manager needs the buffer area to read in more data. Data is written back to disk only if it is modified. Data in the buffer cache can be modified multiple times before being written back to disk.
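The write-back behavior described above can be sketched as a toy cache in Python. This is an illustration only, not SQL Server's buffer manager: the class name BufferCache is hypothetical, a dictionary stands in for the data files, and least-recently-used eviction is an assumption chosen for simplicity, since the text does not say how a buffer is selected for reuse.

```python
from collections import OrderedDict


class BufferCache:
    """Toy buffer cache: pages are written to disk only if they were modified."""

    def __init__(self, capacity, disk):
        self.capacity = capacity       # number of buffers available
        self.disk = disk               # dict page_id -> data, stands in for data files
        self.buffers = OrderedDict()   # page_id -> [data, dirty_flag]
        self.writes = 0                # count of physical writes back to disk

    def _evict(self):
        # Assumption: reuse the least recently used buffer.
        victim_id, (data, dirty) = self.buffers.popitem(last=False)
        if dirty:
            # Only modified pages are written back.
            self.disk[victim_id] = data
            self.writes += 1

    def _load(self, page_id):
        if page_id in self.buffers:
            self.buffers.move_to_end(page_id)
            return self.buffers[page_id]
        if len(self.buffers) >= self.capacity:
            self._evict()
        entry = [self.disk[page_id], False]
        self.buffers[page_id] = entry
        return entry

    def read(self, page_id):
        return self._load(page_id)[0]

    def modify(self, page_id, data):
        entry = self._load(page_id)
        entry[0] = data
        entry[1] = True   # mark the buffer as modified
```

Modifying the same page twice and then forcing it out of the cache produces a single disk write, while evicting an unmodified page produces none.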
Using AWE
Microsoft SQL Server uses the Microsoft Windows Address Windowing Extensions (AWE) API to support very large amounts of physical memory. SQL Server can access up to 64 gigabytes (GB) of memory on Microsoft Windows 2000 Server and Microsoft Windows Server 2003.
AWE is a set of extensions to the memory management functions of Windows that allow applications to address more memory than the 2-3 GB that is available through standard 32-bit addressing. AWE lets applications acquire physical memory, and then dynamically map views of the nonpaged memory to the 32-bit address space. Although the 32-bit address space is limited to 4 GB, the nonpaged memory can be much larger. This enables memory-intensive applications, such as large database systems, to address more memory than can be supported in a 32-bit address space.
Before you configure the operating system for AWE, consider the following:
AWE allows allocating physical memory over 4 GB on 32-bit architecture. AWE should be used only when available physical memory is greater than user-mode virtual address space.
To support more than 4 GB of physical memory on 32-bit operating systems, you must add the /pae parameter to the Boot.ini file and reboot the computer. For more information, see your Windows documentation.
If there is more than 16 GB of physical memory available on a computer, the operating system requires 2 GB of virtual address space for system purposes and therefore can support only a 2 GB user mode virtual address space. For the operating system to use the memory range above 16 GB, be sure that the /3gb parameter is not in the Boot.ini file. If it is, the operating system cannot use any physical memory above 16 GB.