Basic Concepts On Database Objects, File Systems, Storage Structure and Query Processing
Topics discussed
1.
2.
3. Data organization
4. Query processing
System Databases
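tempdb : Holds the temporary objects and intermediate results used by
the other databases for as long as the SQL Server instance is running.
It is re-created from the model database each time the instance starts,
so it inherits the model database's settings and nothing in it survives a restart.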
Database Files
Extents
Pages
Database Files
A database is composed of three
kinds of files
.mdf file : Primary data file.
Contains data.
.ndf file : Secondary (additional)
data files. Optional. Contain
data.
.ldf file : Transaction log file.
Contains the log records, not data.
File Groups
Pages
At the most basic level, all data is stored in
8 KB pages. Each page has a header portion,
and the data is stored in the body portion.
Extents
Space is allocated to database objects at the extent level, not at the page level.
An extent is eight physically contiguous pages (64 KB).
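The page-to-extent arithmetic can be sketched as follows. This is a minimal illustration, assuming 8 KB pages grouped into extents of eight contiguous pages; the function names are hypothetical and do not correspond to any SQL Server API.

```python
PAGE_SIZE_BYTES = 8 * 1024   # 8 KB page
PAGES_PER_EXTENT = 8         # assumption: eight contiguous pages per extent

EXTENT_SIZE_BYTES = PAGE_SIZE_BYTES * PAGES_PER_EXTENT  # 64 KB


def extent_of(page_number: int) -> int:
    """Return the number of the extent that contains the given page."""
    return page_number // PAGES_PER_EXTENT


def pages_in_extent(extent_number: int) -> range:
    """Return the page numbers that make up the given extent."""
    first = extent_number * PAGES_PER_EXTENT
    return range(first, first + PAGES_PER_EXTENT)


def extents_needed(page_count: int) -> int:
    """Return how many whole extents are needed to hold page_count pages."""
    return -(-page_count // PAGES_PER_EXTENT)  # ceiling division
```

For example, under these assumptions an object that needs 9 pages occupies 2 extents, and page 8 is the first page of extent 1.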
Extents : Continued
Allocating extents rather than individual pages to objects
serves some important purposes
Types of Pages
There are 8 types of pages. The most important type is the data page.
Data pages hold the actual data records. A page may contain one or more
data rows.
Data rows are put on the page serially, starting immediately after the
header. A row offset table starts at the end of the page and grows backward;
it contains one entry for each row on the page, and each entry records how
far the first byte of that row is from the start of the page.
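The layout described above (rows growing forward from the header, offset entries growing backward from the end of the page) can be sketched as a small in-memory model. This is an illustrative sketch, not the on-disk format: the `Page` class and its methods are hypothetical, the 96-byte header and 2-byte offset entries are assumptions, and row lengths are tracked in a Python list only to keep the example short.

```python
import struct

PAGE_SIZE = 8192     # 8 KB page
HEADER_SIZE = 96     # assumption: 96-byte page header
SLOT_SIZE = 2        # assumption: each row offset entry is 2 bytes


class Page:
    def __init__(self):
        self.buf = bytearray(PAGE_SIZE)
        self.free_start = HEADER_SIZE   # next free byte for row data
        self.slot_count = 0
        self._lengths = []              # simplification: row lengths kept aside

    def free_space(self) -> int:
        """Bytes left between the last row and the row offset table."""
        slot_table_start = PAGE_SIZE - self.slot_count * SLOT_SIZE
        return slot_table_start - self.free_start

    def insert(self, row: bytes) -> int:
        """Append a row after the existing rows and record its offset."""
        if len(row) + SLOT_SIZE > self.free_space():
            raise ValueError("page full")
        offset = self.free_start
        self.buf[offset:offset + len(row)] = row
        self.free_start += len(row)
        # The offset table grows backward from the end of the page.
        slot_pos = PAGE_SIZE - (self.slot_count + 1) * SLOT_SIZE
        struct.pack_into("<H", self.buf, slot_pos, offset)
        self._lengths.append(len(row))
        self.slot_count += 1
        return self.slot_count - 1

    def row_offset(self, slot: int) -> int:
        """Read the offset stored in the given row offset entry."""
        if not 0 <= slot < self.slot_count:
            raise IndexError("no such slot")
        slot_pos = PAGE_SIZE - (slot + 1) * SLOT_SIZE
        return struct.unpack_from("<H", self.buf, slot_pos)[0]

    def read(self, slot: int) -> bytes:
        """Return the row bytes that the given slot points to."""
        offset = self.row_offset(slot)
        return bytes(self.buf[offset:offset + self._lengths[slot]])
```

In this model the first inserted row lands at offset 96, directly after the header, and every insert consumes the row's bytes plus one offset entry.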
Rows are not allowed to span more than one page. When a page is not
sufficient to contain a variable-length column, the column's data is moved to a
page in the ROW_OVERFLOW_DATA allocation unit, and a 24-byte pointer
is left behind in the original row.
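The overflow rule can be sketched as below. This is a simplified model under stated assumptions: an 8,060-byte in-row limit, a 24-byte pointer left in place of each moved column, the widest variable-length column moved first, and per-row overhead ignored. The function `place_row` and its parameters are hypothetical.

```python
from typing import Dict, List, Tuple

MAX_IN_ROW_BYTES = 8060   # in-row size limit
POINTER_BYTES = 24        # pointer left behind for each moved column


def place_row(fixed_bytes: int,
              var_columns: Dict[str, int]) -> Tuple[Dict[str, int], List[str]]:
    """Return (in-row size per variable-length column, columns moved off-row).

    fixed_bytes: total size of the fixed-length part of the row.
    var_columns: size in bytes of each variable-length column value.
    """
    in_row = dict(var_columns)
    moved: List[str] = []

    def row_size() -> int:
        return fixed_bytes + sum(in_row.values())

    while row_size() > MAX_IN_ROW_BYTES:
        # Only columns wider than the pointer are worth moving off-row.
        candidates = [c for c in in_row
                      if c not in moved and in_row[c] > POINTER_BYTES]
        if not candidates:
            raise ValueError("row cannot fit within the in-row limit")
        widest = max(candidates, key=lambda c: in_row[c])
        in_row[widest] = POINTER_BYTES   # only the pointer stays in the row
        moved.append(widest)

    return in_row, moved
```

For instance, a row with 100 fixed bytes and two variable-length values of 5,000 and 4,000 bytes exceeds the limit, so in this sketch the 5,000-byte value is moved and replaced by a 24-byte pointer.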
The page header contains information such as the page number, page type, free space,
and the owning object.
Data organization
Data in SQL Server tables is organized in one of two
ways:
Heap:
Data organization : Continued
Indexed table:
If a table has a clustered index (created by default on the primary key), then the
table is said to be an indexed table.
Types of Indexes
Types of indexes : Continued
Non Clustered :
Allocation Unit
An allocation unit is a collection of pages used to manage data based on
its page type. The types of allocation units are:
IN_ROW_DATA : Used to manage data or index rows that contain all
data, except large object (LOB) data
LOB_DATA : Used to manage large object data stored in one or
more of these data types: text, ntext, image, xml, varchar(max),
nvarchar(max), varbinary(max)
ROW_OVERFLOW_DATA : Used to manage variable-length data
stored in varchar, nvarchar, varbinary, or sql_variant columns that
exceed the 8,060 byte row size limit
Data Insertion
When the SQL Server Database Engine has to
insert a new row:
If space is available in the current page, the row is
inserted there.
If not, the Database Engine uses the IAM (Index Allocation Map)
pages to find the extents allocated to the
appropriate allocation unit.
For each extent, the Database Engine searches the PFS (Page Free Space)
pages to see if there is a page that can be used.
Each IAM and PFS page covers many extents and pages, so
there are few of them in a database. They therefore generally reside in
memory in the SQL Server buffer pool, where they can be
searched quickly.
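The insertion steps above can be sketched as a lookup over two small in-memory structures. This is a simplified model, not SQL Server's implementation: the IAM is represented as a list of extent numbers owned by the allocation unit, the PFS as a dictionary from page number to free bytes (the real PFS records coarse fullness levels rather than exact byte counts), and extents are assumed to hold eight pages. All names are hypothetical.

```python
from typing import Dict, List, Optional

PAGES_PER_EXTENT = 8   # assumption: eight pages per extent


def find_page_for_row(row_size: int,
                      current_page: int,
                      iam_extents: List[int],
                      pfs_free_bytes: Dict[int, int]) -> Optional[int]:
    """Return a page number with room for the row, or None if none exists."""
    # Step 1: use the current page if the row fits there.
    if pfs_free_bytes.get(current_page, 0) >= row_size:
        return current_page

    # Step 2: walk the extents that the IAM lists for this allocation unit.
    for extent in iam_extents:
        first_page = extent * PAGES_PER_EXTENT
        # Step 3: consult the PFS for each page in the extent.
        for page in range(first_page, first_page + PAGES_PER_EXTENT):
            if pfs_free_bytes.get(page, 0) >= row_size:
                return page

    # No allocated page has room; a real engine would allocate a new extent.
    return None
```

Because both structures are small relative to the data they describe, the whole search runs against in-memory maps without touching the data pages themselves.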
Query processing
The SQL Server query
processor consists of two
components:
The query optimizer :
Responsible for generating good
query plans, using the input SQL
and database statistics
The query execution engine :
Takes the query plans generated
by the query optimizer and, as
its name suggests, runs them
Iterators/Operators
SQL Server breaks queries down into a set of
fundamental building blocks that we call operators or
iterators.
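One way to picture operators as composable building blocks is with Python generators, where each operator pulls rows from its child one at a time. This is only an analogy under that assumption; the operator names `table_scan`, `filter_rows`, and `project` are hypothetical and are not SQL Server's internal operators.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def table_scan(rows: Iterable[Row]) -> Iterator[Row]:
    """Leaf operator: produce every row of the table."""
    for row in rows:
        yield row


def filter_rows(child: Iterator[Row],
                predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Pass through only the rows that satisfy the predicate."""
    for row in child:
        if predicate(row):
            yield row


def project(child: Iterator[Row], columns: List[str]) -> Iterator[Row]:
    """Keep only the requested columns of each row."""
    for row in child:
        yield {c: row[c] for c in columns}


# Roughly: SELECT id FROM orders WHERE qty > 3
orders = [
    {"id": 1, "city": "Oslo", "qty": 5},
    {"id": 2, "city": "Lima", "qty": 2},
    {"id": 3, "city": "Pune", "qty": 9},
]
plan = project(filter_rows(table_scan(orders), lambda r: r["qty"] > 3), ["id"])
result = list(plan)
```

Each operator knows nothing about the others beyond "give me the next row", which is what lets a small set of operators be combined into many different plans.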
Execution plan
optimization
Query Plan
The bulk of the execution plan is a read-only data structure
that can be shared by any number of users. This is referred to as the query
plan. No user context is stored in the query plan.
Execution Context
Each user that is currently executing the query has a data
structure that holds the data specific to their execution, such
as parameter values. This data structure is referred to as the
execution context.