Deep Dive Into SQL Server
DATABASE
[ behind the curtain ]
Whenever you create a database, two files get created! FooBar.mdf and FooBar.ldf are two binary files that store EVERYTHING that your DB contains!
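For the curious, here is a minimal sketch of how those two files come into existence; the FooBar name and the C:\Data path are made up for illustration:

-- Create a database and name its data (.mdf) and log (.ldf) files explicitly.
CREATE DATABASE FooBar
ON PRIMARY
    (NAME = FooBar_data, FILENAME = 'C:\Data\FooBar.mdf')
LOG ON
    (NAME = FooBar_log,  FILENAME = 'C:\Data\FooBar.ldf');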
Pages & Extents
Ever wondered what the content inside the .mdf file is? SQL Server manages storage by splitting it into chunks called "Pages" and then grouping them into sequences called "Extents". It's actually quite interesting, so let's dive deeper into it!
A Page is the unit of storage management!
SQL Server stores all of your data (tables, rows, indexes, etc.) using units of storage called "Pages". You can think of them as book pages - a single page contains paragraphs, headings, images, etc.

[ Diagram: an 8 KB Page consists of a Page Header, data rows (Data row 1 ... Data row n), free space, and a slot array at the end. ]
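If you want to peek at a real page yourself, the undocumented (but widely used) DBCC PAGE command dumps a page's header, rows and slot array; the database name and page id below are just placeholders you would look up first (e.g. with DBCC IND):

-- Exploration only: DBCC PAGE is undocumented, don't rely on it in production code.
DBCC TRACEON (3604);             -- route DBCC output to the client instead of the error log
DBCC PAGE ('FooBar', 1, 312, 3); -- database, file id, page id, print option 3 = header + per-row details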
8 consecutive Pages make an "Extent"
[ Diagram: eight 8 KB Pages (each with its own Page Header and data rows) placed side by side form one Extent (64 KB). ]
Data, Index, LOB & GAM are the most interesting page types
Data pages contain the data itself, Index pages contain info about indexes, LOB pages contain Large Objects, and GAM is a Global Allocation Map which tells you which extents are free for further use!
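A quick way to see these page types for a table of your own is the undocumented sys.dm_db_database_page_allocations function (SQL Server 2012 and later); dbo.Orders is a hypothetical table name:

-- List every page allocated to dbo.Orders, its type, and the extent it belongs to.
SELECT allocated_page_file_id,
       allocated_page_page_id,
       extent_page_id,        -- first page of the owning extent
       page_type_desc         -- DATA_PAGE, INDEX_PAGE, IAM_PAGE, ...
FROM sys.dm_db_database_page_allocations(DB_ID(), OBJECT_ID('dbo.Orders'), NULL, NULL, 'DETAILED')
WHERE is_allocated = 1;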
You Should Know!
The concept of Pages is not unique to databases. It has its origins in Virtual Memory, which is how the OS represents the memory you have available!

You Should Know!
Pages are not unique to SQL Server - they are used in other relational DB solutions as well.
BUFFER POOL
[ explained ]
The Buffer Pool is an in-memory cache that stores all of the recently used Pages! As such, it is one of the most important parts of SQL Server's DB Engine. Let's see how it works.
"Gimme data row with ID 5"

Data is always fetched from the Buffer Pool!
Whenever you want to read a row, SQL Server first checks if the Page with the given row exists in the Buffer Pool. If yes - it's returned immediately. If not - it is read from disk and stored in the Buffer Pool first!

"Update rows with IDs 1, 2, 3"

Same for WRITING!
It's the same! Every time you update a row, the change is immediately stored in the Buffer Pool. In parallel, an async process runs that once in a while flushes the Buffer Pool content to disk.

[ Diagram: clients send reads and writes to SQL Server, which serves both through the Buffer Pool. ]
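You can actually see what is sitting in the Buffer Pool right now via the sys.dm_os_buffer_descriptors DMV; this sketch just counts cached pages per database:

-- How many 8 KB pages (and roughly how many MB) each database holds in the Buffer Pool.
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*)             AS cached_pages,
       COUNT(*) * 8 / 1024  AS cached_mb
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY cached_pages DESC;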
But, why?
Because SQL Server is I/O heavy and writing to RAM is an order of magnitude faster than writing to disk! By using an in-memory cache (i.e. the Buffer Pool), SQL Server makes both reads and writes super fast!

Ideal size?
The bigger the better! For the Buffer Pool Extension (an optional spill of the cache onto fast storage), the Enterprise edition allows up to 32 times max_server_memory, whereas the Standard edition allows up to 4 times max_server_memory.
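The memory ceiling itself is controlled by the max server memory setting; a minimal sketch (the 8192 MB value is just an example):

-- View and, if needed, change the cap on SQL Server's memory (in MB).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)';           -- show the current setting
-- EXEC sp_configure 'max server memory (MB)', 8192;  -- e.g. cap it at 8 GB
-- RECONFIGURE;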
What if data is lost?
As we all know, RAM is volatile and prone to losing its data! Luckily, SQL Server is aware of this, and the issue is solved using a technique called "write-ahead logging". We'll go into more detail on what it is in the next article!
WRITE-AHEAD LOG
But, why?
Because Durability!! You wouldn't expect your DB to lose your data, would you? Heck, ensuring your data will be there no matter WHAT happens is one of the primary guarantees of any durable DB engine! It's the D in ACID!
But, how?
Save first, process afterwards!
As stated above, the idea is extremely simple: whenever a client sends a WRITE command (e.g. INSERT, DELETE, ALTER, CREATE INDEX, etc.), you IMMEDIATELY append it in its raw form to a log file. This ensures that the file contains a chained list of commands in the order they were received from the client. ONLY after the disk confirms that the write has, indeed, happened may the DB proceed doing other things! And this has MANY BENEFITS, as you will see!

Benefits?
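If you want to see those log records with your own eyes, the undocumented sys.fn_dblog function exposes the active portion of the transaction log; exploration only, and the column list can vary between versions:

-- Peek at log records of the current database.
SELECT TOP (20)
       [Current LSN],
       Operation,          -- e.g. LOP_INSERT_ROWS, LOP_MODIFY_ROW
       Context,
       [Transaction ID]
FROM sys.fn_dblog(NULL, NULL);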
Crash Recovery!
Should anything go wrong after you write data to the transaction log (e.g. SQL Server dies abruptly), you are 100% safe: you can RECOVER from it by simply REDOING everything from the transaction log! Literally just apply the SQL statements in the same order they were received and, guess what - you end up in the same place you were before! It's amazing, really!

Replication!
Want to make a clone of your current state? Just copy all the commands that led to your current state, replay them and - voila! You're at the same state as the Primary replica! Again - super simple and extremely effective!

Auditing!
[ Diagram: an Extent (64 KB) is made up of eight 8 KB pages. Extents come in two kinds: Mixed Extents and Uniform Extents. ]
How is Disk Space Being Managed
[ story of Allocation Maps ]
We've learned so far that SQL Server stores everything in Pages, which are then grouped into Extents. But how does SQL Server know which extents are in use and which are free? And what belongs to what, really? That's what Allocation Maps are for! Let's dive into them.
What are Allocation Maps?
They are just another type of Page! A Page whose job is to do the boring accounting work!
There are Three Main Types:
Global Allocation Maps (GAMs), Shared Global Allocation Maps (SGAMs) and Page Free Space (PFS).
GAM
Global Allocation Maps are the highest level of organization. They simply track which Extents are in use and which ones are free. That's it!

IAM
Index Allocation Maps are the most detailed ones! They exist for each table and keep track of the table's data and allocation units across the pages.
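If you're curious what a GAM actually looks like, it always lives at page 2 of a data file, and the undocumented DBCC PAGE command can dump it; FooBar is again just a placeholder database name:

-- Exploration only: dump the GAM of data file 1. The output lists extent ranges
-- marked ALLOCATED / NOT ALLOCATED.
DBCC TRACEON (3604);
DBCC PAGE ('FooBar', 1, 2, 3);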
How are Large Rows stored?

[ Diagram: a 500 KB chunk of data vs. a single 8 KB Page. ]

Types of Data Pages!
1. IN_ROW  2. OVERFLOW  3. LOB
The first one, IN_ROW data, is limited to the Page size. The second one is OVERFLOW_DATA, which gets split from IN_ROW data and stores only the overflow columns. The third one is Large Object Data (LOB), which stores texts, images, etc.
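You can check which of these three allocation types a table actually uses through the sys.allocation_units catalog view; dbo.Documents is a made-up table name:

-- Page counts per allocation unit type for one table:
-- IN_ROW_DATA, ROW_OVERFLOW_DATA and LOB_DATA.
SELECT au.type_desc,
       au.total_pages,
       au.used_pages
FROM sys.allocation_units AS au
JOIN sys.partitions AS p
    ON (au.type IN (1, 3) AND au.container_id = p.hobt_id)
    OR (au.type = 2       AND au.container_id = p.partition_id)
WHERE p.object_id = OBJECT_ID('dbo.Documents');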
Your row can't exceed 8KB in size!
[ except that it can! ]
Try to define a column larger than 8000 bytes? It gives an error - column size must be <= 8000 bytes. That's because the row has to fit into a single Page (8KB). Yet, you can define MULTIPLE VARCHAR(8000) columns and SQL Server is happy to do it. What's the trick?
The devil is in the details!
Define two fixed-size columns (e.g. CHAR(8000)) and you'll get an error. But if you change the other column to a VARIABLE-sized one (e.g. VARCHAR(800)) it will work.
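A minimal sketch of that behaviour, assuming made-up table names (the exact error and warning text will of course come from SQL Server itself):

-- Two fixed-size 8000-byte columns can never fit in one 8 KB page, so this fails outright:
CREATE TABLE dbo.TooWideFixed
(
    a CHAR(8000),
    b CHAR(8000)
);

-- With a variable-size column the table is created, because the variable column
-- can be pushed to a ROW_OVERFLOW page when a row gets too big:
CREATE TABLE dbo.WideButVariable
(
    a CHAR(8000),
    b VARCHAR(8000)
);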
The devil strikes! OVERFLOW IS BORN!
Columns that don't fit within the page size (8KB) get moved to ANOTHER page (called ROW_OVERFLOW), whereas the 'base data' remains on the initial IN_ROW page. This IN_ROW page will contain a pointer to the new page so that SQL Server knows where to find the 'additional' data.
Mind the Performance!
If your row spans multiple pages, then the Engine has to read all those pages in order to fetch all the necessary data! And that adds unnecessary I/O, which affects performance! For max performance, always aim at fitting your data into IN_ROW allocations if possible!
2 GB sized columns?
[ story of LOBs ]
8KB is the max column size you can define. But VARCHAR(MAX) allows up to 2 gigs!

We mentioned in the previous article that 8000 bytes is the maximum size of the column you can specify. And that's because a single column has to fit on a single Page (8KB in size).

Huh would be a proper reaction, right? What is VARCHAR(MAX) then? Turns out it's a huge BLOB that can store up to 2 gigs! And it can do so because it uses storage referred to as LOB data.
What is LOB data?
It's just a THIRD type of data row, alongside IN_ROW and ROW_OVERFLOW. It splits your data into a bunch of 8KB chunks that get spread over multiple pages. And the max size is 2 gigs, btw!

LOBs allow you to store large data. But with a penalty!
By defining a VARCHAR(MAX) or NVARCHAR(MAX) column you are effectively defining a LOB column that can hold up to 2 gigs of data. But that comes with a performance cost!
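As a small sketch (table name made up), this is all it takes to get a LOB column; the documented 'large value types out of row' table option additionally forces large values off the IN_ROW page even when they would fit:

-- A VARCHAR(MAX) column can hold up to 2 GB, stored as LOB data when needed.
CREATE TABLE dbo.Documents
(
    id   INT IDENTITY PRIMARY KEY,
    body VARCHAR(MAX)
);

-- Optional: always keep large values off the in-row page.
EXEC sp_tableoption 'dbo.Documents', 'large value types out of row', 1;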
Why the performance cost?
Your data ends up being spread over multiple pages! And given that a single page can hold up to 8KB, that means that 2 gigs of data would be spread across TONS of pages! That's a LOT of data to iterate through to fetch the value!

Should I avoid it?
If you can - sure! It just takes tons of space and requires additional IAM pages to track it. But, obviously, if you really need to store such large values - use it!

IAM to the rescue!
IAM pages are the ones that actually track which extents contain data from your table. And that's their only job!
There's one IAM per table
Each table gets its own IAM Page(s).

IAM Chains?
Each IAM page can track up to 4GB worth of extents allocated to your table. If your table is bigger than that - you'll need additional IAM page(s). Hence, they can be chained for as long as needed to cover all the table's data!
FileGroups & Partitions
[ Divide & Conquer at its finest ]
Why?
Instead of having a single huge binary blob, you split your DB into a bunch of smaller files. Smaller file = faster scan, easier backup, etc. Divide & Conquer, baby!
[ Diagram: NoPartitions.mdf stores all of Table A and Table B together in one file; with partitioning, Table A and Table B are split into separate parts. ]
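A minimal sketch of both ideas together, with made-up names and paths: add a second filegroup and file, then partition a table across filegroups by year:

-- New filegroup plus a data file living in it.
ALTER DATABASE FooBar ADD FILEGROUP Archive;
ALTER DATABASE FooBar ADD FILE
    (NAME = FooBar_archive, FILENAME = 'C:\Data\FooBar_archive.ndf')
TO FILEGROUP Archive;

-- Two boundary values => three partitions, mapped to three filegroups.
CREATE PARTITION FUNCTION pf_Year (INT) AS RANGE LEFT FOR VALUES (2022, 2023);
CREATE PARTITION SCHEME ps_Year AS PARTITION pf_Year TO ([PRIMARY], [PRIMARY], Archive);

-- The table's rows are routed to a partition based on sale_year.
CREATE TABLE dbo.Sales
(
    sale_year INT NOT NULL,
    amount    MONEY
) ON ps_Year (sale_year);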