Sybase Data Storage & Fragmentation
Software Gems Pty Ltd
Derek Asirvadem
V2.5
04 Sep 12
Introduction
Purpose
1 This Software Gems document defines the physical elements of a Sybase ASE database; it assists in understanding the terminology in the manuals,
and the operation of ASE. Indeed, it overcomes the problem of abysmal manuals in that subject matter.
2 There is an awful lot of shallow, inaccurate, misleading and false information on the Internet. Unfortunately some of that false or misleading
information is published by Sybase, both in the manuals, and on the web. This document is therefore rendered to provide full and complete
information (albeit very condensed), such that the reader is no longer vulnerable to false or confusing information on the subject.
Structure
This document combines three closely related HTML documents into a single PDF, and resolves the links. It remains in three Parts, with a single
numbering scheme (19 chapters) throughout (Levels are numbered in Roman numerals). When it is relevant, the section presents APL vs DPL/DRL
LockSchemes separately. The definitions are Normalised, and cross-referenced. Virtually all objects can be selected, to open further detail.
Sybase Data Storage
The elements of data storage units, their relations, and their types. This is a pre-requisite to the second part.
1 Unit
Units of data storage, their relations, the hierarchy
2 DataStructure
The five possible DataStructures that constitute a table, four of which are fully illustrated and examined
3.1 Heap
3.2 Clustered Index
3.3 Nonclustered Index
3.4 Placement Index
4 Data Model/Catalogue
Explains the entities in the Sybase ASE catalogue that pertain to Data Storage
5 Data Model/DataStruct
Presents all the elements relevant to Data Storage in the form of a Relational Data Model
Sybase Fragmentation
Definition & identification of the three distinct levels of fragmentation & the types within them;
determination of each level/type; followed by chapters for each level/type
6 Definition
Defines Fragmentation, Levels, terminology and differentiates the types
7 Determination Level I Level II Level III Partition
Guidance on the accurate determination of each Level/Type of Fragmentation
8 I Allocation Unit
Identifies Fragmentation in AllocationUnits & Extents within AllocationUnits
9 I Drop-Create
Why Drop-Create Clustered Index does not return Asynch Pre-Fetch & Large I/O
10 I Segment
The value of Segments
12 II Page Chain
Identifies and discusses Fragmentation in the Page Chain
13 II Overflow Page
Identifies and discusses Fragmentation in Overflow Pages
14 II Unused Space/Extent
Identifies Fragmentation in Unused Space in Extents
15 II Unused Space/Page
Identifies Fragmentation in Unused Space in Pages
17 III Page
Identifies Level III Fragmentation (DOL only): Rows within Pages, displaced rows
19 Index Type
Compares APL vs DOL from an Index Type perspective.
Education
• This document is actually a consolidated version of a selection of the Memory Tag pages from our courses.
• We do not provide ordinary SQL and Sybase courses, there are many providers.
• However, as true performance experts, we provide specialist Sybase Quality & Performance courses at both the DBA and Developer level, which
allow you to take full advantage of your software investment.
• We also provide high performance, standard-compliant Relational Database Design and education.
• There is no substitute for formal, qualified education. Please inquire if you need further detail, or you have an interest in improving your
Sybase performance or SQL coding.
• As such, they are detailed, very condensed and complete, but of course, the scope is limited.
Manual
These documents are provided to complement the Sybase manuals, and to correct them, as follows:
• they contain information that is not in the manuals (ie. they overcome the lack of information)
• where the manuals contain contradictory information, the correct version only, is provided, the goal is to eliminate confusion and half-truths !
• where misleading or false technical terms are used, correct technical terms are used instead
• they bring all the relevant information about a subject together, in one place
Document Status
What was once a few single pages made available on the web has, due to interaction with the Sybase community, been consolidated into a single
document, and expanded. It remains a collection of diagrams from our course documents, in a terse, condensed, diagrammatic style, rather than one of our
usual polished final documents, which some of you have come to expect. Progress (adding diagrams and explanatory text) is made between assignments,
based on questions and feedback received.
Version
V2.0 12 Sep 11 Consolidation of three previous docs; full exposition to 14 pages; first open publication; enabled HTML Image Map
V2.5 28 Mar 12 Data Storage (now 9p); Definition & Determination added (8p); Fragmentation (now 12p); PDF version (now 31p).
It is valid for Sybase ASE versions 12.5.4.x and 15.x. Yes.
Copyright
The entire document is the property of, and copyright, Software Gems Pty Ltd. It is provided free of charge to assist the Sybase community in server
and database administration, where no fee is charged. Permission is granted to copy or distribute this document, as long as it remains unaltered; with
the copyright notice intact; due credit is given to the author; and the distribution and ensuing consultation remains free. Contact us re commercial use.
Moral Right & Contact
The author is Derek Asirvadem, Information Architect and Sybase performance specialist; he is solely responsible for the content. He welcomes
constructive commentary and answers questions for professionals (click the link at the bottom of the page).
2 of 32 • Sybase Data Storage & Fragmentation Copyright © 2012 Software Gems Pty Ltd Derek Asirvadem • V2.5.1 • 12 Sep 15
1 Data Storage Unit
First we need to understand the different Data Storage Elements, what they contain, how they relate to each other, and their Units of
Measure. This is presented in its natural hierarchy, from top to bottom, largest to smallest, and identifies the Pages used to control space management.
[Figure: the hierarchy of Data Storage Units. Labels: Devices; Database (specified in MB); Segments; DataStructures; Page; Rows; Forwarded Row; Deleted Row.]
DOL Heap (Always)
• RowIds do not change
• DELETES Marked but not Removed
• Expanded Rows Forwarded
• Interspersed INSERTS at end
• No Clustered Index
APL Heap (When No CI)
• Rows shifted on Expand/Contract/INSERT/DELETE
• Chronological Order
• INSERTS at end
Clustered Index
• Index/Row Order maintained
• Rows shifted on Expand/Contract/INSERT/DELETE
• Heap eliminated
• Page Splits when Full for interspersed INSERTS
Nonclustered Index (including Placement Index)
• Index Entries & RowId
• Entries shifted on INSERT/DELETE
Text/Image Chain
• The entries are the content of a single Row/Column
• Allocated in units of Pages
1.1 AllocationPage
• The first Page of each AllocationUnit is the AllocationPage; it identifies:
• the 32 Extents that the AllocationUnit contains
• the Physical DataStructure residing in each Extent (identified by ObjectId, IndexId and PartitionId)
• pointers to the OAMPages of those Physical DataStructures, and
• the space available in each Extent, and in each Page of each Extent.
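The relation between a Page, its Extent and its AllocationPage can be sketched as simple arithmetic. This is a minimal illustration, not ASE code: it assumes 8 Pages per Extent (so 32 Extents give 256 Pages per AllocationUnit), and the function names are hypothetical.

```python
# Assumption: 8 Pages per Extent, 32 Extents per AllocationUnit (256 Pages).
PAGES_PER_EXTENT = 8
EXTENTS_PER_AU = 32
PAGES_PER_AU = PAGES_PER_EXTENT * EXTENTS_PER_AU  # 256


def allocation_page(page_no: int) -> int:
    """Page number of the AllocationPage (first Page of the AllocationUnit) governing page_no."""
    return page_no - (page_no % PAGES_PER_AU)


def extent_index(page_no: int) -> int:
    """Index (0..31) of the Extent, within its AllocationUnit, that holds page_no."""
    return (page_no % PAGES_PER_AU) // PAGES_PER_EXTENT


def locate(page_no: int) -> dict:
    """Hypothetical helper: describe where a Page sits in the storage hierarchy."""
    return {
        "allocation_page": allocation_page(page_no),
        "extent": extent_index(page_no),
        "page_in_extent": page_no % PAGES_PER_EXTENT,
    }
```

For example, under these assumptions Page 300 falls in the AllocationUnit that starts at Page 256, in the sixth Extent (index 5) of that unit.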
1.2 ObjectAllocationMap
[Figure: the ObjectAllocationMap points to AllocationUnits (AU0, AU256, AU512, AU768, AU1024, AU1280); the AllocationPage of each locates the Extent.]
• Just as the first Page of an AllocationUnit is the AllocationPage, the first Page of a DataStructure is the ObjectAllocationMap (OAM).
• It contains a linked list of the AllocationUnits in which Extents belonging to the DataStructure reside.
• The AllocationPage of each AllocationUnit is then interrogated to locate the Extent.
• The AllocationPage identifies which Extents & Pages have free space. If such exists, this allows rows in the
DataStructure to be placed close to other rows, however it is quite independent of rows in other DataStructures.
• If more than one Page is required for the OAM, a linked list of OAMs is provided
• While the OAM provides a second access path to the DataStructure, it is especially relied upon during Table Scans of
DOL Heaps, since they do not have PageChains.
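The OAM access path described above can be modelled in a few lines. This is a sketch under stated assumptions, not the ASE implementation: the class and function names are hypothetical, 8 Pages per Extent is assumed, and the sketch ignores the fact that the AllocationPage itself occupies the first Page of its AllocationUnit.

```python
from dataclasses import dataclass, field


@dataclass
class AllocationPage:
    # Extent index within the AllocationUnit -> (ObjectId, IndexId, PartitionId) that owns it
    extents: dict = field(default_factory=dict)


def oam_scan(oam_entries, allocation_pages, key, pages_per_extent=8):
    """Yield the page numbers of every Extent owned by `key`, using the OAM method.

    oam_entries      : AllocationUnit start pages listed in the OAM of the DataStructure
    allocation_pages : {AllocationUnit start page: AllocationPage}
    """
    for au in oam_entries:                 # the OAM lists the AllocationUnits to visit
        ap = allocation_pages[au]          # interrogate the AllocationPage of that unit
        for ext, owner in sorted(ap.extents.items()):
            if owner == key:               # Extent belongs to this DataStructure
                first = au + ext * pages_per_extent
                yield from range(first, first + pages_per_extent)
```

Note that the scan never follows a PageChain: it needs only the OAM and the AllocationPages, which is why it remains available for DOL Heaps.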
2 DataStructure
This chapter introduces Sybase ASE DataStructures, again in logical order, and illustrates how they relate to each other.
1. Table
• a Table has a single entry in sysobjects WHERE type = "U"
• the Primary Key is (id), as in OBJECT_ID() or ObjectId
• a Table is a collection of Logical DataStructures
The catalogue tables may be easier to understand if they had been named:
• sysindexes: sysLogicalStruct
• syspartitions: sysPhysicalStruct
2. Logical DataStructure
• each Logical DataStructure has a single entry in sysindexes, which defines its logical structure, keys, etc
• the Primary Key is (id, indid); indid identifies the DataStructure Type
• There are five types of Logical DataStructure (the APL Heap and DOL Heap are very different, as detailed in the next
chapter):
[Figure: a Table (sysobjects.id, type U) and its five Logical DataStructure Types]
Logical DataStructure Type   LockScheme   Note
DOL Heap                     DPL/DRL      Always
APL Heap                     APL          Only when no CI
Clustered Index              APL          Eliminates the Heap
Nonclustered Index           Any
Text/Image Chain             Any          one for all Text/Image columns in the table
3. Physical DataStructure
• each Logical DataStructure is rendered physically as one or more Physical DataStructures
• the Heap or Clustered Index, which contains data rows, may be divided into several Physical DataStructures, called Partitions
• the Nonclustered Index and Text/Image Chain are not Partitioned
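[Figure: sysindexes.indid identifies each Logical DataStructure; the DOL Heap, APL Heap and Clustered Index are shown with Partitions, the Nonclustered Index and Text/Image Chain without.]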
• each Physical DataStructure has a single entry in syspartitions, which defines its physical structure, Data Storage location, etc
• hence the silliness in the manuals that "unpartitioned objects have one partition"
• the Primary Key is (id, indid, partitionid)
[Figure: syspartitions.partitionid identifies each Physical DataStructure: DOL Heap Partition, APL Heap Partition, CI Partition, Nonclustered Index, Text/Image Chain.]
4. Partitioned DataStructure
There are, therefore, five types of Physical DataStructure, and the Heap or the CI may be Partitioned.
5. In summary, a DataStructure is
• an independent Data Storage structure that is
• first, belongs to a Table (ObjectId)
• second, one of five logical types (IndexId)
• third, a physical structure, which may be a Partition (PartitionId)
During the discussion of logical or physical DataStructures, non-technical terms such as 'table', 'base table' and 'object-index pair'
are too ambiguous to be meaningful: those who use them are committed to your continued confusion.
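The three-level identification summarised above (ObjectId, then IndexId, then PartitionId) can be sketched as three record types whose keys extend one another. This is an illustrative model only: the class names, the helper and any sample values are hypothetical, and the real catalogue tables carry many more columns.

```python
from typing import NamedTuple


class Table(NamedTuple):
    """One entry in sysobjects (type U); Primary Key (id)."""
    id: int


class LogicalDataStructure(NamedTuple):
    """One entry in sysindexes; Primary Key (id, indid)."""
    id: int
    indid: int


class PhysicalDataStructure(NamedTuple):
    """One entry in syspartitions; Primary Key (id, indid, partitionid)."""
    id: int
    indid: int
    partitionid: int


def physical_for(logical, physical_rows):
    """Return the Physical DataStructures that render one Logical DataStructure."""
    return [p for p in physical_rows if (p.id, p.indid) == (logical.id, logical.indid)]
```

An unpartitioned Logical DataStructure simply resolves to exactly one Physical DataStructure, which is all that the "one partition" wording in the manuals amounts to.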
6. The five types of Physical DataStructure, the first three of which may be Partitioned, are located on Devices, which are identified by Segment:
[Figure: the five Physical DataStructures (DOL Heap Partition, APL Heap Partition, CI Partition, Nonclustered Index, Text/Image Chain), keyed by sysindexes.indid and syspartitions.partitionid, located on Devices by Segment.]
2.1 Segment
A Segment 1 is a logical group of one or more Devices, within a database. A good Segment Plan has two fundamental purposes:
1. It allows DataStructures to be distributed for load balancing purposes:
• separating the data (CI or Heap) of a single table from its related NCIs
• separating the different tables within a Transaction
• separating the Partitions of a table, in order to support full parallelism
2. It drastically reduces Level I and II Fragmentation, which would otherwise be massive.
3. Either a Logical DataStructure (all Partitions in the DataStructure) or a Physical DataStructure (a single Partition) may be placed on a Segment.
• placing all the Partitions of a DataStructure on one Segment/Device has the same I/O contention as an unpartitioned DataStructure (shown)
• placing each Partition of a DataStructure on a separate Segment/Device eliminates that contention, and maximises parallelism (not shown)
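The placement choices above can be checked mechanically once a Segment Plan is written down. The following is a minimal sketch, not an ASE utility: the placement mapping, the Segment names and the function name are all hypothetical, and it only reports which Segments are shared by more than one DataStructure.

```python
from collections import defaultdict


def shared_segments(placement):
    """Return {segment: [DataStructures]} for every Segment that holds more than one.

    placement: {(table, indid, partitionid): segment_name}
    """
    by_segment = defaultdict(list)
    for data_structure, segment in placement.items():
        by_segment[segment].append(data_structure)
    return {seg: sorted(dss) for seg, dss in by_segment.items() if len(dss) > 1}
```

With all Partitions of one table on a single Segment, the function reports that Segment; with each Partition on its own Segment, it reports nothing, which corresponds to the contention-free placement described above.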
3.1 Heap
This chapter discusses the APL Heap and the DOL Heap, and their characteristics.
[Figure: APL Heap and DOL Heap. APL Heap: INSERTed Rows at End. DOL Heap: Scans via OAM Method; INSERTed Rows at End; Forwards: Overflow Pages.]
APL Heap
• Table scans via PageChain
• INSERTS are placed at the end of the Heap
• Pages are kept trim; rows are contiguous
• Rows within the Page are shifted upon DELETE and UPDATE (Row Expansion/Contraction)
• Row Expansion may cause the row to be moved to the end of the Heap, changing the RowId
• If there are NCIs, the RowIds need to be updated
DOL Heap
• Table scans via OAM method only
• RowIds do not change
• Deleted rows are marked for delete but not deleted (they are deleted, and the space is reclaimed, during REORG or aggressive Garbage
Collection)
• If space is available in the current Page or Extent of the Heap (as a result of reserving same), the Forwarded Row or interspersed INSERT is
placed there; otherwise (the usual case) it is placed at the end of the Heap. The intended and actual locations are nowhere "near" the
original location and nowhere "near" the Placement Index, refer to section [8.3] and [9.5]. Forwards accumulate in Overflow Pages.
• When a row is Forwarded, the NCIs (including the PI) must access the original location, to obtain the forward address, then access the
Forwarded Row.
• Contracted Rows are not repatriated
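The cost of a Forwarded Row in a DOL Heap can be illustrated with a toy model. This is a sketch under simplifying assumptions, not ASE behaviour in detail: a RowId is a plain integer rather than a Page/Row address, every Row Expansion is treated as having no room in place, and the class and method names are hypothetical.

```python
class DolHeap:
    """Toy DOL Heap: RowIds never change; an expanded row is forwarded to the end."""

    def __init__(self):
        self.rows = {}      # RowId -> ("row", data) or ("fwd", target RowId)
        self.next_id = 0

    def insert(self, data):
        """Place a row at the end of the Heap and return its RowId."""
        rid = self.next_id
        self.next_id += 1
        self.rows[rid] = ("row", data)
        return rid

    def expand(self, rid, data):
        """Row Expansion: write the new image at the end, leave a forward address behind."""
        target = self.insert(data)
        self.rows[rid] = ("fwd", target)   # the RowId held in the NCIs stays valid

    def fetch(self, rid):
        """Return (data, number of row accesses needed) for the RowId held in an NCI."""
        kind, value = self.rows[rid]
        if kind == "fwd":
            return self.rows[value][1], 2  # original location, then the Forwarded Row
        return value, 1
```

The model shows the trade described above: nothing in the NCIs needs updating when a row is Forwarded, but every access to that row then costs two visits instead of one.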
3.2 Clustered Index
This section discusses the Clustered Index, and its characteristics.
[Figure: Clustered Index. The Clustered Index is Sparse; PageChain at every Index Level; B-Tree Entry; the Leaf Level is the Data Row.]
• The Index B-Tree is clustered with the data rows, into a single DataStructure
• The Leaf level of the B-Tree is the data row (put another way, there is no Leaf level, the B-Tree is clustered with the data rows)
• Creation of the CI eliminates the Heap; dropping the CI returns the Heap
• One less logical Read on every access
• There are still two OAMs to allow independent access
• All the DataStructures belonging to an APL table are Clustered Index based
• Index order = Row Order
• Rows are distributed as per Index Key, and remain so
• Designed for
  • Relational Keys (compound or composite keys)
  • Range Queries
• INSERTS into Key location:
  • For Interspersed INSERTS, if the page is full, a Page Split is necessary, and the RowIds (in the split Page) which are
  referenced in any NCIs must be updated
• Pages are kept trimmed
  • On Expand/Contract/INSERT/DELETE Rows in the CI may be shifted within a Page, without additional overhead, maintaining
  free space in the page
• According to the Relational Model, rows in a table must be unique. The Clustered Index is designed for Relational tables, and to be
unique, and therefore should be.
  • Non-unique keys cause Overflow Pages.
• Despite the demanded "clustered" syntax, there is no such thing as a DOL "clustered" index or DOL "clustered" table. The DataStructure addressed
is in fact a Placement Index.
• There is nothing remotely like the Clustered Index available for DOL tables.
Confirmation
If anyone suggests that DOL "clustered" indices do exist, run this simple query on a database that has both APL Clustered Indices and DOL
"clustered" indices. Study the DataStructure chapter, along with the report, and ask them why, as far as Sybase ASE internally is concerned:
• Clustered Indices always appear without a Heap
• Heaps always appear without a Clustered Index
• Placement Indices are Nonclustered Indices
• Placement Indices always appear with a Heap (which means they are two separate Logical, and therefore Physical, DataStructures)
Such persons evidently have little technical knowledge of Sybase.
All the technical evidence from all the functions and catalogue components, is consistent. Even a simple query demonstrates the truth. It
can be extended to show other items as desired.
A man and a woman are meant to be married; together they achieve more than each achieves separately. Implementing APL tables without a Clustered
Index, is analogous to a divorced couple. Likewise, there is no fidelity in non-unique Clustered Indices.
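The four observations listed under Confirmation can be expressed as a small check over the indid values recorded for a table. This is not the query referred to in the text, which is not reproduced here; it is a hedged sketch with hypothetical names. It assumes the commonly documented indid convention (0 = Heap, 1 = Clustered Index on an APL table, 255 = Text/Image Chain, anything else = Nonclustered Index), and the LockScheme strings are illustrative labels.

```python
def classify(indid, lock_scheme):
    """Map an indid to a DataStructure Type, under the assumed indid convention."""
    if indid == 0:
        return "APL Heap" if lock_scheme == "APL" else "DOL Heap"
    if indid == 1:
        return "Clustered Index"
    if indid == 255:
        return "Text/Image Chain"
    return "Nonclustered Index"   # includes the Placement Index


def check_table(indids, lock_scheme):
    """Report whether a table shows a Heap, a Clustered Index, and exactly one of the two."""
    kinds = {classify(i, lock_scheme) for i in indids}
    has_heap = any(k.endswith("Heap") for k in kinds)
    has_ci = "Clustered Index" in kinds
    return {"heap": has_heap, "ci": has_ci, "exclusive": has_heap != has_ci}
```

Fed with the indid values actually reported for an APL table that has a Clustered Index, and for a DOL table that has a "clustered" index, the check makes the difference between the two visible: the first has a Clustered Index and no Heap, the second a Heap plus a Nonclustered Index.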
3.3 Nonclustered Index
This chapter discusses the Nonclustered Index, and its characteristics under the different LockSchemes.
[Figure: Nonclustered Index under APL and under DOL. In both cases, Nonclustered Indices are Dense.]
APL
• If there is no space available in the NCI for interspersed INSERTS, the Index Page must be split.
  • This disturbs the PageChain
• The NCI contains the RowId in the CI; when the row moves (as the CI is re-ordered and kept trim), the NCIs need to be updated.
DOL
• If there is no space available in the NCI for interspersed INSERTS, the Index Page must be split.
  • This disturbs the PageChain
• The Placement Index is a Nonclustered Index, with a couple of additional attributes.
• The NCI contains the RowId in the Heap; the rows do not move, and so there is nothing to update in the NCIs (including the PI). This is better
stated as, in order to eliminate updating the NCIs, the rows in the Heap are designed to be static.
3.4 Placement Index
This chapter discusses the Placement Index, and its characteristics.
[Figure: Placement Index and Heap. NCI B-Tree: PageChain at Leaf Level only; B-Tree Entry; Leaf Level holds IndexKey and RowId. Heap: Data Row; no PageChain, scans must use the OAM method.]
DOL tables always have a Heap. They may have a single Placement Index. It is a Nonclustered Index (there is no structural difference), a separate
DataStructure to the Heap, with two additional criteria:
1. It identifies the initial placement of rows in the Heap
2. Any settings made, such as placement ON segment and FILLFACTOR, apply to the Heap as well.
As such, its relationship to the Heap is slightly closer than that of other NCIs, but that does not constitute clustering à la Clustered Index
(a term which existed before its advent). Note that they are separate by design.
• This initial row placement is not maintained under:
  • interspersed INSERTS
  • DELETES and
  • UPDATES that cause Row Expansion
• The Index & Heap remain two separate DataStructures; two OAMs
• Two Logical Reads on every access (via any NCI, including the PI)
• Key order in each NCI is maintained, but Row order in the Heap cannot be maintained
• The Heap is Static RowId based.
• Other than to rebuild the Heap, there is no value in a Placement Index
• Range Queries are not possible, since it is not a Clustered Index (there is no order to the Heap, and it does not have a PageChain).
• Ideal for non-relational Keys (surrogates, monotonic)
There is no equivalent on the APL side. A rough equivalent would be:
• a Heap (ie. where a Clustered Index has been actively avoided, thereby crippling it).
• but even then the APL Heap has a PageChain, providing faster scans
• plus a Nonclustered Index
The Placement Index is not comparable to a Clustered Index, which is available only for APL
• It has no clustering (as per the definition of that term since 1984); the B-Tree is not clustered with the data rows, forming a single physical
DataStructure; it remains a separate DataStructure to the Heap
• There is no such thing as a DOL "clustered" Index
• The use of the term "clustered" Index in relation to DOL tables is therefore incorrect, confusing, and fraudulent.
• The correct term, as per some, but not all, Sybase documentation, is Placement Index
• Unfortunately, to address the Placement Index or the Heap, one is required to use the "clustered" syntax. Talk about forced confusion.
DOL tables have an additional third level of Fragmentation; they get fragmented at this level very quickly, and require regular REORG. The above
illustrates a fresh, unfragmented Heap and Placement Index; section [18] illustrates a fragmented Heap and Placement Index.
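The point that initial placement is not maintained can be seen in a toy model of Heap order. This is an illustration only, with hypothetical function names: rows are represented by their key values, the initial build places them in key order, and every later interspersed INSERT is taken to land at the end of the Heap (the usual case described in the Heap chapter).

```python
def heap_after_inserts(initial_keys, later_keys):
    """Heap order: initial rows placed in key order, later INSERTs appended at the end."""
    return sorted(initial_keys) + list(later_keys)


def out_of_order_pairs(heap):
    """Count adjacent rows whose keys are not in ascending order."""
    return sum(1 for a, b in zip(heap, heap[1:]) if a > b)
```

Key order in the index remains intact throughout; it is only the row order in the Heap that degrades, which is why a key range no longer maps to a contiguous run of rows.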
4 Data Model • Catalogue
A formal Relational Data Model is the best way to understand data, and its relations. This chapter presents the entities in the catalogue that pertain to
Data Storage elements, in terms of a formal Data Model (Entity Relation level), rendered in IDEF1X. Specifically, it shows the catalogue in which
information about each Data Storage Unit is stored.
[Figure: Data Model of the catalogue entities that pertain to Data Storage, in IDEF1X Notation, grouped as Distribution, Logical and Physical.]
[1] This models the normal case: exceptional cases, such as the mandatory logsegment, which may or may not be correctly deployed, are not differentiated.
5 Data Model • DataStructure
This chapter exposes the five types of DataStructures, starting from the catalogue, in terms of a formal Data Model (ER level).
[Figure: Data Model of the five DataStructures. Entities: Table (sysobjects = U); ObjectAllocationMap; AllocationUnit; AllocationPage; Page; 1 DOL Heap; DOL Row (RowId); Deleted DOL Row; Forwarded DOL Row; 2 APL Heap; APL Heap Row; 3 CI; CI B-Tree; CI Leaf (Row); NCI; NCI Leaf; 5 Text/Image Chain. Relation labels: Based on IndexId; Has; May Contain; May Be; Is; Locates [1]; Identifies [1].]
There are always at least two paths to the data. That a Page belongs to a specific DataStructure is directly identifiable (grey
relation); but the DataStructure consisting of Pages is not directly identifiable by this means. The PageChain or OAM provides that.
6 Definition
This document defines and discusses all aspects of Fragmentation, in substantial detail (albeit condensed) as it occurs in Sybase ASE.
The document is laid out as follows:
• this introduction, containing definitions and approach
• the impact of fragmented DataStructures
• Definition of every Type of Fragmentation, within each of the three Levels
• four sections identifying how Fragmentation can be determined accurately, and without confusion, fully detailed
• a section on evaluation of the various determinants
• an additional section of issues relating to Partitioned DataStructures
• eleven sections discussing the different Types of Fragmentation within each Level, fully illustrated and discussed
In particular, the level of detail provides information so that Fragmentation can be fully understood and therefore prevented, and leads up to why common
methods of correcting Fragmentation do not work. Put another way, the detail identifies why Fragmentation must be addressed using an overall approach,
at all three levels, if substantial performance gains are sought. It is not a point problem, and therefore point solutions do not apply.
Understanding the Data Storage structures that Sybase uses, is a pre-requisite to understanding Fragmentation.
• A table does not exist physically; it exists as a collection of Physical DataStructures: when a query is executed, it is the DataStructures that belong
to the table that are accessed. In order to administer tables efficiently, the DataStructures and how they are accessed, must be clearly understood.
Level
The three Levels of Fragmentation are quite independent of each other, and can be differentiated easily. It is quite possible for a DataStructure to be
fragmented at one Level and free of Fragmentation at another Level: indeed, each Level requires quite different correction operations, and they
affect only that Level. The highest performance is obtained when all three levels are addressed.
Frequency
The frequency of correction operations for each Level, is also different:
• Level III de-fragmentation (REORG REBUILD or DROP/CREATE CI or "CI") is required weekly at a minimum.
• Level II is dependent on
  a. whether a good Segment plan has been implemented, and
  b. the turnover within the DataStructure.
  The frequency required varies from monthly to annually. A good Segment plan and a well designed Clustered
  Index may well eliminate the need for de-fragmentation altogether.
• Level I de-fragmentation is required once, if it is done properly. It provides
  a. the basis for reduced fragmentation at Level II
  b. reduced frequency of Level II de-fragmentation operations, because it renders the correction operations at
  Level II more permanent.
It is normal to de-fragment a DataStructure at Level II because it is demanded presently, but to leave a full de-fragmentation
operation of Level I to a separate maintenance window, addressing many DataStructures together, because it requires reasonable
planning and the scripts require testing, etc.
What it is Not
Administrators are sometimes confused by the masses of misinformation either available on the internet, or presented by Storage Teams who are
avoiding work, or hardware salesmen who are selling something on the false basis that it will result in less work for the DBA. To address this, it is
important to understand what Fragmentation is not:
• Hardware Striping equals Fragmentation
The SAN (or Logical Volume Manager) and Sybase ASE are completely independent of each other. ASE treats the Logical Volume as a
contiguous series of disk blocks. Whether the LV is striped or not is irrelevant to ASE; Fragmentation; performance; etc. Striping affects only the
speed of the LV within the hardware unit. De-fragmentation operations within ASE reclaim performance within ASE.
• If you use a SAN, you don't need Segments
See above. Total lack of technical ability and logic. My father works 50 hours a week, therefore your father does not need to work.
• Partitions equals Fragmentation
When the Partitions of a table (Physical DataStructure) are placed on several Devices or Segments, for performance purposes, by design, it is
distribution not fragmentation, and the result is substantially different to the fragmentation that occurs when there is no design.
• Data Distribution equals Fragmentation
Substantial performance can be gained in Relational tables when the Key (usually composite Keys) is used to distribute the data 1, and therefore
decrease contention. That is again, by design, and space must be reserved for interspersed INSERTS. Such reserved space is not the same as
unused or waste space, which cannot be used for interspersed INSERTS.
What it Is
Level I
Database Fragmentation: the unplanned or unconscious occupation of space, and the disturbed contiguity, of DataStructures across the Database.
Level II
DataStructure Fragmentation: the unplanned or unconscious occupation of space, and the disturbed contiguity, within the DataStructures.
Level III
Page Fragmentation: the unplanned or unconscious occupation of space, and the disturbed contiguity, within the DataStructures, in systems that
have been implemented quickly and without OLTP Standards or Relational technology.
1. That is not possible in record filing systems, where surrogate keys (single-column; monotonic) are used across the board.
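The notion of "disturbed contiguity" in the definitions above can be made concrete with a simple count. This is a hypothetical illustration, not one of the determinants covered in chapter [7]: it assumes 8 Pages per Extent, takes the starting page number of each Extent belonging to one DataStructure, and counts how many separate contiguous runs those Extents form.

```python
def contiguous_runs(extent_starts, pages_per_extent=8):
    """Number of contiguous runs formed by the Extents of one DataStructure.

    extent_starts: first page number of each Extent (any order).
    One run means fully contiguous; more runs mean more breaks in contiguity.
    """
    starts = sorted(extent_starts)
    if not starts:
        return 0
    breaks = sum(1 for a, b in zip(starts, starts[1:]) if b - a != pages_per_extent)
    return 1 + breaks
```

A DataStructure whose Extents start at Pages 8, 16 and 24 forms a single run; one whose Extents start at 8, 16, 40, 48 and 256 forms three.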
6.1 Impact
This document is written for the qualified Sybase Database Administrator, and the subject is Fragmentation. As such, it does not detail how
the I/O subsystem; disk resources; caches and their configuration; etc, operate. It is expected that the reader understands all that, and therefore appreciates
the relevance of maintaining DataStructures in an un-fragmented state. However, there are basic features within Sybase ASE, that are commonly
unappreciated and therefore unused. It is a shame that in many sites, Sybase operates at a mere fraction of the speed that it is capable of.
Two such features that are fundamental to ASE delivering great speed when accessing the DataStructures, are described here.
The impact of fragmentation is usually a subjective issue: people are used to a certain level of response from their queries, when the database contains a
somewhat higher population than it did during the initial testing, the response slows down. It is an awareness that is quite real, but unscientific.
• the loss of speed is certainly the result of naïve server installation and configuration, and a lack of planning and configuration at the Device and
Segment levels
• that loss of speed is not necessary: the server and its resources can be configured, such that response does not slow down with population, even
with very large tables 2
• that subjectivity is relevant only in the absence of science and knowledge; chapter [7] details the accurate determination of fragmentation, such that
science and knowledge can be used instead of subjectivity
• the initial value of that subjective sense of speed is actually quite low (since the query did not enjoy the benefit of proper configuration, and thus
the use of Asynch Pre-Fetch and Large I/O), and therefore the users are in reality comparing 'slow' with 'very slow' on the scale of possible speed;
they have never enjoyed 'fast' and they do not know what they are missing.
Level I
Correcting Level I Fragmentation returns great speed to the DataStructures, due to enabling Asynch Pre-Fetch and Large I/O to their maximum
extents. It allows Sybase to operate at the 'fast' end of the possible speed spectrum. Further, it contains and therefore reduces the extent of Level II
Fragmentation 3.
Level II
Most DBAs are aware of some of the aspects of Level II Fragmentation, and how to correct it. There are some traps for young players, as detailed in
chapter [9], ignorance of which will cause de-fragmentation operations to be very transient, to have no persistence. However, without an awareness of
Level I, the baseline speed is 'slow' and the frequency of de-fragmentation operations is increased.
Level III
This is mainly the consequence of storing unnormalised spreadsheets in a database container, as opposed to storing Normalised Relational tables. One
has to live with the consequences of such actions, and deal with the myriad problems, such as fragmentation of a new order; frequent and offline
maintenance of DataStructures; reduced concurrency (increased contention); increased number of locks; etc.
2. Contrary to most articles on the web, Sybase is quite capable of high speed on very large tables. Archiving history data onto a separate database; the
consequent requirement to modify code (to look in two places for one thing); the maintenance of an archive database; the loss of DRI, are all quite
unnecessary.
3. Software Gems provides a High Performance Sybase Configuration, that ensures the server is operating at the highest levels of performance. We also
provide a complete Device & Segment [re-]configuration, such that Level I issues are eliminated. Both on a fixed price, guaranteed result basis.
Derek Asirvadem • V2.5.1 • 12 Sep 15 Copyright © 2012 Software Gems Pty Ltd Sybase Data Storage & Fragmentation • 13 of 32
6.2 Fragmentation Type
It is convenient when the Type identifies the exact location of the Fragmentation within the Database or DataStructure; other
forms of identifying the Type are confusing. In order to fully understand the three Levels of Fragmentation, and types of Fragmentation within each
Level, let us look at the best and worst scenarios in each Level and Type. Your DataStructures will be either one or the other, there is no 'in-between';
however, after correction operations using an overall plan have commenced, the DataStructures will move into that 'in-between' zone.
4. The DOL Heap (containing the data rows), has no PageChain; all scans must use the OAM method
5. The same Result identified at Level I, modulated to the scope identified by Location/Type (the row).
6. Duplicate rows (Keys) are illegal in Relational Databases.
7. It is a good practice to plan and allocate extra space in the Pages and Extents of the DataStructure that contains the data rows, to allow for interspersed
INSERTs; such planned space is not considered unused. Unused Space is specifically the space consumed that is unplanned or unconscious.
7 Determination
This chapter explains how Fragmentation at each Level and Type (explained in the previous chapter), for each type of DataStructure
can be determined accurately, and evaluated. The next three sections provide information specific to each of the three Levels of Fragmentation; the
fourth section identifies issues relating to Partitions.
7.1 Determination I
There are no Sybase facilities for identifying Level I Fragmentation; it requires proprietary code, such as our HelpSpace or PhysicalSpace utility, the
report of which is shown here.
7.2 Determination II Space
First, we will examine the basic space metrics related to Level II Fragmentation of the Logical DataStructures, summarising the
underlying Physical DataStructures (Partitions) to the logical level. For non-partitioned DataStructures, this is all that is required. A simple query from
sysindexes, which identifies each Logical DataStructure, is required 1 2.
Table              Lck        Row  Fwd  Del  Struct  IndexName       Idx_KB  Unused  Used_%  Data_KB  Unused  Used_%    LGIO   SPUT    DPCR    IPCR    DRCR
TestBase_APL       APL  2,000,010            Clst    UC_SecurityId      508      96    81.1   89,020     124   99.86   99.96  93.74   99.99
                                             NC1     U__Name         75,720      38   99.95                            98.92                  99.64   81.85
TestBase_APL_Heap  APL     80,000            Heap                                              3,660     100   97.27   99.62  93.63   99.87
                                             NC2     U__Name          3,056      28   99.08                            99.08                  99.69   81.68
TestBase_APL_Loc   APL  2,000,000            Clst    C__SecurityId      512     100   80.47   88,968      78   99.91   99.99  93.75  100.00
                                             NC1     U__SecurityId   22,048      22    99.9                            99.69                  99.90  100.00
TestBase_DPL       DPL  2,105,177    0  309  Heap                                            105,768   3,056   97.11  100.00  94.19
                                             NC1     U__SecurityId   51,672     230   99.55                            26.02                   5.25   90.63
                                             NC2     UP_Name        133,868      40   99.97                            30.74                  24.91   92.45
TestBase_DRL       DRL    100,000    0    0  Heap                                              4,896      16   99.67  100.00  94.17
                                             NC1     UP_SecurityId    1,326      16   98.79                           100.00                 100.00  100.00
                                             NC2     U__Name          3,984      30   99.25                            99.65                  99.88    0.05
   Statistic            Requested For (DataStructure)   Returns
1  Unused Space/Index   Clustered Index (B-Tree) 3      Unused pages in the B-Tree portion of the CI
                        Nonclustered Index              Unused pages in the NCI
• The RESERVED_PAGES() function returns the number of Pages reserved for the DataStructure. If the partitionid is not supplied, all Partitions in
the DataStructure are summarised. Multiplying this value by @@PAGESIZE returns bytes, which can then be divided into kilobytes or megabytes.
• Space for each DataStructure is allocated on an Extent basis (eight Pages); the Extent cannot be used by other DataStructures. Thus it is reserved.
• The value returned is, of course, whole Pages.
• The DATA_PAGES() function returns the number of Pages in the DataStructure that contain data. If the partitionid is not supplied, all Partitions in
the DataStructure are summarised.
• Subtracting DATA_PAGES() from RESERVED_PAGES() yields unused Pages.
• Dividing DATA_PAGES() by RESERVED_PAGES() yields the fraction used; multiplying by 100 yields the percentage used.
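The arithmetic in these points can be sketched outside the server. The following is a minimal Python illustration, not the query itself: the function name space_metrics is hypothetical, the page counts stand in for the values that RESERVED_PAGES() and DATA_PAGES() would return, and a 2KB page size is assumed (consistent with 256 Pages being 512KB).

```python
# Sketch of the arithmetic only; in practice the page counts would come
# from RESERVED_PAGES() and DATA_PAGES() in the server.
PAGE_SIZE = 2048  # assumption: 2KB Pages (@@PAGESIZE)


def space_metrics(reserved_pages, data_pages, page_size=PAGE_SIZE):
    """Return (reserved_kb, unused_kb, used_pct) for one DataStructure."""
    unused_pages = reserved_pages - data_pages
    reserved_kb = reserved_pages * page_size // 1024
    unused_kb = unused_pages * page_size // 1024
    used_pct = round(100.0 * data_pages / reserved_pages, 2)
    return reserved_kb, unused_kb, used_pct


# B-Tree portion of UC_SecurityId in the example report:
# 508 KB reserved is 254 Pages; 96 KB unused is 48 Pages; 206 Pages are used.
print(space_metrics(254, 206))
```

With those page counts the sketch reproduces the Idx_KB, Unused and Used_% figures shown for UC_SecurityId in the example report.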
1. For DOL tables, on the physical plane, a Heap DataStructure always exists. Additionally, a separate Placement Index (falsely named "clustered")
DataStructure may exist. Such DataStructures are quite different to the single Clustered Index DataStructure. This is reflected in the catalogue, and is
easily confirmed in any report, such as the example.
2. The information in the example reports, and much more, is provided in our HelpIndex/HelpPartition utilities.
3. The Clustered Index DataStructure has both B-Tree and Data components: the Pages reserved and the Pages used can be obtained for the B-Tree
portion and the Data portion of the Clustered Index, separately.
7.3 Determination II DerivedStat
Second, we will examine the Derived Statistics provided by Sybase that relate to Level II Fragmentation of the Logical DataStructures,
again summarising the underlying Physical DataStructures (Partitions) to the logical level. A simple query from sysindexes, which identifies each
Logical DataStructure, is required 1 2.
Table              Lck        Row  Fwd  Del  Struct  IndexName       Idx_KB  Unused  Used_%  Data_KB  Unused  Used_%    LGIO   SPUT    DPCR    IPCR    DRCR
TestBase_APL       APL  2,000,010            Clst    UC_SecurityId      508      96    81.1   89,020     124   99.86   99.96  93.74   99.99
                                             NC1     U__Name         75,720      38   99.95                            98.92                  99.64   81.85
TestBase_APL_Heap  APL     80,000            Heap                                              3,660     100   97.27   99.62  93.63   99.87
                                             NC2     U__Name          3,056      28   99.08                            99.08                  99.69   81.68
TestBase_APL_Loc   APL  2,000,000            Clst    C__SecurityId      512     100   80.47   88,968      78   99.91   99.99  93.75  100.00
                                             NC1     U__SecurityId   22,048      22    99.9                            99.69                  99.90  100.00
TestBase_DPL       DPL  2,105,177    0  309  Heap                                            105,768   3,056   97.11  100.00  94.19
                                             NC1     U__SecurityId   51,672     230   99.55                            26.02                   5.25   90.63
                                             NC2     UP_Name        133,868      40   99.97                            30.74                  24.91   92.45
TestBase_DRL       DRL    100,000    0    0  Heap                                              4,896      16   99.67  100.00  94.17
                                             NC1     UP_SecurityId    1,326      16   98.79                           100.00                 100.00  100.00
                                             NC2     U__Name          3,984      30   99.25                            99.65                  99.88    0.05
   Statistic                       Requested For (DataStructure)   Returns                                               Meaningless & Confusing 4
4  SPUT Data Space Utilisation     Heap                            Density of data rows per data page
                                   Clustered Index                 Density of data rows per data page
                                   Nonclustered Index 5                                                                  Does not apply
5  DPCR Data Page Cluster Ratio    Heap/APL                        Density of data per page in the Heap, via PageChain
                                   Heap/DOL 6                                                                            Does not apply
                                   Clustered Index                 Density of data per page in CI order
                                   Nonclustered Index 7                                                                  Does not apply
6  IPCR Index Page Cluster Ratio   Heap 8                                                                                Does not apply
                                   Clustered Index 9                                                                     Does not apply
                                   Nonclustered Index              Density of index pages in NCI order
4. Display of meaningless figures causes great confusion, and invites comparison with meaningful figures, eg. DPCR for a DOL Heap (fixed 100%,
meaningless) cannot be related to or be compared with DPCR for an APL Heap (meaningful) which can be addressed, in order to achieve close to
100%. Administrative time is wasted in correlating such figures and trying to make sense of them; decisions that may be made on the basis of such
confusion are consequently irrelevant and meaningless. It is therefore better to avoid displaying meaningless figures, and to focus on the meaningful
figures alone.
5. Data Space Utilisation: Data is contained in either the Heap or the Clustered Index only, therefore SPUT applies to them alone; the figure for the
NCI (always 0%) is meaningless.
6. Data Page Cluster Ratio: The DOL Heap does not have a PageChain; data page access is via the OAM only; the figure (always 100%) is
meaningless (space may well be poorly utilised); use LGIO or SPUT instead. It is not comparable with the DPCR of the APL Heap or CI.
7. DPCR is relevant for fetching data pages, which reside in the Heap or the Clustered Index only. It does not apply to the Nonclustered Index, since it is
used to access data rows; data pages are never fetched via that structure. The Nonclustered Index (including the Placement Index) does not support
Range Queries, only the Clustered Index does, and there it does fetch pages.
8. Index Page Cluster Ratio is relevant for fetching index pages; it applies to the Nonclustered Index. There are no index pages in the Heap; the figure
(always 0%) is meaningless; refer to IPCR of the relevant NCI.
9. Index pages in the Clustered Index are not provided separately; the figure (always 0%) is meaningless; use DPCR instead.
10. Data Row Cluster Ratio is relevant for fetching data rows; it applies to the Nonclustered Index, since it is used to fetch data rows. It does not apply
to the Heap since access to it is for pages, via the PageChain (APL) or the OAM (DOL). The figure (always 100%) is meaningless: for APL, use
DPCR instead; otherwise, refer to DRCR of the relevant Nonclustered Index.
11. DRCR does not apply to the Clustered Index. Since the data rows in the Clustered Index are maintained in index order, by definition the DRCR is
100%. The figure is meaningless: for APL, use DPCR instead; for DOL, there is no Clustered Index, refer to DRCR of the relevant Nonclustered Index.
12. The function does not provide statistics for the Text/Image chain.
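The applicability rules in the table and notes above can be expressed as a small lookup, so that a report omits the meaningless figures. This is a minimal Python sketch: the names MEANINGFUL and meaningful_only are hypothetical, and treating LGIO as applicable to every DataStructure is an assumption based on the example report, not a statement from the notes.

```python
# Which derived statistics are meaningful for which DataStructure,
# transcribed from the table and notes above. LGIO for all types is an
# assumption taken from the example report, which shows it on every row.
MEANINGFUL = {
    "SPUT": {"Heap/APL", "Heap/DOL", "Clustered Index"},
    "DPCR": {"Heap/APL", "Clustered Index"},
    "IPCR": {"Nonclustered Index"},
    "DRCR": {"Nonclustered Index"},
    "LGIO": {"Heap/APL", "Heap/DOL", "Clustered Index", "Nonclustered Index"},
}


def meaningful_only(struct_type, stats):
    """Drop the figures that do not apply to this type of DataStructure."""
    return {name: value for name, value in stats.items()
            if struct_type in MEANINGFUL.get(name, set())}


# Raw figures for a DOL Heap: LGIO and SPUT as in the TestBase_DPL row,
# plus the fixed values (DPCR 100%, IPCR 0%, DRCR 100%) that the notes
# describe as meaningless.
raw = {"LGIO": 100.00, "SPUT": 94.19, "DPCR": 100.0, "IPCR": 0.0, "DRCR": 100.0}
print(meaningful_only("Heap/DOL", raw))
```

Only LGIO and SPUT survive for the DOL Heap, which matches the cells that are populated for TestBase_DPL in the example report.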
7.4 Determination III
Third, we will examine the Forwarded and Deleted row counts that relate to Level III Fragmentation of the Logical DataStructures,
which occur in DPL/DRL lockschemes only. This applies to the Heap, and is in addition to, not instead of, LGIO and SPUT (which are explained in
[7.3]). Again, the underlying Physical DataStructures (Partitions) are summarised to the logical level. A simple query from sysindexes, which identifies
each Logical DataStructure, together with systabstats.forwrowcnt & delrowcnt, is required 1 2.
Table              Lck        Row  Fwd  Del  Struct  IndexName       Idx_KB  Unused  Used_%  Data_KB  Unused  Used_%    LGIO   SPUT    DPCR    IPCR    DRCR
TestBase_APL       APL  2,000,010            Clst    UC_SecurityId      508      96    81.1   89,020     124   99.86   99.96  93.74   99.99
                                             NC1     U__Name         75,720      38   99.95                            98.92                  99.64   81.85
TestBase_APL_Heap  APL     80,000            Heap                                              3,660     100   97.27   99.62  93.63   99.87
                                             NC2     U__Name          3,056      28   99.08                            99.08                  99.69   81.68
TestBase_APL_Loc   APL  2,000,000            Clst    C__SecurityId      512     100   80.47   88,968      78   99.91   99.99  93.75  100.00
                                             NC1     U__SecurityId   22,048      22    99.9                            99.69                  99.90  100.00
TestBase_DPL       DPL  2,105,177    0  309  Heap                                            105,768   3,056   97.11  100.00  94.19
                                             NC1     U__SecurityId   51,672     230   99.55                            26.02                   5.25   90.63
                                             NC2     UP_Name        133,868      40   99.97                            30.74                  24.91   92.45
TestBase_DRL       DRL    100,000    0    0  Heap                                              4,896      16   99.67  100.00  94.17
                                             NC1     UP_SecurityId    1,326      16   98.79                           100.00                 100.00  100.00
                                             NC2     U__Name          3,984      30   99.25                            99.65                  99.88    0.05
   Statistic     Requested For (DataStructure)   Returns
8  Forward 13    DOL Heap                        Variable length rows that have been transferred to another location
9  Delete 14     DOL Heap                        Rows that are marked for deletion
• systabstats contains one row for each Physical DataStructure, which means the columns must be summed to produce a Logical level report.
• Execute sp_flushstats before querying the table.
• Forwards and Deletes apply to the DOL Heap only.
• DOL tables always have a Heap, wherein the row resides. The Heap is Static RowId based. The space allocated for Forwarded rows (which
consume the space of two rows) and Deleted rows (which consume the space of one row), cannot be re-used for interspersed INSERTs.
• Since Forwards and Deletes do not apply to APL tables (row expansion is performed in-place and deletion is immediate), the relevant cells are
empty in the example report.
• Space can be reclaimed via REORG or DROP/CREATE "CLUSTERED" INDEX (there is no Clustered index for DOL tables, but the syntax is
required).
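The summation to the Logical level, and the row slots held by Forwards and Deletes, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names are hypothetical, and the per-Partition figures are the Fwd and Del values shown for the TestBase_DPL Heap in the Partition-level example report of section [7.6].

```python
# Per-Partition (Fwd, Del) counts for the TestBase_DPL Heap, as shown in
# the Partition-level example report (systabstats.forwrowcnt, delrowcnt).
partitions = [(0, 94), (0, 49), (0, 90), (0, 76)]


def logical_counts(rows):
    """Sum the per-Partition Forward and Delete counts to the Logical level."""
    forwards = sum(fwd for fwd, _ in rows)
    deletes = sum(dele for _, dele in rows)
    return forwards, deletes


def slots_consumed(forwards, deletes):
    """Row slots consumed: two per Forwarded row, one per Deleted row."""
    return 2 * forwards + deletes


fwd, dele = logical_counts(partitions)
print(fwd, dele, slots_consumed(fwd, dele))
```

The summed Deletes (309) agree with the Del figure for TestBase_DPL in the Logical-level report.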
7.5 Evaluation
a. The three sets of metrics (Unused Space; Derived Statistics; Forwards & Deletes) regarding Fragmentation of a DataStructure must be taken
together; any single metric should not be evaluated alone.
b. Similarly, all the DataStructures that belong to a table should be evaluated together. This should be done in the context of the actual usage: certain
queries require single-row data (via an index); covered queries require access across an entire index; yet others would require table scans.
Knowledge of how the data is accessed, and the DataStructures that are used to support that access, is essential to relevant administration.
c. In addition, the actual speed of the DataStructures belonging to the relevant tables must be monitored: timing records (for either a controlled test or
an actual production sample at certain times of day, ensuring the same configuration and cache settings) must be kept, so that they can be compared
before and after de-fragmentation operations.
• The value of any particular de-fragmentation operation must be confirmed: there is no point in performing operations that do not provide a benefit.
• The length of time between de-fragmentation operations, when speed is regained, and the point where the DataStructure has deteriorated enough
to warrant the operation being repeated, should be recorded. If Level I Fragmentation is addressed, the frequency of such operations is
substantially reduced.
d. Likewise, sysmon reports covering the period of the day should be maintained, or MDA data should be captured at relevant intervals. This is very
important because it will allow you to tune the structures at an overall level (rather than on a DataStructure basis).
• The most important indicator of Fragmentation is that the Asynchronous Pre-Fetch capability that is built into the server, and the Large I/O
resources that have been configured, are not used. Denying these facilities cripples the speed of Sybase.
13. For each Forwarded row, two row 'slots' are consumed: the first for the original location, the address of which is fixed, and cannot be moved; and the
second for the forwarded location, which contains the expanded data row.
14. Deletes are not physically removed from DOL Heaps until REORG is executed.
7.6 Determination Partition
The above reports view the Logical DataStructures, and that is quite adequate for initial inspection, before further inspection is
warranted. It is the end point for non-partitioned DataStructures. For Partitioned DataStructures 15, the Physical DataStructure must be inspected. The
determination of Level II & III Fragmentation is only slightly more complex, it requires a simple query from syspartitions, which identifies Physical
DataStructures, and systabstats.forwrowcnt & delrowcnt 1 2.
Table              Lck  Struct  IndexName      Partition        Row  Fwd  Del   Idx_KB  Unused  Used_%  Data_KB  Unused  Used_%    LGIO   SPUT    DPCR    IPCR    DRCR
TestBase_APL       APL  Clst    UC_SecurityId  [1]          496,821                128      24   81.25   22,126      42   99.81   99.94  93.74   99.98
                                               [2]          496,195                126      24   80.95   22,080      26   99.88  100.00  93.75  100.00
                                               [3]          496,091                128      26   79.69   22,080      30   99.86  100.00  93.74  100.00
                                               [4]          510,903                126      22   82.54   22,734      26   99.89  100.00  93.75  100.00
                        NC1     U__Name                                         75,720      38   99.95                            98.92                  99.64   81.85
TestBase_APL_Heap  APL  Heap                   [1]           20,891                                         960      30   96.88  100.00  93.60  100.00
                                               [2]           20,245                                         928      28   96.98  100.00  93.73  100.00
                                               [3]           20,007                                         910      20   97.80  100.00  93.67  100.00
                                               [4]           18,857                                         862      22   97.45  100.00  93.54  100.00
                        NC2     U__Name                                          3,056      28   99.08                            99.08                  99.69   81.68
TestBase_APL_Loc   APL  Clst    C__SecurityId  data_1       494,325                128      26   79.69   21,998      28   99.87  100.00  93.75  100.00
                                               data_2       493,200                128      26   79.69   21,934      14   99.94  100.00  93.75  100.00
                                               data_3       493,920                128      26   79.69   21,966      14   99.94  100.00  93.75  100.00
                                               data_4       518,555                128      22   82.81   23,070      22   99.90  100.00  93.75  100.00
                        NC1     U__SecurityId                                   22,048      22   99.90                            99.69                  99.90  100.00
TestBase_DPL       DPL  Heap                   [1]          571,980    0   94                            28,748     840   97.08  100.00  94.18
                                               [2]          494,252    0   49                            24,998     884   96.46  100.00  94.19
                                               [3]          508,744    0   90                            25,540     718   97.19  100.00  94.19
                                               [4]          530,201    0   76                            26,482     614   97.68  100.00  94.19
                        NC1     U__SecurityId                                   51,672     230   99.55                            26.02                   5.25   90.63
                        NC2     UP_Name                                        133,868      40   99.97                            30.74                  24.91   92.45
• The columns have been re-arranged to clarify the DataStructure hierarchy and to make sense. The various row counts, space usage, and derived
statistics are shown at the Partition (Physical) level, where it is actually located.
• The Heap and the Text/Image Chain are not named. Where the Partition is not explicitly named, an ordinal number is used to identify it (rather than the
default Partition name, which is made up from the long and unusable partitionid).
• This example report lists Partitioned tables. It shows all DataStructures relating to each Partitioned table, in one place, in order to avoid having to
examine two reports.
• TestBase_DRL is not Partitioned, thus it is absent from this report.
   Statistic                      Requested For (Partition)   Returns                                               Meaningless & Confusing 4
1  Unused Space/Index             Clustered Index (B-Tree)    Unused pages in the B-Tree portion of the CI
4  SPUT Data Space Utilisation    Heap                        Density of data rows per data page
                                  Clustered Index             Density of data rows per data page
5  DPCR Data Page Cluster Ratio   Heap/APL                    Density of data per page in the Heap, via PageChain
                                  Heap/DOL 6                                                                        Does not apply
                                  Clustered Index             Density of data per page in CI order
• syspartitions and systabstats each contains one row for each Physical DataStructure (Partition).
• Execute sp_flushstats before querying the tables.
• Only the DataStructure that holds data rows, either the Heap or the Clustered Index, is Partitioned; the Nonclustered Index and the Text/Image
Chain are not Partitioned.
15. Partitioning (if implemented correctly at all resource levels) provides massively increased performance, improved concurrency (if OLTP Standards
are implemented), and substantially reduces maintenance and de-fragmentation windows, because Partitions can be administered individually, or on a
needs basis.
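The relation between the Partition-level and Logical-level reports can be checked by rolling the Physical figures up. The sketch below, in Python, uses the four Partitions of the TestBase_APL Clustered Index from the example report; the function name roll_up is hypothetical. Note that the percentage is recomputed from the summed space, not averaged across Partitions.

```python
# (Row, Data_KB, Unused_KB) for the four Partitions of the TestBase_APL
# Clustered Index, taken from the Partition-level example report.
partitions = [
    (496_821, 22_126, 42),
    (496_195, 22_080, 26),
    (496_091, 22_080, 30),
    (510_903, 22_734, 26),
]


def roll_up(rows):
    """Summarise Physical DataStructures (Partitions) to the Logical level."""
    total_rows = sum(r for r, _, _ in rows)
    data_kb = sum(kb for _, kb, _ in rows)
    unused_kb = sum(u for _, _, u in rows)
    # Recompute the percentage from the sums; do not average percentages.
    used_pct = round(100.0 * (data_kb - unused_kb) / data_kb, 2)
    return total_rows, data_kb, unused_kb, used_pct


print(roll_up(partitions))
```

The result agrees with the Row, Data_KB, Unused and Used_% figures shown for TestBase_APL in the Logical-level report.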
8 I Allocation Unit
This part of the document identifies Level I Fragmentation: AllocationUnits within the Database (Allocations) and Extents within
AllocationUnits. It is provided in three sections:
• AllocationUnit basics
• Why Drop/Create does not return Asynch Pre-Fetch and Large I/O, and
• Prevention of Level I fragmentation, the use of Segments.
8.1 Fresh
AllocUnit 32 Extents, 256 Pages, 512KB
Extents A 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
This shows the result of loading a single DataStructure into an empty AllocationUnit, and creating the Clustered Index, with SORTED_DATA if the CI was
just dropped. The Extents are contiguous within the AllocationUnit; Asynch Pre-Fetch and Large I/O are fully operational. Even if the order was not
sequential, and the PageChain was not linear, these facilities remain fully operational; the Look-Ahead set is not scaled down.
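The AllocationUnit geometry stated above (32 Extents, 256 Pages, 512KB) can be sketched in a few lines of Python. The function name locate is hypothetical, the 2KB page size is an assumption implied by those figures, and identifying an AllocationUnit or Extent by its first page number is an assumption drawn from the labels in the illustrations (AU512, E520, and so on).

```python
PAGES_PER_EXTENT = 8            # from the text: an Extent is eight Pages
EXTENTS_PER_ALLOC_UNIT = 32     # from the text: 32 Extents per AllocationUnit
PAGES_PER_ALLOC_UNIT = PAGES_PER_EXTENT * EXTENTS_PER_ALLOC_UNIT
PAGE_KB = 2                     # assumption: 2KB Pages


def locate(page_no):
    """Return (first page of the AllocationUnit, first page of the Extent)."""
    alloc_unit = page_no - page_no % PAGES_PER_ALLOC_UNIT
    extent = page_no - page_no % PAGES_PER_EXTENT
    return alloc_unit, extent


print(PAGES_PER_ALLOC_UNIT, PAGES_PER_ALLOC_UNIT * PAGE_KB)
print(locate(523))
```

Under these assumptions, page 523 falls in the AllocationUnit starting at page 512 and the Extent starting at page 520.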
8.2 Fragmented
Extents A
Where Segments are not understood and used, as in most sites, the reality is somewhat different. Since the Extents of up to 32 DataStructures (physical
objects) can be located in an AllocationUnit, and all tables were loaded by concurrent INSERTS, the AllocationUnits each end up with Extents belonging
to 32 different DataStructures. The Extents are fragmented within the AllocationUnits, and the AllocationUnits are fragmented across all Devices.
• Where 128 DataStructures are loaded, they are all fragmented across four AllocationUnits, etc.
• The INSERTS to all tables contend for the few currently active AllocationUnits, creating an AllocationUnit Hotspot.
• Further, the INSERTS to each table contend with its own Nonclustered indices: if a nominal table has 3 Nonclustered indices, that would be 32 tables
with their NCIs, resulting in 128 DataStructures, across four AllocationUnits.
ASE correctly identifies that Asynch Pre-Fetch & Large I/O (multiple Extents, up to an entire AllocationUnit, at Level I) is not worth attempting. In such
circumstances, drop/create Clustered Index, while de-fragmenting the DataStructure within itself (Levels II & III), does nothing to improve the
established fragmentation at the AllocationUnit level (I): once it is set, it is set for life (refer next page), until Segments are used along with fresh
Allocation Units.
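The counts quoted in this section follow from simple arithmetic, sketched here in Python. The function names are hypothetical; the figures (32 Extents per AllocationUnit, a nominal 3 Nonclustered indices per table) are those given above.

```python
import math

EXTENTS_PER_ALLOC_UNIT = 32  # from section 8.1


def data_structures(tables, ncis_per_table):
    """Each table contributes its data DataStructure plus its NCIs."""
    return tables * (1 + ncis_per_table)


def alloc_units_spanned(structures):
    """AllocationUnits needed for every DataStructure to hold one Extent."""
    return math.ceil(structures / EXTENTS_PER_ALLOC_UNIT)


count = data_structures(32, 3)
print(count, alloc_units_spanned(count))
```

This reproduces the case described above: 32 tables with 3 Nonclustered indices each yield 128 DataStructures, interleaved across four AllocationUnits.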
An Object (physical term, as in ObjectAllocationMap; and which is unfortunately different to OBJECT_ID(), etc., which is a logical term) is a
DataStructure, one of:
• Clustered Index (APL Only)
• Heap (DOL: always, APL: only when there is no CI)
• the DOL Heap and APL Heap are very different
• Nonclustered Index (DOL Placement Index is NCI)
• Text/Image Chain

• The web is full of mis-information, and shallow information.
• Single-vendor sites are censored, and exclude robust discussion of technical issues related to their offerings; they have their commercial agenda.
• There is no substitute for actual experience, or for diligently verifying that you have actually accomplished what you set out to.
• Fragmentation at every level shown here, is easy to identify.
• The success, and ease of correction, depends on your skills and understanding of this information: this is published free to assist you in that regard.
9 I Drop-Create
9.1 Common De-Fragmentation Issue
This chapter discusses some of the issues relevant to typical de-fragmentation exercises, and the limitations of DROP/CREATE CLUSTERED INDEX.
Many DBAs de-fragment their DataStructures by performing the full complement of the three steps identified here, and are puzzled: while the table is
significantly faster, Asynch Pre-Fetch and Large I/O are not returned. The DataStructure concerned is either a Clustered Index or a DOL Heap; the before
image is illustrated in [ 8.3 ].
The data has been bcped-out, and the table has been truncated, or the table is dropped and recreated. As long as Segments are not used to place the
table on different Devices, or separate groups of tables, this sequence applies.
When the data is bcped-in, it is placed in the available Extents, most likely the recently evacuated ones (assuming unload/load is performed when the
database is not in use). Certainly, the DataStructure is de-fragmented within its own Extents and Pages (Levels II & III). However, if proceeding with
one or a few DataStructures at a time, the Extents de-allocated will be re-used; they were fragmented at Level I before, and they remain so. Asynch
Pre-Fetch & Large I/O (multiple Extents, up to an entire AllocationUnit, at Level I) is still not possible. Although advised by many Sybase identities, this is a
common mistake; at any rate, its effect is temporary, and it needs to be repeated.
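The reason that unloading and reloading a single table leaves Level I unchanged can be shown with a toy model. The Python sketch below is an illustration only, not the server's allocation algorithm: it assumes the reload takes the first free Extents (the "most likely the recently evacuated ones" case described above), and the names reload, T1, T2 and T3 are hypothetical.

```python
def reload(extent_owners, name):
    """Toy model: free every Extent owned by `name`, then re-allocate the
    same number of Extents, first-free. Returns the new ownership list."""
    remaining = extent_owners.count(name)
    owners = [None if owner == name else owner for owner in extent_owners]
    for i, owner in enumerate(owners):
        if owner is None and remaining:
            owners[i] = name
            remaining -= 1
    return owners


# Extents of three tables interleaved within the AllocationUnits.
before = ["T1", "T2", "T3", "T1", "T2", "T3", "T1", "T2", "T3"]
after = reload(before, "T1")
print(after == before)
```

In this model the reloaded table lands in exactly the Extents it vacated, so its Extents remain interleaved with those of the other tables.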
If SORTED_DATA is used, which does not re-write the data Pages, the Extents remain in their location.
AU5120 A 1 2 3 4 5 6
The distilled requirement, is simply to create the Clustered Index without the SORTED_DATA option; this re-writes the data Pages to a new location.
Which makes the bcp-out/bcp-in unnecessary. However, the original DataStructure space, which is released at the end of the process, will be used for
whichever Clustered Index is created next, as shown in section [ 9.6 ].
bcp-out/bcp-in is effective only when the entire database, or at least a large group of tables, is de-fragmented together. Otherwise, a new location
can be specified by creating a new Device and identifying a new Segment on it.
9.5 Drop, Create Placement Index
[Illustration: AU5120, Extents A P 1 2 3 4 5 6 7 8 P 9 10 P, with the ObjectAllocMap of the Placement Index (P) and of the Heap (H)]
For DOL tables containing more than a few Extents, even immediately following a careful de-fragmentation exercise (DROP/CREATE "CLUSTERED" INDEX
in fresh AllocationUnits), although the Heap is initially contiguous, since the Heap and Placement Index are two separate DataStructures, except for
the first few Pages, the index and data Pages are substantially removed from each other.
AU5120 A
The next Clustered Index created takes up the fragmented Extents which were vacated by the previous Clustered Index (green) when it was re-written to
a new location.
There really is no substitute for Segments.
DPL/DRL Lockscheme
• For DOL tables, once the Pages and Extents in the Heap are reasonably
full, unless space is reserved for interspersed INSERTs and row
expansion, it is not possible for rows to be placed "near" each other (as
intended by the Placement Index); logically sequential rows or Pages
could be hundreds of megabytes apart.
• Further, the index Pages in Placement Index and the related data Pages
in the Heap could be hundreds of megabytes apart (while remaining "on
the same Segment", default or otherwise).
10 I Segment
10.1 Normal Growth
Refer to section [ 2.1 ] for introduction to Segments; this chapter discusses the value of Segments in reducing or eliminating Fragmentation.
The use of Segments allows groups of tables to be stored together, and thus separated from competing table groups, on discrete Devices. This shows the
AllocationUnits of:
• 6 Segments Data1 through Data6 (table groups, base colours) used for the Clustered Indices of 18 tables (distinct shades)
• for the purpose of explanation, the Devices may well be named Data1 through Data6 as well
• 2 Segments NC1 and NC2, for all their Nonclustered Indices (an arbitrary 3 Nonclustered Indices A, B, C, per table is shown).
ObjectAllocMap (CI)
Data1 AU0 A 1 1 1 2 3 2 2 4 5 3 3 6 7 4 4 8 9 5 5 10 11 6 6 12 13 7 7 14 15 8 8 O O O ▶AU0
NC1 AU1792 A A B C A B C A B C A B C A B C A B C A B C A B C A B C
NC2 AU2048 A A B C A B C A B C A B C A B C A B C A B C A B C A B C
Where Segments are not used, all data is placed in the default Segment. Since all Objects are loaded via concurrent INSERTS, the Extents
are fragmented within the AllocationUnits, and the AllocationUnits are fragmented across all Devices. That case, unfortunately quite
common, is illustrated in sections [8] and [9]. The illustration above shows exactly the same quantity of DataStructures and Extents that are
shown in those sections; the numbers continue to identify Extent number within the DataStructure. The above illustrates the result of all
tables being evenly, and concurrently, INSERTED into.
The use of Segments provides three major advantages:
1. Reduction of fragmentation, due to more Extents belonging to fewer DataStructures being placed on each Allocation Unit
• thus Level I de-fragmentation operations are reduced, if not eliminated.
• Asynch Pre-Fetch & Large I/O (multiple Extents, up to an entire AllocationUnit, at Level I) is now reasonably possible, it is worthy
of consideration to the Optimiser.
2. Substantially increased performance, due to:
• enhanced concurrent INSERT speed, for several reasons, primarily because:
• the tables required in each transaction are separated from each other, on separate Segments, and
• Nonclustered Indices are separated from their data (Clustered Index or DOL Heap), on separate Segments
• onto many Device queues.
3. The absence of Segments results in a few current Allocation Unit Hotspots, on one (the current) Device, despite many Devices being
available. Such hotspots are eliminated.
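The reduction in competing DataStructures can be quantified for the layout described in this section. The following Python sketch assumes the tables and indices are spread evenly over the Segments; the function name structures_per_segment is hypothetical.

```python
def structures_per_segment(tables, ncis_per_table, data_segments, nc_segments):
    """DataStructures sharing each Segment, assuming an even spread."""
    data_each = tables / data_segments
    nci_each = tables * ncis_per_table / nc_segments
    return data_each, nci_each


# Layout above: 18 tables with 3 NCIs each, Segments Data1 through Data6
# for the Clustered Indices, and Segments NC1 and NC2 for the NCIs.
print(structures_per_segment(18, 3, 6, 2))

# Without Segments, every DataStructure shares the default Segment.
print(18 * (1 + 3))
```

Each Data Segment carries three Clustered Indices and each NC Segment 27 Nonclustered Indices, against 72 DataStructures competing for the same AllocationUnits on the default Segment.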
NC1 AU1792 A A B C A B C A B C A B C A B C A B C A B C A B C A B C
NC2 AU2048 A A B C A B C A B C A B C A B C A B C A B C A B C A B C
This shows the same group, eventually fragmented at Level I under interspersed INSERT/DELETE activity (UPDATE only causes Row migration or Page
splits when the columns are variable), which would cause PageSplits, etc; the resulting fragmentation is depicted. Where even simple Segment plans are
used, fragmentation can be substantially reduced; where carefully considered Segment Plans are used, Level I de-fragmentation operations can be avoided
altogether. Even though fragmented, Asynch Pre-Fetch & Large I/O are fully enabled (although slightly less efficient than when not fragmented).
Note also that since the AllocationUnits are laid out initially as per [10.1], the structures are essentially immune to becoming fragmented. Therefore what
is shown here is the result of extreme interspersed INSERT/DELETE activity, and over a long period.
The effect of de-fragmenting single tables (ie. at the DataStructure level, as and when required, via DROP/CREATE CLUSTERED INDEX, to correct Level
II fragmentation as illustrated above, without requiring unload/reload) produces [10.1] for the subject DataStructure. Since each new DataStructure takes
up the Extents of the previous DataStructure, and the latter was unfragmented for the most part, the sequence of Extents is corrected. However, that is
not completely contiguous, as shown next.
10.3 Fresh
ObjectAllocMap (CI)
Data1 AU0 A 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 O O O ▶AU0
NC1 AU1792 A A B C A B C A B C A B C A B C A B C A B C A B C A B C
NC2 AU2048 A A B C A B C A B C A B C A B C A B C A B C A B C A B C
The effect of de-fragmenting most or all the tables in each Segment is illustrated here. Of course, each Segment can be de-fragmented as and when
necessary; all Segments do not need to be de-fragmented at the same time. Where Segments are not used, none of this is possible.
12 II Page Chain
This part of the document identifies Level II Fragmentation: Pages within Extents, and shows the effect for the different
LockSchemes. There are four aspects to this level, presented in seven sections:
• PageChain Fragmentation
• Overflow Pages
• Unused Space (Pages) per Extent, and
• Unused Space per Page.
12.1 Fresh
[Figure: two Extents of 8 Pages each, with their ObjectAllocMap entries. Clustered Index (C, AU512): E512 holds Pages 1-8 and E520 holds Pages 9-16. Heap & Placement Index (H, AU768): E768 holds Pages 1-8 and the next Extent holds Pages 9-16.]
Clustered Index:
This illustrates an unfragmented Clustered Index Leaf level PageChain, containing index Leaf plus data. It is contiguous, fresh after loading via
bcp or DROP/CREATE CLUSTERED INDEX.
• Asynch Pre-Fetch & Large I/O (multiple Extents, up to an entire AllocUnit, at Level II, and multiple Pages) are fully enabled.
Heap & Placement Index:
This shows an unfragmented DOL Heap, the data; it is contiguous because it has been freshly re-ordered via DROP/CREATE CLUSTERED INDEX. It
also shows the unfragmented Placement Index.
• Although the syntax demands "clustered", it is false; the index is in fact a Placement Index, which is a Nonclustered Index with two additional
criteria (the data is not clustered with the index); the illustration shows what initial placement does.
[Figure: a fragmented Extent. Clustered Index: E528 holds Pages 22 7 23 12 24 15 18 21. Heap: E784 holds Pages 22 7 23 12 24 15 18 21.]
Clustered Index:
• This shows a disturbed PageChain, caused by Page Splits, when full pages need to be split due to interspersed INSERTS, and no space
being available on the Page.
• This shows Pages out of sequence while remaining in the same AllocationUnit; the I/O penalty is more severe when the out-of-sequence
Pages are located in other AllocationUnits, as per [ 8.3 ].
Heap & Placement Index:
• The Heap is fragmented due to DML activity, and no space being available in the Page, standard fare for monotonically increasing indices.
• The sequence is not real, since Pages are not accessed in sequence; it merely provides a comparison to that of the Clustered Index (the real
sequence is much worse).
• To some extent that does not matter, because there is no PageChain and Range Queries are not supported. However, the overall access to the
table is slowed, and scans must use the OAM method.
12.3 Effect/Range Query & Table Scan
[Figure: fetch sequence for Pages 1 to 24 when traversing the PageChain; • marks an interrupt in the traversal]
1 2 • 3 • 4 5 6 • 7 • 8 9 10 11 • 12
13 14 • 15 • 16 17 • 18 • 19 20 • 21 • 22 • 23 • 24
Clustered Index:
• This shows the sequence in which the Pages must be fetched when traversing the PageChain, eg. for Range Queries and Table Scans,
and highlights the interrupts involved in the traversal
• Asynch Pre-Fetch & Large I/O (multiple Extents, up to an entire AllocUnit, at Level II) are prevented. Multiple Pages are hindered.
• When traversing the PageChain, 15 reads are required instead of 3.
• On a busy server, that could be up to 14 interrupts, or context switches, which are to be avoided
• PageChains that are fragmented across AllocationUnits require more of those to be read, and even more I/O
• If the Pages are aged out of the cache during this time, they must be read again, etc. (Not illustrated.)
Heap & Placement Index:
• Range Queries are based on a Clustered Index (index Leaf plus data), Relational or compound Keys, and require a PageChain; since DOL
tables cannot have a Clustered Index, the feature is not possible for them.
• Traversing the Heap, eg. Table Scans, requires navigation via the ObjectAllocationMap; to the Allocation Page; to the Extent; to the Page.
That is much slower than retrieval via a PageChain (or comparable to a heavily fragmented PageChain)
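The read counts quoted above (15 reads instead of 3, and up to 14 interrupts) can be reproduced with a small model. This is only a sketch under stated assumptions: a read is taken to be one run of physically adjacent Pages within a single Extent, and the physical page numbers in `fragmented` are hypothetical, chosen so that the runs match the fetch sequence in the illustration. The names `count_reads`, `fresh` and `fragmented` are not from the source.

```python
EXTENT_PAGES = 8  # an Extent is 8 Pages, as in the illustrations


def count_reads(chain):
    """Count the reads needed to follow a PageChain.

    `chain` lists the physical page number of each Page in logical
    (PageChain) order. A new read starts whenever the next Page is not
    physically adjacent to the previous one within the same Extent.
    """
    reads = 0
    prev = None
    for page in chain:
        adjacent = (
            prev is not None
            and page == prev + 1
            and page // EXTENT_PAGES == prev // EXTENT_PAGES
        )
        if not adjacent:
            reads += 1
        prev = page
    return reads


# Fresh: 24 Pages laid out contiguously in Extents 512, 520 and 528.
fresh = list(range(512, 536))

# Fragmented: hypothetical physical locations for logical Pages 1..24,
# chosen so that the runs match the fetch sequence in the illustration.
fragmented = [
    512, 513, 518, 515, 516, 517, 519, 520, 521, 522, 523, 530,
    524, 525, 532, 526, 527, 534, 528, 529, 535, 514, 533, 531,
]

print(count_reads(fresh))           # 3
print(count_reads(fragmented))      # 15
print(count_reads(fragmented) - 1)  # 14 interrupts between reads
```

The same 24 Pages occupy the same three Extents in both cases; only the order in which the PageChain visits them differs.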
AllPage Locked and DataPage/DataRow Locked:
This illustrates an unfragmented Nonclustered Index Leaf level PageChain, containing index Leaf entries. It is contiguous, fresh after DROP/CREATE
NONCLUSTERED INDEX (or "clustered" if it is a Placement Index).
[Figure: fetch sequence for Pages 1 to 24, identical for both LockSchemes; • marks an interrupt in the traversal]
1 2 • 3 • 4 5 6 • 7 • 8 9 10 11 • 12
13 14 • 15 • 16 17 • 18 • 19 20 • 21 • 22 • 23 • 24
This illustrates the effect of fragmentation on the PageChain of a Nonclustered index (including PI). It shows the sequence in which the Pages must be
fetched when traversing the PageChain, and highlights the interrupts involved in the traversal
• Asynch Pre-Fetch & Large I/O (multiple Extents, up to an entire AllocUnit, at Level II) are prevented. Multiple Pages are hindered.
• When traversing the PageChain, 15 reads are required instead of 3.
• On a busy server, that could be up to 14 interrupts, or context switches, which are to be avoided
• PageChains that are fragmented across AllocationUnits require more of those to be read, and even more I/O
• If the Pages are aged out of the cache, they must be read again, etc. (Not illustrated.)
Focus
In order to avoid confusion, and to maintain focus, other Levels of
fragmentation are excluded from this Level II discussion. Page level
issues such as the space usage consequences relating to DOL tables are
discussed in Level III Fragmentation. Unused Space within Extents is
discussed in [ 14 ]; Unused Space within Pages is discussed in [ 15 ].
13 II Overflow Page
[Figure: Overflow Pages. AllPage Locked, Clustered Index/Duplicate Row: Clustered Index (C, indid = 1, AU512) plus NCI, with an additional Page per duplicate Key. DataPage/DataRow Locked, Heap & Placement Index/Forward: Placement Index (P, indid = 4, AU1280) and Heap (H, indid = 0, AU768); Row Forwarded, original RowId placement unchanged, additional read on every access, Forwards at end of Heap.]
AllPage Locked:
Overflow pages occur only for a Clustered Index that is non-unique. For each CI key that is duplicated, an Overflow Page is required, which
contains a chain of duplicate rows, the single original row remaining in the contiguous CI DataStructure.
The Clustered Index DataStructure is not designed to allow duplicate keys.
• By definition, in a Relational Database, every row must be unique; APL tables are highly suited to that purpose; and thus it is not an
issue in Relational tables
• Record filing systems with IDENTITY or surrogate keys should use DOL tables, and thus it is again not an issue.
• In any case, every CI should be unique; a non-unique CI should be viewed as a serious error, not merely as additional I/O.
• For 'queue' or 'pipe' or log tables, a Heap without a CI is best. Where a CI has been chosen (eliminating a Heap), ensure that the CI is
unique.
DataPage/DataRow Locked:
DOL DataStructures do not have Overflow Pages in the sense that Sybase has not given it a name. However the concept of Forwarded Rows is
identical, and far more frequent (row expansion vs row duplication), although the overhead is greater. A technically accurate name, in the
context of existing, established names, is Overflow Pages, albeit for Forwards rather than for Duplicates.
A further difference is that the Forwarded row consumes the space of two rows, since the original location cannot be used; whereas the APL duplicate
consumes one row.
Since the Nonclustered Index (including Placement Index) and the Heap are physically separate DataStructures, and row order is not maintained,
duplicate rows are not an issue: the management of duplicate keys can be handled within the index B-tree structure. For such indices, there is one
Leaf entry (RowId) for each key, whether duplicated or not; the duplicate rows are merely two Index Leaf entries; two different RowIds.
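The duplicate-key case for a non-unique Clustered Index can be sketched as a simple count. This is an illustrative model only: it takes the statement that each duplicated CI key requires an Overflow Page at face value, and the function name `duplicate_chains` and the sample keys are hypothetical.

```python
from collections import Counter


def duplicate_chains(ci_keys):
    """Return each duplicated key with the number of extra rows it chains.

    Simplified model from the text: the first row for a key stays in the
    main CI DataStructure; every further row with the same key goes onto
    that key's Overflow Page chain.
    """
    counts = Counter(ci_keys)
    return {key: n - 1 for key, n in counts.items() if n > 1}


keys = ["A", "B", "B", "C", "D", "D", "D", "E"]
chains = duplicate_chains(keys)
print(chains)       # {'B': 1, 'D': 2}
print(len(chains))  # 2 duplicated keys, so 2 Overflow Pages in this model
```

With a unique CI the result is empty, which is the position the text argues for.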
14 II Unused Space Extent
For all DataStructures, a few empty slots in each Page (via FILLFACTOR) and a few empty Pages in each Extent (via
RESERVEPAGEGAP) are desirable, to allow for interspersed INSERTS. However, where there are more interspersed DELETES than interspersed INSERTS,
this may be more than is desired. Where there are no interspersed INSERTS, unused space is not required.
The issue relevant to unused space is, whether it was planned or not; and only the latter is a problem. Let us consider unused space that is unplanned.
Here the DataStructure that contains the data rows (Clustered Index for APL or Heap for DOL) is most relevant, and detailed below. Nonclustered
Indices do get fragmented (in the category of unused space), when there are bulk DELETES that are interspersed. However, this is easy and fast to correct
(drop and create the index). In any case, Nonclustered Indices are affected more by disturbed PageChains, than by unused Extents or Pages.
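The amount of planned unused space can be estimated with some simple arithmetic. The following is a rough sketch, not ASE's allocation logic: it assumes FILLFACTOR is the percentage of each Page filled at creation, and that a RESERVEPAGEGAP of n leaves one Page empty in every n Pages allocated. The function name `planned_layout` and all the figures are hypothetical.

```python
import math


def planned_layout(rows, rows_per_full_page, fillfactor_pct, reservepagegap):
    """Estimate Pages used when space is deliberately reserved.

    Assumptions (a simplified model, not ASE's exact allocation logic):
    - fillfactor_pct is the percentage of each Page filled at creation;
    - reservepagegap = n leaves one Page empty in every n Pages allocated,
      and values below 2 reserve nothing.
    """
    rows_per_page = max(1, rows_per_full_page * fillfactor_pct // 100)
    filled_pages = math.ceil(rows / rows_per_page)
    if reservepagegap > 1:
        empty_pages = filled_pages // (reservepagegap - 1)
    else:
        empty_pages = 0
    return rows_per_page, filled_pages, empty_pages


rows_per_page, filled, empty = planned_layout(
    rows=10_000, rows_per_full_page=50, fillfactor_pct=80, reservepagegap=8
)
print(rows_per_page)  # 40 rows per Page, leaving 10 free slots
print(filled)         # 250 filled Pages
print(empty)          # 35 Pages left empty for interspersed INSERTS
```

The reserved space is the planned kind discussed above; it is the unplanned kind that constitutes fragmentation.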
[Figure: Unused Space/Extent, Extents of 8 Pages with their ObjectAllocMap entries. Clustered Index (C, AU512): E512 holds Pages 1-7 and E520 holds Pages 8-13. Heap & Placement Index (H, AU768): E768 holds Pages 1-7 and E776 holds Pages 8-13; E1288 holds Pages 8-13.]
Clustered Index:
Both CI and NCI are shown here, obviously the effect on data Pages, and the correction thereof, is much more serious. The NCI is easy and
fast to correct.
Heap & Placement Index:
Both the Heap and Placement Index are shown here, obviously the effect on data Pages, and the correction thereof, is much more serious.
Correcting the Heap constitutes a demand to drop and create the Placement Index (unfortunately addressed via the "clustered" syntax), since the PI
defines initial placement of rows in the Heap.
14.1 Effect
Clustered Index:
• Asynch Pre-Fetch & Large I/O (multiple Pages, up to an entire Extent, at Level II), where Extents are requested, is not hindered.
The self-modulating Look-Ahead Set is simply scaled down a little, unless the ratio of empty Pages is large.
• This applies when traversing the Clustered Index, eg. for Range Queries, Covered Queries and Table Scans, and traversing the
Nonclustered Index for Covered Queries.
Heap & Placement Index:
• Asynch Pre-Fetch & Large I/O (multiple Pages, up to an entire Extent, at Level II), where Extents are requested, is somewhat
hindered. The self-modulating Look-Ahead Set is scaled down a little more than in APL.
• This applies when traversing the relevant Nonclustered Index, eg. for Covered Queries.
• Range Queries are not supported for DOL tables.
• Table Scans use the OAMPage access method.
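The OAM access path for a Heap Table Scan (ObjectAllocMap, then AllocationUnit, then Extent, then Page) can be pictured as a nested walk. This is a hypothetical in-memory sketch, not the on-disk format: the dictionary layout and the name `oam_scan` are assumptions, and the Extent contents follow the Heap illustration for this section (E768 with Pages 1-7, E776 with Pages 8-13).

```python
# Hypothetical in-memory picture of the Heap in the illustration:
# ObjectAllocMap -> AllocationUnit -> Extent -> allocated Pages.
# An Extent has 8 slots; None marks a Page that is not in use.
oam = {
    768: {  # AllocationUnit AU768
        768: [1, 2, 3, 4, 5, 6, 7, None],         # Extent E768
        776: [8, 9, 10, 11, 12, 13, None, None],  # Extent E776
    },
}


def oam_scan(oam):
    """Yield (alloc_unit, extent, page) for every Page in use."""
    for alloc_unit, extents in oam.items():
        for extent, slots in extents.items():
            for page in slots:
                if page is not None:
                    yield alloc_unit, extent, page


pages = [page for _, _, page in oam_scan(oam)]
unused = sum(
    slot is None
    for extents in oam.values()
    for slots in extents.values()
    for slot in slots
)
print(len(pages))  # 13 Pages visited
print(unused)      # 3 unused Page slots across the two Extents
```

Every Extent is visited whether or not its Pages are all in use, which is why unused Pages within Extents scale down the benefit of Large I/O.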
15 II Unused Space Page
This illustrates the result of heavy interspersed INSERTS/DELETES at the Page level for both LockSchemes; the rows in the Pages of the same pair of
Extents as in [14] above are shown.
[Figure: Unused Space/Page. AllPage Locked: E512 (Pages 1-7) and E520 (Pages 8-13). DataPage/DataRow Locked: E768 (Pages 1-7) and E776 (Pages 8-13), showing Deleted and Expanded rows, with Forwards in E792.]
AllPage Locked:
The Page is kept trim: rows are shifted upon deletion and row expansion/contraction.
DataPage/DataRow Locked:
Note that even at this level, the forwarded rows (red); forwards (dark pink); and deleted rows (dark grey) are visible, separate from unused space
(light grey). The additional space requirement is obvious. (In order to avoid confusion, Level III Fragmentation is excluded from this Level II
discussion; it is discussed separately, overleaf.)
15.1 Effect
AllPage Locked:
• Asynch Pre-Fetch & Large I/O (multiple Pages, up to an entire Extent, at Level II), where Extents are requested, is not hindered,
since the Pages are trimmed. The self-modulating Look-Ahead Set is simply scaled down a little, unless the ratio of Unused Space per page
is large.
• This applies when traversing the Clustered Index, eg. for Range Queries, Table Scans, and traversing the Nonclustered Index for
Covered Queries.
DataPage/DataRow Locked:
• Asynch Pre-Fetch & Large I/O (multiple Pages, up to an entire Extent, at Level II), where Extents are requested, is somewhat hindered, since the
Pages are not trimmed; DELETED rows are not deleted; and rows are Forwarded. The self-modulating Look-Ahead Set is scaled down a lot
more than in APL.
• This applies when traversing the relevant Nonclustered Index, eg. for Covered Queries.
16 Level II Summary
To summarise the types of fragmentation covered in Level II:
• PageChains are fragmented across Extents, or worse, across AllocationUnits.
• This prevents Asynch Pre-Fetch & Large I/O (multiple Extents and Pages at Level II).
• Such fragmentation can be greatly reduced at the highest level by implementing Segments, since it limits the physical range of DataStructures.
• It can be reduced at the DataStructure level by reserving space for expected interspersed INSERTS and row expansion. Disk space is cheap.
• Unplanned Unused Space within Extents and within Pages scales down Asynch Pre-Fetch & Large I/O.
• Planned reserved space maintains the speed of the DataStructure. Yes sir, everything in a computer system is a trade-off.
• Level II fragmentation is corrected via DROP/CREATE CLUSTERED INDEX with the appropriate FILLFACTOR.
17 III Page
Level III is a new form of fragmentation (Pages and Rows) that applies to DOL tables only. These pages illustrate the
fragmentation in their DataStructures, as a consequence of normal DML activity, step by step, and compare them with APL. Understanding the different
DataStructures and their relations is a pre-requisite.
17.2 Next Sequential Insert
Clustered Index:
• The next (new max) value of a monotonic or surrogate Key.
• Such keys are the worst candidate for a Clustered Index
[Figure: Clustered Index and NCI; New Page at End of Page Chain]
Heap & Placement Index:
• The next (new max) value of a monotonic or surrogate Key.
[Figure: Heap and NCI; New Page at End of Heap]
17.3 Interspersed Insert/Space
Clustered Index:
• A random value of a Relational (composite) Key, where there is space on the page. The rows remain ordered and distributed.
• Such keys are the best candidates for a Clustered Index
Heap & Placement Index:
• A random value of a Relational (composite) Key, where there is space on the page. The rows are not ordered; it is located "near by"
17.4 Interspersed Insert/No Space
Clustered Index:
[Figure: Clustered Index and NCI, pages p.1 and p.2; Original Page is Split; Contiguity of Page Chain is disturbed]
PageChain Fragmentation is Level II, shown here for comparison.
In terms of the CI, or logically, the split pages appear next to each other. Physically, the new page is at the end of the structure.
Heap & Placement Index:
• The page does not need to be full; if the new row causes existing RowIds to move, a new Page or Extent is used
[Figure: Heap and NCI, pages p.1 and p.2; No PageChain to disturb; New Page at End of Heap]
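The Page Split on the Clustered Index side can be modelled in a few lines. This is a toy sketch under stated assumptions, not ASE's algorithm: Pages hold at most `PAGE_CAPACITY` keys (a hypothetical figure), the list index stands for the physical position, and `next` carries the PageChain. It shows the split Pages remaining adjacent logically while the new Page lands physically at the end.

```python
PAGE_CAPACITY = 4  # hypothetical rows per Page, kept small for the example


class Page:
    def __init__(self, rows):
        self.rows = rows  # keys, kept in order within the Page
        self.next = None  # index of the next Page in the PageChain


def insert(pages, key):
    """Insert key into a chain of Pages stored in physical order in `pages`."""
    # Walk the PageChain to the Page that should hold the key.
    i = 0
    while pages[i].next is not None and key > pages[i].rows[-1]:
        i = pages[i].next
    page = pages[i]
    page.rows.append(key)
    page.rows.sort()
    if len(page.rows) > PAGE_CAPACITY:
        # Page Split: the upper half moves to a new Page, which is
        # physically at the end but logically next in the chain.
        half = len(page.rows) // 2
        new_page = Page(page.rows[half:])
        page.rows = page.rows[:half]
        new_page.next = page.next
        pages.append(new_page)
        page.next = len(pages) - 1


def chain_order(pages):
    """Return the physical positions of the Pages in PageChain order."""
    order, i = [], 0
    while i is not None:
        order.append(i)
        i = pages[i].next
    return order


pages = [Page([10, 20, 30, 40]), Page([50, 60, 70, 80])]
pages[0].next = 1

insert(pages, 25)  # interspersed INSERT into a full Page
print(chain_order(pages))                          # [0, 2, 1]
print([pages[i].rows for i in chain_order(pages)])
# [[10, 20], [25, 30, 40], [50, 60, 70, 80]]
```

The keys are still returned in order when the chain is followed, but the physical sequence 0, 2, 1 is no longer contiguous, which is the Level II PageChain fragmentation shown for comparison.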
17.5 Interspersed Delete
Clustered Index (AllPage Locked):
[Figure: Clustered Index and NCI; Rows Shifted; Pages are trimmed]
Heap & Placement Index (DataPage/DataRow Locked):
• Note the unused space; it cannot be used for new rows.
[Figure: Heap and NCI; Deletes Marked but not Removed; Pages are not trimmed]
17.6 Interspersed Update (Expand)
Clustered Index (AllPage Locked):
[Figure: Clustered Index and NCI; Rows Shifted; Pages are trimmed]
Heap & Placement Index (DataPage/DataRow Locked):
• Note the unused space; it cannot be used for new rows. Forwards consume twice the space.
[Figure: Heap and NCI; Row Forwarded; Original RowId Placement Unchanged; Additional Read on Every Access; Forwards at End of Heap]
17.7 Page Fragmentation
Clustered Index: No Page Fragmentation. Heap & Placement Index: Page Fragmentation.
Page P4          Clustered Index     Heap
RowIds           45 46 47 48 49      45 46 47 48 49
46 Deleted       45 47 48 49         45 47 48 49
47 Expanded      45 47 48 49         45 48 49
[Figure: Clustered Index and NCI against Heap; the Heap shows Forwarded Rows and Deleted Rows]
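The difference in Page behaviour between the two LockSchemes can be sketched with Page P4 from the illustration (RowIds 45 to 49, with 46 deleted and 47 expanded). This is a toy model only: the tuple markers and the function names are hypothetical, and it captures just the behaviour stated above, namely that APL shifts rows and keeps the Page trim, while DOL marks deletes without removing them and forwards an expanded row to the end of the Heap, leaving the original RowId in place.

```python
def apl_delete(page, row_id):
    """APL: the row is removed and the remaining rows are shifted up."""
    return [r for r in page if r != row_id]


def dol_delete(page, row_id):
    """DOL: the row is marked deleted; its slot stays occupied."""
    return [("deleted", r) if r == row_id else r for r in page]


def dol_expand(page, heap_end, row_id):
    """DOL: the expanded row moves to the end of the Heap; a forward
    stub keeps the original RowId in place."""
    page = [("forward", r) if r == row_id else r for r in page]
    return page, heap_end + [row_id]


page_p4 = [45, 46, 47, 48, 49]

apl = apl_delete(page_p4, 46)
print(apl)  # [45, 47, 48, 49]

dol = dol_delete(page_p4, 46)
dol, heap_end = dol_expand(dol, [], 47)
print(dol)       # [45, ('deleted', 46), ('forward', 47), 48, 49]
print(heap_end)  # [47]
```

In this model the DOL Page still carries five slots for three live rows, and reading row 47 takes a second step to the end of the Heap, which corresponds to the additional read on every access noted above.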
19 Index Type
19.1 Heap
AllPage Locked: Heap (When No Clustered Index). DataPage/DataRow Locked: Heap (Always).
[Figure: for both LockSchemes, data_segment holds the rows in chronological (INSERT) order: A F C Z E D B. sysindexes.indid = 0]
19.2 Heap plus NCI
AllPage Locked: Heap plus NCI (When No Clustered Index). DataPage/DataRow Locked: Heap plus NCI (No Placement Index).
[Figure: for both LockSchemes, NCI_segment holds the index, with levels numbered 4 down to 1; data_segment holds the rows in INSERT order: A F C Z E D B]
19.4 Clustered Index plus NCI
AllPage Locked: Clustered Index plus NCI. DataPage/DataRow Locked: Heap & Placement Index plus NCI.
[Figure: the rows are in Key order: A B C D E F Z, with index levels numbered 4 down to 1. AllPage Locked: the Clustered Index and rows are in data_segment, with two NCIs in NCI_segment. DataPage/DataRow Locked: the Placement Index and Heap rows are in data_segment, with two NCIs in NCI_segment.]