MDC Tables
A High-Level Overview
Zoran Kulina
DB2 CE Kernel Development
© 2009 IBM Corporation
Multi-Dimensional Clustering
MDC Purpose
MDC Concepts
Block
– MDC version of extent
– Consecutive set of pages on the disk
– The smallest allocation unit of an MDC table
Block index
– Automatically created
– Points to blocks of data rather than individual rows
– Cannot enforce uniqueness
– Cannot be dropped
Dimension block index
– One per dimension
– Used to access dimension data
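The difference between a row-level index and a block index can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the row ids, block ids, and the helper names build_row_index and build_block_index are hypothetical and do not reflect how DB2 stores either index.

```python
from collections import defaultdict

# Toy table rows: (row_id, block_id, nation). All ids are made up.
ROWS = [
    (0, 2, "Canada"), (1, 2, "Canada"), (2, 2, "Canada"),
    (3, 3, "Mexico"), (4, 3, "Mexico"),
    (5, 4, "Canada"),
]


def build_row_index(rows):
    """Row-level index: one entry per row (key -> row ids)."""
    index = defaultdict(list)
    for row_id, _block_id, nation in rows:
        index[nation].append(row_id)
    return dict(index)


def build_block_index(rows):
    """Block index: one entry per block (key -> sorted block ids)."""
    index = defaultdict(set)
    for _row_id, block_id, nation in rows:
        index[nation].add(block_id)
    return {key: sorted(blocks) for key, blocks in index.items()}


row_index = build_row_index(ROWS)
block_index = build_block_index(ROWS)
row_entries = sum(len(ids) for ids in row_index.values())
block_entries = sum(len(ids) for ids in block_index.values())
```

In this toy data the block index holds one entry per block while the row index holds one entry per row, which is why a block index stays small as blocks fill up.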
Block map
– Maintains usage status information for blocks (extents)
– Facilitates quick lookup of empty blocks in MDC tables
Figure: example block map
Block:   0  1  2  3  4  5  6  7  ...
Status:  X  F  U  U  U  F  U  F  ...
Legend: X = reserved; F = free (no bits set); U = in use (data assigned to a cell)
Blocks with data stored: 2 (East, 1996), 3 (North, 1996), 4 (North, 1997), 6 (South, 1999)
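The quick lookup of empty blocks can be illustrated with a small Python sketch. It uses the statuses X F U U U F U F from the block map example above; the one-character status codes and the function names find_free_block and allocate_block are assumptions made for illustration, not the on-disk format or any DB2 interface.

```python
RESERVED, FREE, IN_USE = "X", "F", "U"

# Status of blocks 0..7, copied from the block map example.
block_map = [RESERVED, FREE, IN_USE, IN_USE, IN_USE, FREE, IN_USE, FREE]


def find_free_block(bmap):
    """Return the number of the first free block, or None if there is none."""
    for block_no, status in enumerate(bmap):
        if status == FREE:
            return block_no
    return None


def allocate_block(bmap):
    """Mark the first free block as in use and return its number."""
    block_no = find_free_block(bmap)
    if block_no is None:
        raise RuntimeError("no free block; the table would have to grow")
    bmap[block_no] = IN_USE
    return block_no


first = allocate_block(block_map)
second = allocate_block(block_map)
```

The point of the sketch is that finding space only requires scanning the compact status list, never the data pages themselves.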
Dimension
– Ordered set of one or more columns (clustering keys) of the table
– Axis along which data is organized in an MDC table
– Example: dimensions for nation, color, and year
[Figure: cube of cells along the nation, colour, and year dimensions, with cells such as (1997, Canada, blue), (1997, Canada, yellow), (1998, Canada, yellow), (1997, Mexico, blue), (1997, Mexico, yellow), and (1998, Mexico, yellow)]
Slice
– Portion of the table that contains all the rows that have a specific
dimension value (e.g. nation = ‘Canada’)
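As a rough illustration, a slice can be modelled as a filter on one dimension. The rows, the DIMENSIONS tuple, and the helper slice_of below are hypothetical; in DB2 the slice would be located through the dimension block index rather than by scanning rows.

```python
# Toy rows: (year, nation, colour), following the example dimensions.
DIMENSIONS = ("year", "nation", "colour")
ROWS = [
    (1997, "Canada", "blue"),
    (1997, "Canada", "yellow"),
    (1998, "Canada", "yellow"),
    (1997, "Mexico", "blue"),
    (1997, "Mexico", "yellow"),
    (1998, "Mexico", "yellow"),
]


def slice_of(rows, dimension, value):
    """Return every row whose value on the given dimension matches."""
    position = DIMENSIONS.index(dimension)
    return [row for row in rows if row[position] == value]


canada_slice = slice_of(ROWS, "nation", "Canada")
```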
Cell
– Portion of the table that contains rows having the same unique set of
dimension values
– Intersection of slices from each dimension (e.g. all records where
year=2002, country='Canada', and color='yellow')
[Figure: cube of cells along the nation, colour, and year dimensions, with the cell for (1997, Canada, yellow) highlighted]
Each cell contains one or more blocks.
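The idea that a cell is the intersection of one slice per dimension can be checked with a short Python sketch. The row ids and the helpers slice_ids and cells are assumptions made for this example only.

```python
from collections import defaultdict

# Toy rows: (row_id, year, nation, colour). Row ids are made up.
ROWS = [
    (1, 1997, "Canada", "yellow"),
    (2, 1997, "Canada", "yellow"),
    (3, 1997, "Canada", "blue"),
    (4, 1998, "Canada", "yellow"),
    (5, 1997, "Mexico", "yellow"),
]


def slice_ids(rows, position, value):
    """Row ids of the slice where the column at `position` equals `value`."""
    return {row[0] for row in rows if row[position] == value}


def cells(rows):
    """Group row ids by their full set of dimension values."""
    grouped = defaultdict(list)
    for row_id, year, nation, colour in rows:
        grouped[(year, nation, colour)].append(row_id)
    return dict(grouped)


# A cell computed as the intersection of one slice per dimension...
cell_by_intersection = (
    slice_ids(ROWS, 1, 1997)
    & slice_ids(ROWS, 2, "Canada")
    & slice_ids(ROWS, 3, "yellow")
)
# ...matches the group of rows that share all three dimension values.
cell_by_grouping = set(cells(ROWS)[(1997, "Canada", "yellow")])
```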
MDC Syntax
ORGANIZE BY clause in CREATE TABLE
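To show where the clause sits in the statement, the following Python sketch assembles a CREATE TABLE statement with an ORGANIZE BY DIMENSIONS clause. The helper mdc_create_table, the table name sales, and the column names are hypothetical; the generated text should be checked against the DB2 CREATE TABLE documentation before use.

```python
def mdc_create_table(table, columns, dimensions):
    """Build a CREATE TABLE statement with an ORGANIZE BY DIMENSIONS clause.

    columns:    list of (name, sql_type) pairs
    dimensions: list of column names; a tuple of names forms one
                composite dimension
    """
    known = {name for name, _sql_type in columns}
    dim_parts = []
    for dim in dimensions:
        cols = (dim,) if isinstance(dim, str) else tuple(dim)
        for col in cols:
            if col not in known:
                raise ValueError(f"dimension column {col!r} is not in the table")
        # A composite dimension is written as a parenthesised column list.
        dim_parts.append(cols[0] if len(cols) == 1 else "(" + ", ".join(cols) + ")")
    column_list = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (f"CREATE TABLE {table} ({column_list}) "
            f"ORGANIZE BY DIMENSIONS ({', '.join(dim_parts)})")


ddl = mdc_create_table(
    "sales",
    [("nation", "CHAR(25)"), ("colour", "VARCHAR(10)"), ("sale_year", "INT")],
    ["nation", "colour", "sale_year"],
)
```

Here each listed column becomes its own dimension; passing a tuple such as ("nation", "colour") would instead produce a single composite dimension.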
DB2_MDC_ROLLOUT registry variable
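– 1, TRUE, ON, YES, IMMEDIATE (default): rollout enabled; row-level (RID) indexes are cleaned up immediately during the delete
– 0, FALSE, OFF, NO: rollout disabled
– DEFER: rollout enabled; RID index cleanup is deferred until after the delete commits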
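The accepted values fall into three groups, which the following Python sketch makes explicit. The function rollout_mode is hypothetical, and the case-insensitive matching and the handling of an unset variable are assumptions; the variable itself is set with the db2set command (for example, db2set DB2_MDC_ROLLOUT=DEFER).

```python
IMMEDIATE_VALUES = {"1", "TRUE", "ON", "YES", "IMMEDIATE"}
OFF_VALUES = {"0", "FALSE", "OFF", "NO"}


def rollout_mode(value=None):
    """Map a DB2_MDC_ROLLOUT setting to one of IMMEDIATE, OFF, or DEFER."""
    if value is None or str(value).strip() == "":
        return "IMMEDIATE"  # the slide lists IMMEDIATE as the default
    token = str(value).strip().upper()  # assumption: case does not matter
    if token in IMMEDIATE_VALUES:
        return "IMMEDIATE"
    if token in OFF_VALUES:
        return "OFF"
    if token == "DEFER":
        return "DEFER"
    raise ValueError(f"unrecognised DB2_MDC_ROLLOUT value: {value!r}")


modes = [rollout_mode(v) for v in ("yes", "0", "DEFER", None)]
```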
MDC Benefits
Improved query performance
– Block indexes are much smaller than row-level indexes
– Data is guaranteed to be clustered
– Prefetching is more efficient with MDC tables
Reduced logging
– Inserts update (and therefore log) the block indexes only when a new block is needed
– Mass deletes (rollouts) of entire cells log less data than regular
deletes
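A toy count shows why index maintenance, and the logging that goes with it, shrinks when the index points to blocks. The function index_updates and the fixed rows_per_block capacity are assumptions made for illustration; the sketch counts index entries only and does not model DB2's actual log records.

```python
def index_updates(cell_keys, rows_per_block):
    """Count index entries added for a sequence of inserts.

    cell_keys: the cell key of each inserted row, in insert order.
    Returns (row_index_updates, block_index_updates).
    """
    fill = {}  # cell key -> rows already stored in the cell's newest block
    row_index_updates = 0
    block_index_updates = 0
    for key in cell_keys:
        row_index_updates += 1  # a row-level index gets one entry per row
        used = fill.get(key, rows_per_block)  # an unseen cell has no block yet
        if used == rows_per_block:
            block_index_updates += 1  # a new block is added to the cell
            used = 0
        fill[key] = used + 1
    return row_index_updates, block_index_updates


# 10 rows for cell A and 3 rows for cell B, with 4 rows per block.
inserts = ["A"] * 10 + ["B"] * 3
row_updates, block_updates = index_updates(inserts, rows_per_block=4)
```

With these numbers the row-level index is touched 13 times and the block index only 4 times (three blocks for cell A, one for cell B).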
Reduced table maintenance
– Clustering maintained automatically
– No REORG needed except to reclaim space
Disk space
– MDC tables can take more space than equivalent regular tables
Table design
– Poor selection of clustering key may lead to wasted disk space and
no performance gain
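The space cost follows from each cell occupying whole blocks, so a clustering key that produces many sparsely populated cells leaves most blocks nearly empty. The Python sketch below estimates this; the functions blocks_needed and utilisation, the fixed rows_per_block capacity, and the row counts are all hypothetical.

```python
import math


def blocks_needed(rows_per_cell, rows_per_block):
    """Each non-empty cell occupies a whole number of blocks."""
    return sum(math.ceil(n / rows_per_block)
               for n in rows_per_cell.values() if n > 0)


def utilisation(rows_per_cell, rows_per_block):
    """Fraction of the allocated row capacity that actually holds rows."""
    capacity = blocks_needed(rows_per_cell, rows_per_block) * rows_per_block
    return sum(rows_per_cell.values()) / capacity if capacity else 0.0


# The same 1000 rows clustered two ways, assuming 100 rows fit in a block.
coarse = {"Canada": 500, "Mexico": 500}        # 2 well-filled cells
fine = {f"cell-{i}": 5 for i in range(200)}    # 200 sparse cells

coarse_blocks = blocks_needed(coarse, 100)
fine_blocks = blocks_needed(fine, 100)
coarse_util = utilisation(coarse, 100)
fine_util = utilisation(fine, 100)
```

In this example the coarse key needs 10 full blocks, while the fine-grained key needs 200 blocks that are each 5% full.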
References
DB2 V9.7 Documentation
– http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.db2.luw.ad