Which Compression Types Exist?

The document describes the compression types that can be used for column store tables: dictionary compression, prefix encoding, run-length encoding, cluster encoding, indirect encoding, and sparse encoding. Dictionary compression maps distinct column values to value ID numbers and is always used. One of the other techniques can be applied in addition to compress the value ID array further when column values are repetitive or clustered. Consecutive string values in the dictionary are additionally compressed via delta compression, which cannot be configured.

Uploaded by sai_balaji_8
Copyright © All Rights Reserved

Which compression types exist?

The following compression types exist:


| Compression type | M_CS_COLUMNS -> COMPRESSION_TYPE | Valid for | Details | Typical scenario |
|---|---|---|---|---|
| Dictionary | DEFAULT | main, delta | The standard column store dictionary approach already provides a significant space reduction, because the distinct column values are mapped to value ID numbers which typically require much less space in memory. Dictionary compression is always used. Additionally, any one of the other compression techniques mentioned below can be in place. | generally |
| Prefix encoding | PREFIXED | main | Identical values at the beginning of the value ID array are stored only once, together with the number of occurrences. | single predominant column value |
| Run-length encoding | RLE | main | Consecutive identical value IDs are replaced with a single instance of this value ID and its start position. | several frequent column values |
| Cluster encoding | CLUSTERED | main | The value ID array is cut into clusters of 1024 elements. If a cluster contains only occurrences of a single value, the cluster is replaced by a single occurrence of that value. | several frequent column values |
| Indirect encoding | INDIRECT | main | The value ID array is cut into clusters of 1024 elements. If a cluster contains only a few distinct value IDs, a cluster-specific dictionary is created, so that each value ID is represented with even fewer bits. | several frequent column values |
| Sparse encoding | SPARSE | main | The most popular value is removed from the value ID array. A bit vector indicates at which positions the value was removed. | single predominant column value, value ID array not well clustered |
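
The three encodings that work on the value ID array as a whole (prefix, run-length, and sparse) can be illustrated with a minimal Python sketch. It follows only the descriptions in the table; the function names, the list-based storage, and the return shapes are assumptions made for illustration and say nothing about the actual internal layout in the database.

```python
from collections import Counter

def prefix_encode(ids):
    """Store the leading run of identical value IDs once, with its count."""
    if not ids:
        return None, 0, []
    count = 1
    while count < len(ids) and ids[count] == ids[0]:
        count += 1
    return ids[0], count, ids[count:]

def prefix_decode(value, count, rest):
    return [value] * count + list(rest)

def rle_encode(ids):
    """Keep one instance of each run of identical IDs plus its start position."""
    values, starts = [], []
    for pos, vid in enumerate(ids):
        if not values or values[-1] != vid:
            values.append(vid)
            starts.append(pos)
    return values, starts

def rle_decode(values, starts, length):
    out = []
    for i, vid in enumerate(values):
        # A run ends where the next run starts, or at the end of the array.
        end = starts[i + 1] if i + 1 < len(starts) else length
        out.extend([vid] * (end - starts[i]))
    return out

def sparse_encode(ids):
    """Remove the most popular ID; a list of booleans stands in for the bit vector."""
    if not ids:
        return None, [], []
    popular = Counter(ids).most_common(1)[0][0]
    removed = [vid == popular for vid in ids]
    rest = [vid for vid in ids if vid != popular]
    return popular, removed, rest

def sparse_decode(popular, removed, rest):
    remaining = iter(rest)
    return [popular if flag else next(remaining) for flag in removed]

if __name__ == "__main__":
    ids = [3, 3, 3, 3, 1, 1, 2, 3, 3]   # hypothetical value ID array
    print(prefix_encode(ids))   # (3, 4, [1, 1, 2, 3, 3])
    print(rle_encode(ids))      # ([3, 1, 2, 3], [0, 4, 6, 7])
    print(sparse_encode(ids))
```

In this toy array, value ID 3 dominates but is not stored contiguously, which matches the sparse encoding scenario in the table: prefix encoding only helps with the first four entries, while sparse encoding removes all six occurrences.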

Additionally, consecutive string values in the dictionary are compressed ("delta compression"). This compression cannot be influenced, so it is not discussed in detail at this point.

Dictionary compression: [figure]
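
As a rough sketch of the dictionary approach described in the table, the following Python snippet maps distinct column values to value IDs and estimates how many bits one value ID needs. The sorted dictionary, the function names, and the assumption that value IDs are stored with the minimal number of bits are illustrative choices, not statements about the real implementation.

```python
def dictionary_encode(column):
    """Map each distinct column value to a value ID (its index in the dictionary)."""
    dictionary = sorted(set(column))          # assumption: dictionary kept sorted
    value_id = {value: i for i, value in enumerate(dictionary)}
    ids = [value_id[value] for value in column]
    return dictionary, ids

def dictionary_decode(dictionary, ids):
    return [dictionary[i] for i in ids]

def bits_per_value_id(distinct_count):
    """Minimal number of bits that can tell distinct_count value IDs apart."""
    return max(1, (distinct_count - 1).bit_length())

if __name__ == "__main__":
    column = ["Berlin", "Paris", "Berlin", "Rome", "Berlin", "Paris"]  # hypothetical data
    dictionary, ids = dictionary_encode(column)
    print(dictionary)                          # ['Berlin', 'Paris', 'Rome']
    print(ids)                                 # [0, 1, 0, 2, 0, 1]
    print(bits_per_value_id(len(dictionary)))  # 2
```

Each string is stored only once in the dictionary, and the column itself shrinks to a short array of small integers. This is why dictionary compression alone already saves space when a column has few distinct values.
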
Other compression types: [figure]
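
Cluster encoding and indirect encoding both cut the value ID array into clusters of 1024 elements. The sketch below illustrates the idea in Python under stated assumptions: the tuple layouts and function names are hypothetical, and `max_distinct` is an invented threshold standing in for "only a few distinct value IDs", for which the text gives no number.

```python
CLUSTER_SIZE = 1024  # cluster length given in the table

def split_clusters(ids, size=CLUSTER_SIZE):
    return [ids[i:i + size] for i in range(0, len(ids), size)]

def cluster_encode(ids, size=CLUSTER_SIZE):
    """Replace a cluster holding a single value by one occurrence of that value."""
    encoded = []
    for cluster in split_clusters(ids, size):
        if len(set(cluster)) == 1:
            encoded.append((True, [cluster[0]], len(cluster)))
        else:
            encoded.append((False, cluster, len(cluster)))
    return encoded

def cluster_decode(encoded):
    out = []
    for compressed, payload, length in encoded:
        out.extend(payload * length if compressed else payload)
    return out

def indirect_encode(ids, size=CLUSTER_SIZE, max_distinct=16):
    """Give clusters with few distinct IDs their own small dictionary.

    max_distinct is a hypothetical threshold chosen for this sketch.
    """
    encoded = []
    for cluster in split_clusters(ids, size):
        local = sorted(set(cluster))
        if len(local) <= max_distinct:
            index = {vid: i for i, vid in enumerate(local)}
            # Local indexes are smaller numbers, so they need fewer bits.
            encoded.append((local, [index[vid] for vid in cluster]))
        else:
            encoded.append((None, cluster))
    return encoded

def indirect_decode(encoded):
    out = []
    for local, payload in encoded:
        out.extend(payload if local is None else [local[i] for i in payload])
    return out

if __name__ == "__main__":
    # A cluster size of 4 keeps the demo readable; the table states 1024.
    ids = [7, 7, 7, 7, 7, 9, 7, 9, 1, 2, 3, 4]
    print(cluster_encode(ids, size=4))
    print(indirect_encode(ids, size=4, max_distinct=2))
```

With these inputs, cluster encoding collapses only the first cluster, while indirect encoding also shrinks the second cluster, where two distinct value IDs can be addressed with a single bit each.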
