Chapter 2 Digital Image Compression
Almost every dataset contains some type of redundancy. Digital image compression is a field that
studies methods for reducing the total number of bits required to represent an image.
This can be achieved by eliminating various types of redundancy that exist in the
pixel values. In general, the following three basic redundancies exist in digital images.

Psycho-visual Redundancy: The human visual system does not respond with equal sensitivity to all visual information, so some image information is perceptually less important and can be discarded without a noticeable loss of quality.

Inter-pixel Redundancy: The values of neighboring pixels tend to be highly correlated, so much of the information carried by an individual pixel can be inferred from its neighbors.

Coding Redundancy: An uncompressed image is usually coded with a fixed length for each pixel. For example, an image with 256 gray scales is represented by an array of 8-bit integers. Using variable length code schemes such as Huffman coding and arithmetic coding may produce compression.
Compression methods can be classified broadly into lossy and lossless compression. Lossy compression can achieve a high compression ratio, 50:1 or higher, since it allows some acceptable degradation; however, it cannot completely recover the original data. On the other hand, lossless compression can completely recover the original data, but this reduces the compression ratio to around 2:1.
Generally, most lossy compressors (Figure 2.1) are three-step algorithms, with each step addressing one of the three kinds of redundancy mentioned above.
Figure 2.1: A typical lossy compression system. The compressor applies a transform, quantization, and entropy coding to the original image before transmission over the channel; the decompressor applies entropy decoding, dequantization, and the inverse transform to produce the restored image.
The transform step first reduces the inter-pixel redundancy by decorrelating the pixel values; the quantization step then exploits psycho-visual redundancy to represent the packed information with as few bits as possible. The quantized bits are then efficiently encoded to get more compression from the coding redundancy.
2.2.1.1 Quantization
Quantization is a many-to-one mapping that replaces a set of values with only one
representative value.
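To make the idea concrete, the following is a minimal sketch of uniform scalar quantization (the step size of 16 is an arbitrary illustrative choice, not a value taken from any particular standard):

```python
import numpy as np

def uniform_quantize(values, step=16):
    """Uniform scalar quantization: every value falling in a bin of width
    `step` is mapped to the same integer index (a many-to-one mapping)."""
    return np.round(np.asarray(values, dtype=float) / step).astype(int)

def dequantize(indices, step=16):
    """The decoder can only restore one representative value per bin,
    so the fine detail discarded by the quantizer is lost for good."""
    return indices * step

print(uniform_quantize([3, 21, 100, 103]))                # [0 1 6 6]
print(dequantize(uniform_quantize([3, 21, 100, 103])))    # [ 0 16 96 96]
```

The information discarded in this step is what makes the overall scheme lossy.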
The transform step uses a reversible and linear transform to decorrelate the original image into a set of coefficients in the transform domain; these coefficients are then quantized and coded sequentially in the transform domain.
Numerous transforms are used in a variety of applications.15,16 The KLT (Karhunen-Loève transform) is optimal in terms of energy packing, but it must be derived from the image statistics and is therefore expensive to compute. Sinusoidal transforms such as the DCT (discrete cosine transform) approximate the energy-packing efficiency of the KLT and have more efficient implementations. The DCT is generally preferred over the DFT (discrete Fourier transform) in image coding systems since the DFT coefficients require twice the storage space of the DCT coefficients.
Block transform coding, in which the image is divided into square 8×8 pixel blocks and the DCT of each block is followed by Huffman or arithmetic coding, is utilized in the ISO JPEG (Joint Photographic Experts Group) draft international standard for image compression.17-19 The disadvantage of this scheme is that blocking (or tiling) artifacts appear at high compression ratios.
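As an illustration of the transform stage of such a scheme, the following sketch computes the orthonormal 2-D DCT of 8×8 blocks directly from its definition (the level shift by 128 follows JPEG's convention for 8-bit samples; the quantization and entropy coding stages are omitted):

```python
import numpy as np

def dct2_block(block):
    """Orthonormal 2-D DCT-II of a square block, built from the 1-D basis matrix."""
    n = block.shape[0]
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] *= np.sqrt(1.0 / n)
    C[1:, :] *= np.sqrt(2.0 / n)
    return C @ block @ C.T

def blockwise_dct(image, bs=8):
    """Apply the DCT independently to each bs x bs block of an 8-bit image
    (image dimensions are assumed to be multiples of bs)."""
    h, w = image.shape
    coeffs = np.zeros((h, w))
    for i in range(0, h, bs):
        for j in range(0, w, bs):
            block = image[i:i + bs, j:j + bs].astype(float) - 128.0  # level shift
            coeffs[i:i + bs, j:j + bs] = dct2_block(block)
    return coeffs
```

Because each block is transformed and quantized independently, coarse quantization makes the block boundaries visible, which is the origin of the blocking artifacts mentioned above.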
Since the adoption of the JPEG standard, the algorithm has been the subject of considerable research. In one study, mammograms were compressed with a bit rate as low as 0.27 bpp (bits per pixel) while retaining the ability of radiologists to detect pathologies. Kostas et al.22 used JPEG modified for use with 12-bit images and custom quantization tables to compress mammograms and chest radiographs.
The emerging JPEG 2000 standard is based on the wavelet transform combined with more powerful quantization and encoding strategies such as embedded quantization and context-based arithmetic coding, and it offers a number of advantages over the existing JPEG standard. Performance gains include improved compression efficiency at low bit rates for large images, while new functionalities include multi-resolution representation, scalability and embedded bit stream architecture, lossy to lossless progression, ROI (region of interest) coding, and a rich file format.23
Full-frame methods, which transform the entire image at once, avoid blocking artifacts at the cost of greater memory requirements and the appearance of ringing artifacts (a periodic pattern due to the quantization of high frequencies).
Subband coding is one example among full-frame methods. It produces a number of sub-images with specific properties, such as a smoothed version of the original plus a set of images containing the horizontal, vertical, and diagonal edges, at different frequencies, that are missing from the smoothed version.27-29
Rompelman30 applied
subband coding to compress 12-bit CT images at rates of 0.75 bpp and 0.625 bpp
without significantly affecting diagnostic quality.
Recently, much research has been devoted to the DWT (discrete wavelet transform) for subband coding of images. Wavelet-based methods avoid the blocking artifacts present in block transform methods and allow easy progressive coding due to their multiresolution nature.
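A minimal sketch of one analysis level of such a decomposition, using the simple Haar wavelet as a stand-in for the longer filters normally used in practice (even image dimensions assumed):

```python
import numpy as np

def haar2d_level(img):
    """One level of a 2-D Haar-style decomposition: returns the smoothed
    approximation (LL) and three detail subbands (LH, HL, HH) holding the
    edge information missing from the smoothed version."""
    x = img.astype(float)
    # averages / half-differences along the rows
    lo = (x[:, 0::2] + x[:, 1::2]) / 2.0
    hi = (x[:, 0::2] - x[:, 1::2]) / 2.0
    # then along the columns of each result
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return ll, lh, hl, hh
```

Repeating the decomposition on the LL band yields the multiresolution structure that makes progressive transmission straightforward.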
Bramble et al.32 compressed hand radiographs at average rates from about 0.75 bpp to 0.1 bpp with no significant degradation in diagnostic quality involving the detection of pathology characterized by a lack of sharpness in a bone edge.
Lossless compression methods (Figure 2.2) are generally two-step algorithms. The first step transforms the original image to some other format in which the inter-pixel redundancy is reduced; the second step removes the coding redundancy with an entropy coder. The decompressor is an exact inverse of the lossless compressor.
Figure 2.2: A typical lossless compression system. The compressor applies a transform and entropy coding to the original image; the decompressor applies entropy decoding and the inverse transform to produce the restored image.
Typically, medical images can be compressed losslessly to about 50% of their
original size. Boncelet et al.34 investigated the use of three entropy coding methods for lossless compression with an application to digitized radiographs and found that a bit rate of about 4 to 5 bpp was the best achievable. Tavakoli35,36,37 investigated similar techniques and found that linear prediction and interpolation techniques gave the best results, with similar compression ratios.
Run length coding replaces data by a (length, value) pair, where value is the repeated value and length is the number of repetitions. This technique is especially successful in compressing bi-level images; in ordinary gray-scale images, however, long runs of a single value are rare, so run length coding by itself is of limited use. A solution to this is to decompose the gray-scale image into bit planes and compress every bit plane separately.
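A minimal sketch of run length coding applied to one row of a bit plane (the example row is made up for illustration):

```python
def rle_encode(data):
    """Encode a sequence as a list of (length, value) pairs."""
    runs, i = [], 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        runs.append((j - i, data[i]))
        i = j
    return runs

# one row of a bit plane: long runs compress well
print(rle_encode([0, 0, 0, 0, 0, 1, 1, 0, 0, 0]))   # [(5, 0), (2, 1), (3, 0)]
```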
Lossless predictive coding predicts the value of each pixel by using the values of its neighboring pixels; only the differences between the actual and predicted values are encoded, and since these prediction errors are typically small, they can be represented with fewer bits.
A variation of lossless predictive coding is adaptive prediction, which splits the image into blocks and computes the prediction coefficients independently for each block to achieve higher prediction performance.
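A minimal sketch of the prediction step, using a simple (illustrative, not standardized) predictor that estimates each pixel from its left and upper neighbours:

```python
import numpy as np

def prediction_residuals(img):
    """Residuals of a causal predictor: each pixel is predicted as the integer
    average of its left and upper neighbours (first row and column fall back
    to the single available neighbour; the corner pixel is stored as-is)."""
    x = img.astype(int)
    pred = np.zeros_like(x)
    pred[0, 1:] = x[0, :-1]                         # first row: left neighbour
    pred[1:, 0] = x[:-1, 0]                         # first column: upper neighbour
    pred[1:, 1:] = (x[1:, :-1] + x[:-1, 1:]) // 2   # average of left and upper
    res = x - pred
    res[0, 0] = x[0, 0]
    return res
```

A decoder repeats the same predictions in raster-scan order and adds back the stored residuals, so the original image is recovered exactly; the residuals themselves cluster around zero and are therefore cheap to entropy code.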
Huffman coding utilizes a variable length code in which short code words are
assigned to more common values or symbols in the data, and longer code words are
assigned to less frequently occurring values.
Dynamic Huffman coding41 is one example among the many variations of Huffman's technique.
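A minimal sketch of Huffman code construction over a small symbol sequence (the input string is made up for illustration):

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a Huffman code (symbol -> bit string) from a sequence of symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:                              # degenerate single-symbol input
        return {s: "0" for s in freq}
    # heap entries: (frequency, tie-breaker, {symbol: partial code})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)             # two least frequent groups
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

print(huffman_code("aaaabbbcd"))
# e.g. {'a': '0', 'c': '100', 'd': '101', 'b': '11'} -- frequent symbols get short codes
```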
LZ coding replaces repeated substrings in the input data with references to earlier
instances of the strings. The term usually refers to two different approaches to dictionary-based compression: LZ7742 and LZ7843. LZ77 utilizes a sliding window to search for substrings encountered before and then substitutes them with a (position, length) pair that points back to the existing substring.
LZ78 instead builds an explicit dictionary of substrings from the input file and then replaces the substrings by their index in the dictionary. Many variations, of which LZW (Lempel-Ziv-Welch) is one of the most well known, have been developed based on these ideas. Variations of LZ coding are used in the Unix utilities Compress and Gzip.
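A minimal LZ77-style sketch that emits (offset, length, next character) triples, with offset 0 meaning a literal (the window and match-length limits are illustrative choices, far smaller than real implementations use):

```python
def lz77_encode(data, window=255, max_len=15):
    """Toy LZ77 encoder: greedy longest-match search in a sliding window."""
    out, i = [], 0
    while i < len(data):
        best_off, best_len = 0, 0
        for j in range(max(0, i - window), i):          # search the window
            k = 0
            while (k < max_len and i + k < len(data) - 1
                   and data[j + k] == data[i + k]):
                k += 1
            if k > best_len:
                best_off, best_len = i - j, k
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out

print(lz77_encode("abababa"))   # [(0, 0, 'a'), (0, 0, 'b'), (2, 4, 'a')]
```

The decoder simply copies `length` characters from `offset` positions back and appends the literal character, so no explicit dictionary has to be transmitted.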
Arithmetic coding45 represents a message as an interval between 0 and 1 on the real number line. Basically, it divides the interval between 0 and 1 into a number of smaller intervals corresponding to the probabilities of the message's symbols.
Then the first input symbol selects an interval, which is further divided into smaller
intervals. The next input symbol selects one of these intervals, and the procedure is
repeated. As a result, the selected interval narrows with every symbol, and in the end,
any number inside the final interval can be used to represent the message. That is to
say, each bit in the output code refines the precision of the value of the input code in
the interval. A variation of arithmetic coding is the Q-coder46, developed by IBM in the late 1980s; two further references47,48 describe the latest Q-coder variations.
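A minimal sketch of the interval-narrowing idea (floating-point arithmetic is used for clarity, and the probabilities and message are made-up examples; practical coders work with scaled integers and an explicit termination convention):

```python
def arithmetic_interval(message, probs):
    """Return the final [low, high) interval an arithmetic coder narrows to."""
    cum, c = {}, 0.0
    for s, p in probs.items():          # cumulative probability at the start of each sub-interval
        cum[s] = c
        c += p
    low, high = 0.0, 1.0
    for s in message:                   # each symbol selects and narrows a sub-interval
        width = high - low
        high = low + width * (cum[s] + probs[s])
        low = low + width * cum[s]
    return low, high

low, high = arithmetic_interval("aab", {"a": 0.6, "b": 0.3, "c": 0.1})
print(low, high)                        # any number in [0.216, 0.324) identifies "aab"
```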
Interpolative coding starts from a low-resolution version of the image and predicts the remaining pixels by interpolation. The errors between the interpolation values and the real values are stored, along with the initial low-resolution image; since these errors are typically small, they can be stored with fewer bits than the original pixel values.
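A one-dimensional sketch of the idea (a row with an odd number of samples is assumed so that every dropped sample has two kept neighbours):

```python
import numpy as np

def interpolative_residuals(row):
    """Keep every other sample as the low-resolution signal, predict the dropped
    samples by linear interpolation, and return only the prediction errors."""
    row = np.asarray(row, dtype=float)
    low = row[0::2]                        # low-resolution version (kept samples)
    pred = (low[:-1] + low[1:]) / 2.0      # interpolate the dropped samples
    err = row[1::2] - pred                 # small errors, cheap to encode
    return low, err

low, err = interpolative_residuals([10, 12, 14, 15, 18, 20, 21])
print(low)   # [10. 14. 18. 21.]
print(err)   # [ 0.  -1.   0.5]
```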
The Laplacian pyramid49 is another multiresolution image compression method, developed by Burt and Adelson. It successively constructs lower resolution versions of the original image by downsampling, so that the number of pixels in each dimension decreases by a factor of two at each scale. The differences between successive resolution versions, together with the lowest resolution image, are stored and used to perfectly reconstruct the original image. However, it cannot achieve a high compression ratio on its own because the total number of stored data values grows to about 4/3 of the original image size.
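A minimal sketch of the construction (2×2 averaging and nearest-neighbour upsampling stand in for the Gaussian filtering of the original method; image dimensions are assumed divisible by 2 at every level):

```python
import numpy as np

def laplacian_pyramid(img, levels=3):
    """Return the per-level difference images plus the lowest-resolution image."""
    cur, diffs = img.astype(float), []
    for _ in range(levels):
        low = (cur[0::2, 0::2] + cur[0::2, 1::2] +
               cur[1::2, 0::2] + cur[1::2, 1::2]) / 4.0       # downsample by two per axis
        up = np.repeat(np.repeat(low, 2, axis=0), 2, axis=1)  # expand back up
        diffs.append(cur - up)                                # detail lost at this level
        cur = low
    return diffs, cur

def reconstruct(diffs, low):
    """Exact reconstruction: expand the coarse image and add back each difference."""
    cur = low
    for d in reversed(diffs):
        cur = np.repeat(np.repeat(cur, 2, axis=0), 2, axis=1) + d
    return cur
```

The difference images are mostly near zero and code well, but storing a full-size difference at every level is what pushes the total data volume to roughly 4/3 of the original.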
Tree representations can be used to obtain further compression by exploiting the hierarchical structure of these multiresolution methods.50
Lossy compression methods result in some loss of quality in the compressed images; there is a tradeoff between image distortion and the compression ratio. Some distortion
measurements are often used to quantify the quality of the reconstructed image as
well as the compression ratio (the ratio of the size of the original image to the size of
the compressed image).
The most commonly used objective measurements, which are derived from statistical terms, are the RMSE (root mean square error), the NMSE (normalized mean square error), and the PSNR (peak signal-to-noise ratio):

$$\mathrm{RMSE} = \sqrt{\frac{1}{N M}\sum_{i=0}^{N-1}\sum_{j=0}^{M-1}\bigl[f(i,j)-f'(i,j)\bigr]^{2}}$$

$$\mathrm{NMSE} = \frac{\displaystyle\sum_{i=0}^{N-1}\sum_{j=0}^{M-1}\bigl[f(i,j)-f'(i,j)\bigr]^{2}}{\displaystyle\sum_{i=0}^{N-1}\sum_{j=0}^{M-1}\bigl[f(i,j)\bigr]^{2}}$$

$$\mathrm{PSNR} = 20\log_{10}\left(\frac{255}{\mathrm{RMSE}}\right)$$
where the images have $N \times M$ pixels (8 bits per pixel), $f(i,j)$ represents the original image, and $f'(i,j)$ represents the reconstructed image after compression and decompression.
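These measures are straightforward to compute; a minimal sketch for 8-bit images:

```python
import numpy as np

def rmse(f, g):
    """Root mean square error between original f and reconstruction g."""
    return np.sqrt(np.mean((f.astype(float) - g.astype(float)) ** 2))

def nmse(f, g):
    """Mean square error normalized by the energy of the original image."""
    f, g = f.astype(float), g.astype(float)
    return np.sum((f - g) ** 2) / np.sum(f ** 2)

def psnr(f, g, peak=255.0):
    """Peak signal-to-noise ratio in decibels (peak = 255 for 8-bit images)."""
    return 20.0 * np.log10(peak / rmse(f, g))
```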
Since the images are ultimately intended for human viewing, subjective measurements based on comparisons by human observers are also used to assess how good the decoded image looks to a viewer.
A third class of measurements evaluates the usefulness of the decoded image for a particular task, such as clinical diagnosis for medical images or meteorological prediction for satellite images.
When comparing two lossy coding methods, we may either compare the quality of the images reconstructed at a constant bit rate or, equivalently, compare the bit rates required to achieve reconstructions of the same quality, if that is achievable.
A better measurement of compression is the bit rate, because it is independent of the data storage format. The bit rate measures the average number of bits used to represent each pixel of the image in compressed form. Bit rates are measured in bpp, and a lower bit rate corresponds to a greater amount of compression.
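For example (with numbers chosen purely for illustration), a 512×512 image stored at 8 bits per pixel occupies 262,144 bytes; if it compresses to 32,768 bytes, the bit rate is 32,768 × 8 / (512 × 512) = 1 bpp, corresponding to a compression ratio of 8:1.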
2.4 Summary
Digital image compression has been the focus of a large amount of research in recent years. As a result, the number of data compression methods continues to grow as new algorithms and variations of existing ones are introduced.
However, a 3-D medical image set contains an additional type of redundancy, which
is not often addressed by the current compression methods. Several methods51-58 that
utilize dependencies in all three dimensions have been proposed.
Some of these methods51,53,57 and others52,54,55 exploit the dependencies between adjacent slices in different ways.
In this proposal, we first introduce, from a new point of view, a new type of redundancy existing among pixel values in all three dimensions and describe its basic characteristics. Secondly, we propose a novel lossless compression method based on integer wavelet transforms, embedded zerotree coding, and predictive coding to reduce this special redundancy and gain more compression.