Digital Image Processing (Image Compression)
Image Compression
10.
Run-length encoding (RLE) is a technique used to reduce the size of a repeating string of characters. The repeating string is called a run; typically RLE encodes a run of symbols into two bytes, a count and a symbol. RLE can compress any type of data regardless of its information content, but the content of the data affects the compression ratio achieved. Compression is normally measured by the compression ratio, the size of the original data divided by the size of the compressed data.
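As a minimal sketch of the idea (not a standard file format), the following encodes each run as a (count, symbol) pair and measures the compression ratio under the assumption that every pair costs two bytes. The function names and the sample string are illustrative.

```python
def rle_encode(data):
    """Collapse each run into a (count, symbol) pair; counts are capped at 255 so each fits in one byte."""
    runs = []
    for symbol in data:
        if runs and runs[-1][1] == symbol and runs[-1][0] < 255:
            runs[-1] = (runs[-1][0] + 1, symbol)
        else:
            runs.append((1, symbol))
    return runs


def rle_decode(runs):
    """Expand (count, symbol) pairs back into the original string."""
    return "".join(symbol * count for count, symbol in runs)


text = "AAAAAABBBCDDDD"
encoded = rle_encode(text)
# Assumption: one byte per input character, two bytes per (count, symbol) pair.
ratio = len(text) / (2 * len(encoded))
```

Data with few repeats would give a ratio below 1, which is why the content of the data matters.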
11.
The source encoder performs three operations: 1) Mapper: transforms the input data into a (usually non-visual) format that reduces the interpixel redundancy. 2) Quantizer: reduces the psychovisual redundancy of the input image. This step is omitted if the system must be error free. 3) Symbol encoder: reduces the coding redundancy. This is the final stage of the encoding process.
12.
13.
The source decoder has two components: a) Symbol decoder: performs the inverse operation of the symbol encoder. b) Inverse mapper: performs the inverse operation of the mapper. The channel decoder is omitted if the system is error free.
15.
16.
The channel encoder reduces the impact of channel noise by inserting redundant bits into the source-encoded data. E.g., the Hamming code.
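To illustrate how redundant bits protect the data, here is a sketch of the (7,4) Hamming code, which adds three parity bits to four data bits and can correct any single-bit error. The bit ordering (parity bits at positions 1, 2 and 4) is one common convention and is an assumption here.

```python
def hamming74_encode(data_bits):
    """Encode four data bits as a 7-bit codeword laid out as [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # even parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # even parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # even parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the four data bits."""
    h = list(codeword)
    c1 = h[0] ^ h[2] ^ h[4] ^ h[6]
    c2 = h[1] ^ h[2] ^ h[5] ^ h[6]
    c4 = h[3] ^ h[4] ^ h[5] ^ h[6]
    position = 4 * c4 + 2 * c2 + c1   # 0 means no error was detected
    if position:
        h[position - 1] ^= 1          # flip the erroneous bit back
    return [h[2], h[4], h[5], h[6]]


sent = hamming74_encode([1, 0, 1, 1])
received = list(sent)
received[4] ^= 1                      # simulate channel noise on position 5
recovered = hamming74_decode(received)
```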
17.
What is jpeg?
JPEG stands for "Joint Photographic Experts Group". It became an international standard in 1992. It works well with colour and greyscale images and is used in many applications, e.g., satellite, medical, ...
18.
JPEG
JPEG is good for photography. Compression ratios of 20:1 are easily attained. 24 bits per pixel can be used, leading to better accuracy. Progressive JPEG (interlacing) is supported.
JPEG 2000
JPEG 2000 is an all-encompassing, wavelet-based image compression standard. Lossless and lossy compression. Progressive transmission by pixel accuracy and resolution. Region-of-Interest Coding. Random codestream access and processing. Robustness to bit-errors. Content-based description. Side channel spatial information (transparency).
19. What are the operations performed by error free compression?
1) Devising an alternative representation of the image in which its interpixel redundancies are reduced. 2) Coding the representation to eliminate coding redundancy.
20.
Huffman coding is a popular technique for removing coding redundancy. When the symbols of an information source are coded individually, the Huffman code yields the smallest possible number of code symbols per source symbol.
21.
Image compression refers to the process of reducing the amount of data required to represent a given quantity of information in a digital image. The basis of the reduction process is the removal of redundant data.
(or) A technique used to reduce the volume of information to be transmitted about an image.
22. Define encoder.
The encoder has two components: A) Source encoder, which is responsible for removing the coding, interpixel, and psychovisual redundancy. B) Channel encoder.
23.
Variable-length coding is the simplest approach to error-free compression. It reduces only the coding redundancy. It assigns the shortest possible code words to the most probable gray levels.
24.
In arithmetic coding, a one-to-one correspondence between source symbols and code words does not exist. Instead, a single arithmetic code word is assigned to an entire sequence of source symbols. The code word defines an interval of real numbers between 0 and 1.
25.
Twelve mark Questions
1. Explain the various functional blocks of the JPEG standard.
JPEG (Joint Photographic Experts Group) is an international standard for photographs. It can be lossless or lossy. It is based on the facts that:
Humans are more sensitive to lower spatial frequency components.
A large majority of useful image contents change relatively slowly across images.
Steps involved:
1. Image converted to Y, Cb, Cr format
2. Divided into 8x8 blocks
3. Each 8x8 block subjected to DCT followed by quantization
4. Zig-zag scan
5. DC coefficients stored using DPCM
6. RLE used for AC coefficients
7. Huffman encoding
8. Frame generation
Block preparation
Compute luminance (Y) and chrominance (I and Q) according to the formulas:
Y = 0.30R + 0.59G + 0.11B (0 to 255)
I = 0.60R - 0.28G - 0.32B (0 to 255)
Q = 0.21R - 0.52G + 0.31B (0 to 255)
Separate matrices are constructed for Y, I, and Q. Square blocks of four pixels are averaged in the I and Q matrices (this step is lossy and compresses the image by a factor of 2). 128 is subtracted from each element of Y, I, and Q. Each matrix is divided up into 8x8 blocks.
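The block-preparation arithmetic can be sketched as follows. The function names are illustrative, and `average_2x2` assumes a matrix with even width and height. The factor of 2 comes from keeping one full-size Y matrix plus two quarter-size chrominance matrices: 1 + 1/4 + 1/4 = 1.5 matrices instead of 3.

```python
def rgb_to_yiq(r, g, b):
    """Apply the luminance/chrominance formulas from the notes to one pixel."""
    y = 0.30 * r + 0.59 * g + 0.11 * b
    i = 0.60 * r - 0.28 * g - 0.32 * b
    q = 0.21 * r - 0.52 * g + 0.31 * b
    return y, i, q


def average_2x2(matrix):
    """Replace every 2x2 square of a chrominance matrix by its mean (assumes even dimensions)."""
    return [
        [
            (matrix[r][c] + matrix[r][c + 1] + matrix[r + 1][c] + matrix[r + 1][c + 1]) / 4
            for c in range(0, len(matrix[0]), 2)
        ]
        for r in range(0, len(matrix), 2)
    ]


white = rgb_to_yiq(255, 255, 255)            # a gray pixel carries (almost) no chrominance
chroma = [[10, 12, 40, 44],
          [14, 16, 48, 52]]
subsampled = average_2x2(chroma)             # 2x4 matrix becomes 1x2
size_ratio = 3 / (1 + 0.25 + 0.25)           # three full matrices vs. Y plus two quarter-size ones
```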
Quantization
Less important DCT coefficients are wiped out. This is the main lossy step in JPEG. It is done by dividing each of the coefficients in the 8x8 matrix by a weight taken from a table. These weights are not part of the JPEG standard.
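A small sketch of the divide-and-round step follows. The 2x2 corner of coefficients and the weights are made-up values for illustration, not an actual JPEG table. Note how the small high-frequency coefficient becomes zero and cannot be recovered.

```python
def quantize(block, table):
    """Divide each coefficient by its weight and round to the nearest integer."""
    return [[round(c / w) for c, w in zip(block_row, table_row)]
            for block_row, table_row in zip(block, table)]


def dequantize(quantized, table):
    """Multiply back by the weights; information lost to rounding stays lost."""
    return [[q * w for q, w in zip(q_row, table_row)]
            for q_row, table_row in zip(quantized, table)]


coefficients = [[150, 80],
                [92, 7]]      # hypothetical DCT coefficients (top-left corner of a block)
weights = [[1, 2],
           [2, 16]]           # hypothetical weights, larger for higher frequencies
quantized = quantize(coefficients, weights)
restored = dequantize(quantized, weights)
```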
Differential quantization
It reduces the (0,0) value (the DC coefficient) of each block by replacing it with the amount by which it differs from the corresponding element in the previous block. Since these elements are the average values of their respective blocks, they should change slowly, so the differences are small.
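The differencing of DC values can be sketched as below. The sample values are invented, and the predictor for the first block is assumed to be 0.

```python
def dpcm_encode(dc_values):
    """Replace each DC value by its difference from the previous block's DC value."""
    previous = 0                      # assumption: the first block is predicted from 0
    differences = []
    for dc in dc_values:
        differences.append(dc - previous)
        previous = dc
    return differences


def dpcm_decode(differences):
    """Rebuild the DC values by accumulating the differences."""
    values, running = [], 0
    for diff in differences:
        running += diff
        values.append(running)
    return values


dc_values = [150, 152, 149, 149, 155]     # hypothetical DC coefficients of consecutive blocks
differences = dpcm_encode(dc_values)
```

After the first entry the differences are small numbers, which the later entropy coding stage can represent with short code words.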
Statistical output encoding
JPEG uses Huffman encoding for this purpose. It often produces 20:1 compression or better. For decoding, the algorithm is run backward. JPEG is roughly symmetric: decoding takes as long as encoding.
Advantages and Disadvantages:
Advantages
Compression ratios of 20:1 are easily attained. 24 bits per pixel can be used, leading to better accuracy. Progressive JPEG (interlacing).
Disadvantages
Doesn't support transparency. Doesn't work well with sharp edges. Almost always lossy. No target bit rate.
JPEG 2000 STANDARD:
Wavelet-based image compression standard.
Encoding:
1. Decompose the source image into components.
2. Decompose the image and its components into rectangular tiles.
3. Apply the wavelet transform on each tile.
4. Quantize and collect the subbands of coefficients into rectangular arrays of code-blocks.
5. Encode so that certain ROIs can be coded at a higher quality.
6. Add markers in the bitstream to allow error resilience.
Advantages: Lossless and lossy compression. Progressive transmission by pixel accuracy and resolution. Region-of-Interest Coding. Random codestream access and processing. Robustness to bit-errors. Content-based description. Side channel spatial information (transparency).
2. Explain (i) one-dimensional run-length coding (ii) two-dimensional run-length coding.
RLE stands for Run Length Encoding. It is a lossless algorithm that only offers decent compression ratios in specific types of data.
It is a pre-processing method, good when one symbol occurs with high probability or when symbols are dependent. Count how many times a symbol repeats; the source symbol then becomes the length of the run.
(i) One-dimensional run-length coding
Used for binary images. The lengths of the sequences of ones and zeroes are detected. Assume that each row begins with a white (1) run. Additional compression is achieved by variable-length coding (Huffman coding) the run lengths.
An m-bit gray scale image can be converted into m binary images by bit-plane slicing. These individual images are then encoded using run-length coding. However, a small difference in the gray level of adjacent pixels can cause a disruption of the run of zeroes or ones.
Example: Let us say one pixel has a gray level of 127 and the next pixel has a gray level of 128. In binary: 127 = 01111111 and 128 = 10000000.
Therefore a small change in gray level has decreased the run-lengths in all the bit-planes.
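The two ideas above can be sketched as follows: `run_lengths` extracts the runs of one binary row under the "each row begins with a white (1) run" convention, so a row starting with 0 gets a first run of length 0. `bit_planes` shows that the step from 127 to 128 changes every bit plane. The function names and the sample row are illustrative.

```python
def run_lengths(row):
    """Run lengths of a binary row, alternating white (1), black (0), white, ... starting with white."""
    lengths, current, count = [], 1, 0
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            lengths.append(count)     # close the current run (may be 0 for a leading black pixel)
            current, count = pixel, 1
    lengths.append(count)
    return lengths


def bit_planes(gray, bits=8):
    """Bits of a gray level, most significant bit plane first."""
    return [(gray >> k) & 1 for k in range(bits - 1, -1, -1)]


runs = run_lengths([1, 1, 1, 0, 0, 1, 0, 0, 0, 0])
planes_127 = bit_planes(127)
planes_128 = bit_planes(128)
planes_changed = sum(a != b for a, b in zip(planes_127, planes_128))
```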
(ii) Two-dimensional run-length coding
Run-length coding was developed in the 1950s and has become, along with its 2-D extensions, the standard approach in facsimile (FAX) coding.
The source is a two-dimensional array of pixel values.
Spatial redundancy and temporal redundancy are exploited.
The human eye is less sensitive to the chrominance signal than to the luminance signal (U and V can be coarsely coded).
The human eye is less sensitive to the higher spatial frequency components.
The human eye is less sensitive to quantizing distortion at high luminance levels.
The source image is a 2-D matrix of pixel values.
The R, G, B format requires three matrices, one each for the R, G, and B quantized values.
In the Y, U, V representation, the U and V matrices can be half the size of the Y matrix.
The source image matrix is divided into blocks of 8x8 submatrices.
A smaller block size helps DCT computation; the individual blocks are sequentially fed to the DCT, which transforms each block separately.
Two-pass algorithm
The first pass accumulates the character frequencies and generates the codebook. The second pass does the compression with the codebook. Huffman codes can require an enormous number of computations: for N source symbols, N-2 source reductions (sorting operations) and N-2 code assignments must be made. Sometimes coding efficiency is sacrificed to reduce the number of computations.
Huffman Coding
This coding reduces the average number of bits/pixel. It assigns variable-length codes to different symbols. It achieves compression in 2 steps.
Steps
1. Find the gray level probabilities from the image histogram.
2. Arrange the probabilities in descending order, highest at top.
3. Combine the smallest two by addition, always keeping the sums in descending order.
4. Repeat step 3 until only two probabilities are left.
5. By working backward along the tree, generate the code by alternating assignment of 0 & 1.
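The steps above can be sketched with a priority queue in place of repeated sorting. This variant merges all the way down to a single node and builds the code words during the merges, which gives the same code lengths as the two-phase description. The symbol names and probabilities are assumptions chosen for illustration.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """Return {symbol: bit string} built by repeatedly merging the two least probable entries."""
    tiebreak = count()   # unique counter so equal probabilities never force a dict comparison
    heap = [(p, next(tiebreak), {symbol: ""}) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)    # smallest probability
        p1, _, codes1 = heapq.heappop(heap)    # second smallest
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical six-level source
probabilities = {"a1": 0.1, "a2": 0.4, "a3": 0.06, "a4": 0.1, "a5": 0.04, "a6": 0.3}
codes = huffman_code(probabilities)
average_length = sum(probabilities[s] * len(codes[s]) for s in probabilities)
```

Ties between equal probabilities can be broken in different ways, so individual code words may differ between implementations while the average length stays the same.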
Extra Notes:
Arithmetic coding
Arithmetic compression is an alternative to Huffman compression; it enables characters to be represented with fractional bit lengths. In Huffman compression, fractional code lengths are not possible, and even the most frequently occurring character needs a codeword of at least one bit no matter how high its frequency. Arithmetic coding works by representing a message by an interval of real numbers greater than or equal to zero but less than one. As a message becomes longer, the interval needed to represent it becomes smaller and smaller, and the number of bits needed to specify it increases.
The entire sequence of source symbols (message) is assigned a single arithmetic code word.
There is no one-to-one coding as in Huffman coding.
The code word lies within the interval [0, 1).
As the number of symbols in the message increases, the interval used to represent it becomes smaller and the number of information units (bits) required to represent the interval becomes larger.
Ex. More digits are required to represent 0.003 than 0.1.
Arithmetic coding works conceptually as follows:
(1) Begin with the current range [L, H) initialized to [0, 1). Note: the notation [0, 1) denotes values greater than or equal to 0 but less than 1.
(2) For each symbol of the file, perform two steps:
a) Subdivide the current interval into subintervals, one for each alphabet symbol.
b) Select the subinterval corresponding to the symbol that actually occurs next in the file and make it the new current interval.
Fig : Arithmetic coding procedure
So any number in the interval [0.06752, 0.0688), for example 0.068, can be used to represent the message. Here 3 decimal digits are used to represent the 5-symbol source message. This translates into 3/5 or 0.6 decimal digits per source symbol and compares favorably with the entropy of
$-(3 \times 0.2 \log_{10} 0.2 + 0.4 \log_{10} 0.4) = 0.5786$ digits per symbol.
As the length of the sequence increases, the resulting arithmetic code approaches the bound set by the entropy. In practice, the length fails to reach the lower bound because of:
The addition of the end-of-message indicator that is needed to separate one message from another.
The use of finite precision arithmetic.
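The interval narrowing can be sketched with exact fractions. Since the figure is missing, the source model here is an assumption: four symbols a1..a4 with probabilities 0.2, 0.2, 0.4, 0.2 and the message a1 a2 a3 a3 a4, chosen because they reproduce the interval and the entropy expression quoted above.

```python
from fractions import Fraction


def arithmetic_interval(message, probabilities):
    """Return the final [low, high) interval after narrowing once per message symbol."""
    # Give each symbol its own slice of [0, 1), in the order the symbols are listed.
    slices, start = {}, Fraction(0)
    for symbol, p in probabilities.items():
        slices[symbol] = (start, start + p)
        start += p
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low
        slice_low, slice_high = slices[symbol]
        # Both new bounds are computed from the old low before assignment.
        low, high = low + width * slice_low, low + width * slice_high
    return low, high


probabilities = {
    "a1": Fraction(2, 10),
    "a2": Fraction(2, 10),
    "a3": Fraction(4, 10),
    "a4": Fraction(2, 10),
}
low, high = arithmetic_interval(["a1", "a2", "a3", "a3", "a4"], probabilities)
tag = Fraction(68, 1000)      # 0.068, a three-digit number inside the interval
```

A practical coder would use finite-precision integer arithmetic and an end-of-message symbol rather than exact fractions.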
The LZW Algorithm:
A codebook or dictionary containing the source symbols is constructed. For 8-bit monochrome images, the first 256 words of the dictionary are assigned to the gray levels 0-255. The remaining part of the dictionary is filled with sequences of gray levels.
Table : LZW Coding example
Compression ratio = (8 x 16) / (10 x 9) = 64 / 45 = 1.4
Important features of LZW:
The dictionary is created while the data are being encoded, so encoding can be done on the fly.
The dictionary need not be transmitted. It can be built up at the receiving end on the fly.
If the dictionary overflows, then we have to reinitialize the dictionary and add a bit to each one of the code words.
Choosing a large dictionary size avoids overflow, but spoils compression.
Decoding LZW: Let the bit stream received be: 39 39 126 126 256 258 260 259 257 126
In LZW, the dictionary which was used for encoding need not be sent with the image. A separate dictionary is built by the decoder, on the fly, as it reads the received code words.
Recognized | Encoded value | Pixels | Dic. address | Dic. entry
(none) | 39 | 39 | |
39 | 39 | 39 | 256 | 39-39
39 | 126 | 126 | 257 | 39-126
126 | 126 | 126 | 258 | 126-126
126 | 256 | 39-39 | 259 | 126-39
256 | 258 | 126-126 | 260 | 39-39-126
258 | 260 | 39-39-126 | 261 | 126-126-39
260 | 259 | 126-39 | 262 | 39-39-126-126
259 | 257 | 39-126 | 263 | 126-39-39
257 | 126 | 126 | 264 | 39-126-126
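A compact sketch of LZW encoding and decoding follows. The input is assumed to be a 4x4 image whose rows are all 39 39 126 126, which is consistent with the code stream and the 8 x 16 input bits quoted above. The dictionary is left unbounded, so overflow handling is not modelled.

```python
def lzw_encode(pixels, first_free=256):
    """Encode a pixel sequence; codes below first_free stand for single gray levels."""
    dictionary = {(g,): g for g in range(first_free)}
    next_code = first_free
    current, output = (), []
    for pixel in pixels:
        candidate = current + (pixel,)
        if candidate in dictionary:
            current = candidate                   # keep growing the recognized sequence
        else:
            output.append(dictionary[current])
            dictionary[candidate] = next_code     # new entry, built on the fly
            next_code += 1
            current = (pixel,)
    if current:
        output.append(dictionary[current])
    return output


def lzw_decode(codes, first_free=256):
    """Rebuild the pixels and the dictionary from a non-empty code stream alone."""
    dictionary = {g: (g,) for g in range(first_free)}
    next_code = first_free
    previous = dictionary[codes[0]]
    pixels = list(previous)
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:
            # The code refers to the entry being created right now.
            entry = previous + (previous[0],)
        pixels.extend(entry)
        dictionary[next_code] = previous + (entry[0],)
        next_code += 1
        previous = entry
    return pixels


image = [39, 39, 126, 126] * 4                    # assumed 4x4 image, row by row
codes = lzw_encode(image)
compression_ratio = (8 * len(image)) / (9 * len(codes))   # 8-bit pixels vs. 9-bit codes
```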
5. Explain wavelet based image compression.
Image compression using the discrete cosine transform (DCT) has poor frequency localization because of its inadequate basis window. The discrete wavelet transform (DWT) resolves this problem by trading off spatial (or time) resolution for frequency resolution. Wavelet coders also exploit the structure between coefficients to remove redundancy.
Wavelet Coding
Fig : Wavelet coding system (decoder)
Advantages: Lossless and lossy compression. Progressive transmission by pixel accuracy and resolution. Region-of-Interest Coding. Random codestream access and processing. Robustness to bit-errors. Content-based description. Side channel spatial information (transparency).
7. Explain how compression is achieved in transform coding and explain the DCT.
Transform Coding
Three steps:
1. Divide a data sequence into blocks of size N and transform each block using a reversible mapping.
2. Quantize the transformed sequence.
3. Encode the quantized values.
Benefits: the transform coefficients are relatively uncorrelated; the energy is highly compacted; the scheme is reasonably robust to channel errors.
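The DCT is similar to the DFT, but can provide a better approximation with fewer coefficients. The coefficients of the DCT are real valued, whereas those of the DFT are complex valued. The discrete cosine transform (DCT) is the basis for many image compression algorithms. One clear advantage of the DCT over the DFT is that there is no need to manipulate complex numbers. The equation for a forward DCT of an N x N block f(x, y) is
$$T(u, v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y)\, g(x, y, u, v)$$
where
$$g(x, y, u, v) = \alpha(u)\,\alpha(v)\,\cos\left[\frac{(2x+1)u\pi}{2N}\right]\cos\left[\frac{(2y+1)v\pi}{2N}\right]$$
and
$$\alpha(u) = \begin{cases} \sqrt{1/N} & u = 0 \\ \sqrt{2/N} & u = 1, 2, \ldots, N-1 \end{cases}$$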
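As a direct (slow but readable) sketch, the forward 2-D DCT with the normalization factors sqrt(1/N) for index 0 and sqrt(2/N) otherwise can be evaluated straight from its definition. The flat test block is an assumption chosen to show energy compaction: all of its energy lands in the single DC coefficient.

```python
import math


def dct2(block):
    """Forward 2-D DCT of a square block, computed directly from the double sum."""
    n = len(block)

    def alpha(u):
        return math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)

    return [
        [
            alpha(u) * alpha(v) * sum(
                block[x][y]
                * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                for x in range(n)
                for y in range(n)
            )
            for v in range(n)
        ]
        for u in range(n)
    ]


flat_block = [[100] * 8 for _ in range(8)]    # hypothetical 8x8 block of constant gray level
coefficients = dct2(flat_block)
dc = coefficients[0][0]                       # 64 * 100 / 8 = 800 for this block
```

Real encoders use fast factorizations instead of this O(N^4) double sum.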
Zig-zag Scan of DCT Blocks
Groups the low-frequency coefficients at the top of the vector. Maps the 8 x 8 block to a 1 x 64 vector.
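One way to generate the zig-zag visiting order is to walk the anti-diagonals of the block and alternate direction. This is a sketch; the function names are illustrative.

```python
def zigzag_order(n=8):
    """Return the (row, col) visiting order of the zig-zag scan for an n x n block."""
    order = []
    for s in range(2 * n - 1):                      # s = row + col identifies one anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones the other way.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


def zigzag_scan(block):
    """Flatten a square block into a vector with low-frequency coefficients first."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


order = zigzag_order(8)
vector = zigzag_scan([[1, 2, 6], [3, 5, 7], [4, 8, 9]])
```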
Data Redundancy
Varying amounts of data may be used to represent the same information. Data which either do not provide necessary information or provide the same information again are called redundant data. Removing redundant data from the image reduces its size.
Redundancies In Image
In image compression 3 basic data redundancies can be identified. 1. Coding redundancy (CR) 2. Interpixel redundancy (IR) 3. Psychovisual redundancy (PVR)
Data compression is achieved when one or more of these redundancies are reduced or eliminated
Coding redundancy
A natural m-bit coding method assigns m bits to each gray level without considering the probability with which that gray level occurs, so it is very likely to contain coding redundancy.
Basic concept: Utilize the probability of occurrence of each gray level (histogram) to determine the length of the code representing that particular gray level: variable-length coding. Assign shorter code words to the gray levels that occur most frequently, and longer code words to those that occur less frequently.
Interpixel Redundancy
Interpixel redundancy is caused by high interpixel correlations within an image, i.e., the gray level of any given pixel can be reasonably predicted from the values of its neighbors, so the information carried by individual pixels is relatively small. It is also called spatial redundancy, geometric redundancy, or interframe redundancy.
Interpixel redundancy occurs because adjacent pixels tend to be highly correlated: adjacent pixel values tend to be close to each other, and the value of a given pixel can be predicted from the values of its neighbors. Much of the visual contribution of a single pixel to an image is therefore redundant.
To reduce interpixel redundancy, the image is transformed into a more efficient format. For example, the differences between adjacent pixels can be used to store an image. This transformation process is called mapping; the reverse is called inverse mapping.
We can detect the presence of correlation between pixels (or interpixel redundancy) by computing the autocorrelation coefficients along a row of pixels:
$$\gamma(n) = \frac{A(n)}{A(0)}$$
where
$$A(n) = \frac{1}{N-n} \sum_{y=0}^{N-1-n} f(x, y)\, f(x, y+n)$$
The maximum possible value of $\gamma(n)$ is 1, and this value is approached for this image both for adjacent pixels and for pixels which are separated by 45 pixels (or multiples of 45).
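The normalized autocorrelation along one row, A(n)/A(0) with A(n) averaged over the N-n available pixel pairs, can be sketched as below. The test row is hypothetical: it repeats every 5 pixels, so the coefficient returns to 1 at shifts that are multiples of 5, analogous to the 45-pixel spacing mentioned above.

```python
def autocorrelation_coefficients(row, max_shift):
    """Return [A(0)/A(0), A(1)/A(0), ..., A(max_shift)/A(0)] for one row of pixels."""
    n_pixels = len(row)

    def a(shift):
        pairs = n_pixels - shift
        return sum(row[y] * row[y + shift] for y in range(pairs)) / pairs

    return [a(shift) / a(0) for shift in range(max_shift + 1)]


row = [0, 0, 0, 255, 255] * 9          # hypothetical row with a pattern repeating every 5 pixels
gamma = autocorrelation_coefficients(row, 10)
```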
Psychovisual Redundancy
Psychovisual redundancy refers to the fact that some information is more important to the human visual system than other types of information. The eye does not respond with equal sensitivity to all visual information: certain information has less relative importance than other information in normal visual processing. Such information is psychovisually redundant and can be eliminated without significantly impairing the quality of image perception. For example, using fewer gray levels reduces the size of the image.
The elimination of psychovisually redundant data results in a loss of quantitative information, so the process is not reversible; it is a lossy data compression method.
The key in image compression algorithm development is to determine the minimal data required to retain the necessary information. This is achieved by taking advantage of the redundancy that exists in the image. Any redundant information that is not required can be eliminated to reduce the amount of data used to represent the image.
Image compression methods based on the elimination of psychovisually redundant data (usually called quantization) are usually applied to commercial broadcast TV and similar applications for human visualization.
Explain the Huffman coding algorithm, giving a numerical example.
Huffman Coding
This coding reduces the average number of bits/pixel. It assigns variable-length codes to different symbols. It achieves compression in 2 steps.
Steps
1. Find the gray level probabilities from the image histogram.
2. Arrange the probabilities in descending order, highest at top.
3. Combine the smallest two by addition, always keeping the sums in descending order.
4. Repeat step 3 until only two probabilities are left.
5. By working backward along the tree, generate the code by alternating assignment of 0 & 1.
Calculating Lavg & Entropy
Lavg = 2.2 bits/pixel
Entropy = 2.14 bits/pixel
Efficiency of Huffman code = 2.14/2.2 = 0.973
Constraint: symbols are coded one at a time.
The code is uniquely codable and decodable.
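The figures above can be reproduced with a short calculation. The numerical example itself is not preserved in these notes, so the probabilities and code lengths below are assumptions: a six-symbol source that happens to give the same Lavg and entropy.

```python
import math

# Assumed source: six gray levels and the lengths of their Huffman code words.
probabilities = [0.4, 0.3, 0.1, 0.1, 0.06, 0.04]
code_lengths = [1, 2, 3, 4, 5, 5]

# Average code length: sum of p_k * l_k, in bits per pixel.
l_avg = sum(p * l for p, l in zip(probabilities, code_lengths))

# First-order entropy: -sum of p_k * log2(p_k), the lower bound on bits per pixel.
entropy = -sum(p * math.log2(p) for p in probabilities)

efficiency = entropy / l_avg
```

Using unrounded values gives an efficiency of about 0.974; the 0.973 above comes from dividing the rounded figures 2.14 and 2.2.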
Encoding
Decoding
10.
$$\underset{MN \times 1}{y} = \underset{MN \times MN}{H}\;\underset{MN \times 1}{f} + \underset{MN \times 1}{n}$$
F (u, v) =
0 p( x, y) = 1 0
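$$M(f) = \|y - Hf\|^2$$
A necessary condition for $M(f)$ to have a minimum is that its gradient with respect to $f$ is equal to zero. This gradient is
$$\nabla_f M(f) = -2H^T(y - Hf)$$
and by using a steepest-descent type of optimization we can formulate an iterative rule as follows, where the step size $\beta$ absorbs the constant factor:
$$f_0 = \beta H^T y$$
$$f_{k+1} = f_k + \beta H^T (y - H f_k)$$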
In this method we attempt to solve the problem of constrained restoration iteratively. As already mentioned the following functional is minimized
$$M(f, \alpha) = \|y - Hf\|^2 + \alpha \|Cf\|^2$$
The necessary condition for a minimum is that the gradient of $M(f, \alpha)$ is equal to zero. That gradient is
$$\Phi(f) = \nabla_f M(f, \alpha) = 2\left[(H^T H + \alpha C^T C) f - H^T y\right]$$
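The initial estimate and the updating rule for obtaining the restored image are now given by
$$f_0 = \beta H^T y$$
$$f_{k+1} = f_k + \beta\left[H^T y - (H^T H + \alpha C^T C) f_k\right]$$
It can be proved that the above iteration (known as Iterative CLS or the Tikhonov-Miller method) converges if
$$0 < \beta < \frac{2}{|\lambda_{\max}|}$$
where $\lambda_{\max}$ is the maximum eigenvalue of the matrix $(H^T H + \alpha C^T C)$, $\alpha$ is the regularization parameter, and $\beta$ is the step size of the iteration.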
If the matrices H and C are block-circulant the iteration can be implemented in the frequency domain.