Data Compression Intro
Lecture By
Unit 1 Chapter 1
Preface
Introduction
Data
Need for Compression
Compression Techniques
Lossless and Lossy Compression
Performance Measure
Introduction
The word data comes from the Latin for "something given". In geometry, mathematics, engineering, and so on, the terms given and data are used interchangeably. Data is also a representation of facts, figures, and ideas. In computer science, data are numbers, words, images, etc., accepted as they stand.
Data (Analog)
To
The Principal, College

Respected Sir,
Subject: Need a Heater in Class
(Students: please fill in the remaining details.)
Yours Sincerely,
Faculty
Data (Digital World)
Raw Data => Compression
Need for Compression
Application areas that generate and transmit huge volumes of data:
Weather forecasting
Internet data
Broadband services
Planning of cities
COMPRESSION TECHNIQUES
Loss-less compression
Compressed data can be reconstructed back to the exact original data itself.
Lossy Compression
The reconstruction differs from the original data; some information is lost.
Loss-less Compression
Main Disadvantage? The amount of compression that can be achieved is relatively modest.
Lossy Compression
Disadvantage: Data that have been compressed using lossy techniques generally cannot be recovered or reconstructed exactly (some loss of information is involved).
Advantage: Much higher compression ratios than lossless techniques.
Areas: compression of data that originate as analog signals, such as speech, audio, images, and video.
MEASURE OF PERFORMANCE
Based on:
1. Relative complexity of the algorithm.
2. Memory required to implement the algorithm.
3. How fast the algorithm performs on a given machine.
(These are secondary measures.)
Compression Ratio
The most widely used measure of how much a data set has been compressed is the compression ratio: the ratio of the number of bits required to represent the data before compression to the number of bits required to represent the data after compression.
Example: Suppose storing an image made up of a square array of 256 X 256 pixels requires 65,536 bytes. The image is compressed and the compressed version requires 16,384 bytes. Compression Ratio = 65,536 : 16,384 = 4:1.
Another Measure
Rate: the average number of bits required to represent a single sample.
Consider the last example: the 256 X 256 original image occupies 65,536 bytes, so each pixel takes 1 byte, i.e. 8 bits per pixel (sample). The compressed image occupies 16,384 bytes. How many bits does each pixel now take? 16,384 x 8 / 65,536 = 2, so the rate is 2 bits/pixel.
Are the above two measures sufficient for lossy compression?
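These two measures are simple arithmetic. A minimal Python sketch, using the 256 X 256 image figures from the example above:

    # Compression ratio and rate for the 256 x 256 image example above.
    original_bytes = 65536      # 256 * 256 pixels, 1 byte per pixel
    compressed_bytes = 16384

    compression_ratio = original_bytes / compressed_bytes        # 4.0, i.e. 4:1
    rate_bits_per_pixel = compressed_bytes * 8 / (256 * 256)     # 2.0 bits/pixel

    print(f"Compression ratio = {compression_ratio:.0f}:1")
    print(f"Rate = {rate_bits_per_pixel} bits/pixel")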
Distortion
In lossy compression, the reconstruction differs from the original data.
In order to determine the efficiency of a compression algorithm, we have to quantify/measure the difference. The difference between the original and the reconstruction is called the distortion.
Lossy techniques are generally used for the compression of data that originate as analog signals, such as speech and video.
For speech and video, the final arbiter/judge of quality is the human observer (perceptual evaluation).
Because human responses are difficult to model mathematically, many approximate measures of distortion are used to determine the fidelity/quality of the reconstructed waveforms.
For example, a technique that works well for the compression of text may not work well for compressing images.
The best approach for a given application, largely depends on the redundancies inherent in the data.
An approach may work for one kind of data but not for another kind (for example, a landscape or a group photo).
The development of data compression algorithms for a variety of data can be divided into two phases.
Modeling: Extract information about any redundancy present in the data and model it.
Coding: A description of the model and a "description" of how the data differ from the model are encoded, generally using a binary alphabet. The difference between the data and the model is often referred to as the residual.
Example 1
Q. Consider the following sequence of numbers X = {x1, x2, x3, ...}: 9 11 11 11 14 13 15 17 16 17 20 21. How many bits are required to store or transmit every sample?
Ans. Since the largest value is 21, coding the samples directly needs 5 bits per sample. The data are well described by a straight-line model in which the nth sample is approximately n + 8; the residual en = xn - (n + 8) is
0 1 0 -1 1 -1 0 1 -1 -1 1 1
Example 1
The residual sequence consists of only three numbers {-1, 0, 1}. Assigning the code 00 to -1, 01 to 0 and 10 to 1, we need only 2 bits to represent each element of the residual sequence. Therefore, we can obtain compression by transmitting the parameters of the model and the residual sequence. The scheme is lossy if only the model is transmitted, and lossless if both the model parameters and the residual/difference are transmitted.
Q. Model the given data for compression: { 6 8 10 10 12 11 12 15 16 }
{ 5 6 9 10 11 13 17 19 20}
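A minimal Python sketch of the modeling step in Example 1, assuming the straight-line model n + 8 described above (the two practice sequences can be explored by changing the data and the model):

    # Example 1: model the data with a straight line and encode the residual.
    data = [9, 11, 11, 11, 14, 13, 15, 17, 16, 17, 20, 21]

    def model(n):                                # model value for sample n (n starts at 1)
        return n + 8

    residual = [x - model(n) for n, x in enumerate(data, start=1)]
    print("residual:", residual)                 # values are only -1, 0, 1

    # Raw data needs 5 bits/sample (values up to 21); the residual needs
    # only 2 bits/sample with the fixed code 00 -> -1, 01 -> 0, 10 -> 1.
    code = {-1: "00", 0: "01", 1: "10"}
    encoded = "".join(code[r] for r in residual)
    print(len(encoded), "bits for the residual vs", 5 * len(data), "bits raw")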
Example 2
Q. Find the structure present in this data sequence: 27 28 29 28 26 27 29 28 30 32 34 36 38
I. At first glance, no obvious structure is found.
II. So check for the closeness of neighbouring values.
III. Send the first value, then the residue (differences between neighbours): 27, followed by 1 1 -1 -2 1 2 -1 2 2 2 2 2.
IV. Are the bits/sample reduced?
V. The decoder adds each received difference to the previously decoded value to reconstruct the original sequence.
Note
Techniques that use the past values of a sequence to predict the current value and then encode the error in prediction, or residual, are called predictive coding schemes. Note: Assuming both encoder and decoder know the model being used, we would still have to send the value of the first element of the sequence.
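A minimal sketch of such a predictive (differencing) scheme, applied to the sequence of Example 2; the function names are illustrative only:

    # Predictive coding: send the first value, then the differences (residuals).
    data = [27, 28, 29, 28, 26, 27, 29, 28, 30, 32, 34, 36, 38]

    def encode(seq):
        first = seq[0]
        residuals = [b - a for a, b in zip(seq, seq[1:])]
        return first, residuals

    def decode(first, residuals):
        out = [first]
        for r in residuals:                     # add each residual to the
            out.append(out[-1] + r)             # previously decoded value
        return out

    first, residuals = encode(data)
    print(residuals)                            # 1 1 -1 -2 1 2 -1 2 2 2 2 2
    assert decode(first, residuals) == data     # lossless reconstruction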
Example 3
Suppose we have the following sequence:
Note
There will be situations in which it is easier to take advantage of the structure if we decompose the data into a number of components. We can then study each component separately and use a model appropriate to that component. There are a number of different ways to characterize data. Different characterizations will lead to different compression schemes. We can compress something with products from one vendor and reconstruct it using the products of a different vendor. International standards organizations have standards for various compression applications.
Overview
This chapter deals with the mathematical framework behind lossless compression schemes, starting with basic concepts from information theory and probability. Based on these mathematical concepts, we then look at how to model the data.
The self-information of an event A with probability P(A) is i(A) = -log P(A) (with logarithm base 2, the information is measured in bits). Suppose A and B are two independent events. The self-information associated with the occurrence of both events A and B is
i(AB) = -log P(AB) = -log [P(A)P(B)] = i(A) + i(B)
Entropy
Suppose we have independent events Ai, which are the outcomes of some experiment S.
Then the average self-information associated with the random experiment is
H = Σ P(Ai) i(Ai) = -Σ P(Ai) log P(Ai)
This quantity is called the entropy of the experiment.
Note
Most of the sources considered in this subject are independent and identically distributed (iid). The entropy equation above equals the entropy of the source only if the sequence is iid.
Theorem: Shannon showed that the best (lowest) average number of bits per symbol that a lossless compression scheme can achieve is equal to the entropy of the source.
The estimate of the entropy depends on our assumptions about the structure of the source sequence.
Example 4
Q. Consider the sequence 1 2 3 2 3 4 5 4 5 6 7 8 9 8 9 10 The probability of occurrence of each element is
P(1) = P(6) = P(7) = P(10) = 1/16
P(2) = P(3) = P(4) = P(5) = P(8) = P(9) = 2/16
Assuming the sequence is iid, the first-order entropy for this sequence is
H = -Σ P(i) log2 P(i) = 4 x (1/16) x 4 + 6 x (2/16) x 3 = 3.25 bits/sample
Hence, according to Shannon, the best a lossless scheme can do for this sequence (under the iid assumption) is 3.25 bits/sample.
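The first-order entropy above can be checked with a short Python sketch, assuming the iid model:

    # First-order entropy of the Example 4 sequence under the iid assumption.
    from collections import Counter
    from math import log2

    seq = [1, 2, 3, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9, 8, 9, 10]
    counts = Counter(seq)
    probs = [c / len(seq) for c in counts.values()]

    entropy = -sum(p * log2(p) for p in probs)
    print(f"H = {entropy:.2f} bits/sample")     # 3.25 bits/sample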
Example 4 (continued)
If we exploit the structure by modelling each sample as the previous sample plus a residual, xn = xn-1 + rn, the residual sequence rn is 1 1 1 -1 1 1 1 -1 1 1 1 1 1 -1 1 1, with P(1) = 13/16 and P(-1) = 3/16, giving an entropy of about 0.70 bits/sample.
Note
If the parameter rn does not change with n, the model is called a static model. A model whose parameters change or adapt with n to the changing characteristics of the data is called an adaptive model. Basically, we see that knowing something about the structure of the data can help to "reduce the entropy."
Structure
Consider the following sequence:
1 2 1 2 3 3 3 3 1 2 3 3 3 3 1 2 3 3 1 2
Obviously, there is some structure to this data. However, if we look at it one symbol at a time, the structure is difficult to extract. Consider the probabilities: P(1) = P(2) = 1/4 and P(3) = 1/2. The entropy is 1.5 bits/symbol. This particular sequence consists of 20 symbols; therefore, the total number of bits required to represent this sequence is 30.
Now let's take the same sequence and look at it in blocks of two. Obviously, there are only two block symbols, 1 2 and 3 3. The probabilities are P(1 2) = 1/2 and P(3 3) = 1/2, and the entropy is 1 bit/block. As there are 10 such blocks in the sequence, we need a total of 10 bits to represent the entire sequence, a reduction by a factor of three.
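A short sketch that verifies the effect of looking at the data in blocks of two for the sequence above:

    # Entropy of the structured sequence: one symbol at a time vs blocks of two.
    from collections import Counter
    from math import log2

    def entropy(symbols):
        counts = Counter(symbols)
        n = len(symbols)
        return -sum((c / n) * log2(c / n) for c in counts.values())

    seq = [1, 2, 1, 2, 3, 3, 3, 3, 1, 2, 3, 3, 3, 3, 1, 2, 3, 3, 1, 2]
    blocks = list(zip(seq[0::2], seq[1::2]))        # (1,2), (1,2), (3,3), ...

    print(entropy(seq) * len(seq), "bits")          # 1.5 * 20 = 30 bits
    print(entropy(blocks) * len(blocks), "bits")    # 1.0 * 10 = 10 bits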
Models
Good models for sources lead to more efficient compression algorithms. In general, in order to develop techniques that manipulate data using mathematical operations, we need to have a mathematical model for the data. There are several approaches to building such a model.
Physical Model
If we know something about the physics of the data generation process, we can use that information to construct a model. For example, In speech-related applications, knowledge about the physics of speech production can be used to construct a mathematical model for the sampled speech process. Sampled speech can then be encoded using this model. If residential electrical meter readings at hourly intervals were to be coded, knowledge about the living habits of the populace could be used to determine when electricity usage would be high and when the usage would be low. Then instead of the actual readings, the difference (residual) between the actual readings and those predicted by the model could be coded.
Physical Model
Disadvantages
In general, however, the physics of data generation is simply too complicated to understand, let alone use to develop a model. When the physics of the problem is too complicated, we build a model based on empirical observation of the statistics of the data.
Probability Model
The simplest mathematical model for the source is to assume that all the events are independent and identically distributed (iid). Because this assumes no knowledge of the data, it is called the ignorance model.
Probability Model
Next, if we also discard the assumption of independence, we can arrive at better compression schemes, but we then have to describe how the elements of the data sequence depend on each other.
One of the most popular ways of representing dependence in the data is through the use of Markov models, named after the Russian mathematician Andrei Andreyevich Markov (1856-1922).
Markov Models
For models used in loss-less compression, we use a specific type of Markov process called a discrete time Markov chain. A sequence {Xn} is said to follow a kth-order Markov model if
P(Xn | Xn-1, ..., Xn-k) = P(Xn | Xn-1, ..., Xn-k, ...)
That is, knowledge of the past k symbols is equivalent to knowledge of the entire past history of the process. The values taken on by the set {Xn-1, ..., Xn-k} are called the states of the process.
Markov Models
The most commonly used Markov model is the first-order Markov model, for which
P(Xn | Xn-1) = P(Xn | Xn-1, Xn-2, Xn-3, ...)
Markov chain property: the probability of each subsequent state depends only on the previous state. The above equations indicate the existence of dependence between samples, but they do not describe the form of the dependence. We can develop different first-order Markov models depending on our assumption about the form of the dependence between samples.
To define Markov model, the following probabilities have to be specified: transition probabilities P{X2|X1} and initial probabilities P{X1}.
Markov Models
If we assume that the dependence is introduced in a linear manner, we can view the data sequence as the output of a linear filter driven by white noise. The output of such a filter can be given by the difference equation
Xn = Σ(i=1..N) ai Xn-i + Σ(j=1..M) bj En-j + En
where {En} is the white noise. This model is often used when developing coding algorithms for speech and images.
Markov Model
The entropy of a finite state process with states Si is simply the average value of the entropy at each state:
H = Σi P(Si) H(Si)
(State transition diagram: two states, Rain and Dry, with the transition probabilities listed below.)
Two states : Rain and Dry. Transition probabilities: P(Rain|Rain)=0.3 , P(Dry|Rain)=0.7 , P(Rain|Dry)=0.2, P(Dry|Dry)=0.8 Initial probabilities: say P(Rain)=0.4 , P(Dry)=0.6 .
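A small sketch of the finite-state entropy formula given earlier, applied to this two-state model; it assumes the stated initial probabilities (0.4/0.6) are used as the state probabilities:

    # Entropy of the two-state (Rain/Dry) Markov model:
    # H = sum over states of P(state) * H(transition probabilities from state).
    from math import log2

    P_state = {"Rain": 0.4, "Dry": 0.6}                  # state probabilities
    P_trans = {"Rain": [0.3, 0.7],                       # P(Rain|Rain), P(Dry|Rain)
               "Dry":  [0.2, 0.8]}                       # P(Rain|Dry),  P(Dry|Dry)

    def H(probs):
        return -sum(p * log2(p) for p in probs if p > 0)

    H_model = sum(P_state[s] * H(P_trans[s]) for s in P_state)
    print(f"H = {H_model:.3f} bits/symbol")              # ~0.79 bits/symbol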
In English text, the probability of the next letter is heavily influenced by the preceding letters. In the current text compression literature, kth-order Markov models are more widely known as finite context models, with the word context being used for what we earlier called the state. Example:
Consider the word preceding. Suppose we have already processed precedin and are going to encode the next letter. If we take no account of the context and treat each letter as a surprise, the probability of the letter g occurring is relatively low.
Example
If we use a first-order Markov model (i.e. we look at the probability of the next letter given the single-letter context n), we can see that the probability of g would increase substantially. As we increase the context size (go from n to in to din and so on), the probability distribution becomes more and more skewed towards g and the entropy decreases.
Shannon used a second-order model for English text consisting of the 26 letters and one space to obtain an entropy of 3.1 bits/letter . Using a model where the output symbols were words rather than letters brought down the entropy to 2.4 bits/letter. Note: The longer the context, the better its predictive value.
Coding
Coding: Assignment of binary sequences (0s or 1s) to elements or symbols.
The set of binary sequences is called a code, and the individual members of the set are called codewords.
Code: e.g. 100101100110010101. Codewords: e.g. a -> 001, b -> 010.
An alphabet is a collection of symbols called letters. For example, the alphabet used in writing most books consists of the 26 lowercase letters, 26 uppercase letters, and a variety of punctuation marks. In the terminology used here, a comma is a letter.
The ASCII code for the letter a is 1000011, the letter A is coded as 1000001, and the letter "," is coded as 0011010. Notice that the ASCII code uses the same number of bits to represent each symbol. Such a code is called a fixed-length code.
Coding
To reduce the number of bits required to represent different messages, we can use a different number of bits for different symbols: if we use fewer bits for the symbols that occur more often, then on average we use fewer bits per symbol. The average number of bits per symbol is often called the rate of the code. Examples: Morse code, Huffman codes. In Morse code, the codewords for letters that occur more frequently are shorter than those for letters that occur less frequently; for example, the codeword for E is a single dot, while the codeword for Z is dash-dash-dot-dot.
The average length of the code is not the only criterion for a good code.
Example Suppose our source alphabet consists of four letters a1, a2, a3 & a4 with probabilities P(a1) = 1/2 , P(a2) = 1/4, P(a3) = P(a4) = 1/8. The entropy for this source is 1.75 bits/symbol.
The average length of a code is l = Σ P(ai) n(ai), where n(ai) is the number of bits in the codeword for letter ai; the average length is given in bits/symbol.
From the table of candidate codes, Code 1 appears to be the best with respect to average length. However, a code must also be able to transfer information in an unambiguous way.
Code 1
Both a1 and a2 have been assigned the codeword 0. When a 0 is received, there is no way to know whether an a1 was transmitted or an a2. Hence we would like each symbol to be assigned a unique codeword.
Code 2 seems to have no problem with ambiguity: each symbol has a distinct codeword. However, suppose we encode {a2 a1 a1}; the binary string would be 100. But 100 can also be decoded as {a2 a3}, so the original sequence cannot be recovered with certainty. The code is not uniquely decodable (not desirable).
Code 3
Is Code 3 uniquely decodable?
Notice that the first three codewords all end in a 0. In fact, a 0 always denotes the termination of a codeword.
The final codeword, 111, contains no 0s and is 3 bits long. Because all other codewords have fewer than three 1s and terminate in a 0, the only way we can get three 1s in a row is as the code for a4. The decoding rule is simple: accumulate bits until you get a 0 or until you have three 1s. There is no ambiguity in this rule, and it is reasonably easy to see that this code is uniquely decodable.
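A short sketch using Code 3, assuming the usual textbook codeword assignment 0, 10, 110, 111 for a1..a4 (consistent with the description above): its average length equals the 1.75 bits/symbol entropy, and the accumulate-until-a-0-or-three-1s rule decodes without ambiguity.

    # Code 3 (assumed codewords): a1 -> 0, a2 -> 10, a3 -> 110, a4 -> 111.
    probs = {"a1": 0.5, "a2": 0.25, "a3": 0.125, "a4": 0.125}
    code3 = {"a1": "0", "a2": "10", "a3": "110", "a4": "111"}

    avg_len = sum(probs[a] * len(cw) for a, cw in code3.items())
    print(avg_len, "bits/symbol")                 # 1.75, equal to the entropy

    def decode(bits):
        inverse = {cw: a for a, cw in code3.items()}
        out, buf = [], ""
        for b in bits:                            # accumulate bits until we see a 0
            buf += b                              # or until we have three 1s
            if b == "0" or buf == "111":
                out.append(inverse[buf])
                buf = ""
        return out

    print(decode("0101101110"))                   # ['a1', 'a2', 'a3', 'a4', 'a1']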
Code 4
The difference between Code 3 and Code 4 is that with Code 3 the decoder knows the moment a codeword is complete, whereas with Code 4 we have to wait for the beginning of the next codeword before we know that the current codeword is complete. Because of this property, Code 3 is called an instantaneous code, while Code 4 is only "almost" instantaneous.
Q). Is a code like Code 5 (a1 -> 0, a2 -> 01, a3 -> 11) uniquely decodable? Try decoding the string 011111111111111111.
If we assume the first codeword is a1 (0), then after decoding eight more codewords as a3's (11), we are left with a single (dangling) 1. If we instead assume the first codeword is a2 (01), we can decode the remaining 16 bits as eight a3's.
Only the second assumption decodes the string completely, so the string can be uniquely decoded. In fact, Code 5, while it is certainly not instantaneous, is uniquely decodable.
Even by looking at these small codes, it is not immediately evident whether a code is uniquely decodable or not. For larger codes we need a systematic test based on dangling suffixes: whenever one codeword is a prefix of another, the remaining part of the longer codeword is called a dangling suffix, and we repeatedly compare the codewords with the dangling suffixes generated so far.
Example 1
Consider Code 5 with codewords {0, 01, 11}. The codeword 0 is a prefix of the codeword 01, giving the dangling suffix 1; there are no other pairs for which one element of the pair is the prefix of the other. Let us augment the codeword list with the dangling suffix: {0, 01, 11, 1}. Comparing the elements of this list, we again find that 0 is a prefix of 01 with a dangling suffix of 1, but we have already included 1 in our list.
Also, 1 is a prefix of 11. This gives us a dangling suffix of 1, which is already in the list.
Example 1
There are no other pairs that would generate a dangling suffix, so we cannot augment the list any further. Therefore, Code 5 is uniquely decodable.
Example 2
Consider Code 6 with codewords {0, 01, 10}.
The codeword 0 is a prefix for the codeword 01. The dangling suffix is 1. There are no other pairs for which one element of the pair is the prefix of the other. Augmenting the codeword list with 1, we obtain the list
{0,01,10,1}
Example 2
In this list, 1 is a prefix for 10. The dangling suffix for this pair is 0, which is the codeword for a1. Therefore, Code 6 is not uniquely decodable.
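A minimal sketch of this dangling-suffix test (the Sardinas-Patterson procedure), applied to Code 5 and Code 6 from the examples above; the function name is illustrative:

    # Unique decodability test: keep generating dangling suffixes; the code is
    # not uniquely decodable if some dangling suffix equals a codeword.
    def uniquely_decodable(codewords):
        codewords = set(codewords)

        def suffixes(a, b):              # dangling suffix when a is a proper prefix of b
            return {b[len(a):]} if b.startswith(a) and a != b else set()

        dangling = set()
        for a in codewords:
            for b in codewords:
                dangling |= suffixes(a, b)
        while True:
            if dangling & codewords:     # a dangling suffix is a codeword
                return False
            new = set()
            for d in dangling:
                for c in codewords:
                    new |= suffixes(d, c) | suffixes(c, d)
            if new <= dangling:          # no new suffixes: uniquely decodable
                return True
            dangling |= new

    print(uniquely_decodable(["0", "01", "11"]))   # Code 5 -> True
    print(uniquely_decodable(["0", "01", "10"]))   # Code 6 -> False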
Prefix Codes
The test for unique decodability requires examining the dangling suffixes: if a dangling suffix is itself a codeword, then the code is not uniquely decodable.
One type of code in which we never face the possibility of a dangling suffix being a codeword is a code in which no codeword is a prefix of another.
A code in which no codeword is a prefix to another codeword is called a prefix code. A simple way to check if a code is a prefix code is to draw the rooted binary tree corresponding to the code.
Prefix Codes
Draw a tree that starts from a single node (the root node) and has up to two branches at each node; one branch corresponds to a 0 and the other to a 1. The convention followed here is that the root node is at the top, the left branch corresponds to 0 and the right branch corresponds to 1. Using this convention, draw the binary trees for Codes 2, 3 and 4.
Prefix Codes
Note that apart from the root node, the trees have two kinds of nodes:
1. Internal nodes (nodes that give rise to other nodes), and
2. External nodes (nodes that do not give rise to other nodes), also called leaves.
In a prefix code, the codewords are associated only with the external nodes. A code that is not a prefix code, such as Code 4, will have codewords associated with internal nodes (each shorter codeword lies on the path from the root to a longer one).
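A sketch that builds the binary code tree (as a nested dictionary rather than a drawing) and checks whether every codeword ends on an external node, i.e. whether the code is a prefix code. The codeword assignments used below for Code 3 (0, 10, 110, 111) and Code 4 (0, 01, 011, 0111) are the usual textbook assignments assumed to match the descriptions above.

    # Prefix-code check: no codeword may be a prefix of another codeword,
    # i.e. in the binary code tree every codeword must end on a leaf.
    def is_prefix_code(codewords):
        tree = {}
        for cw in codewords:
            node = tree
            for bit in cw:
                if node.get("end"):              # passed through a shorter codeword
                    return False
                node = node.setdefault(bit, {})
            if node:                             # this codeword is an internal node
                return False
            node["end"] = True
        return True

    print(is_prefix_code(["0", "10", "110", "111"]))   # Code 3 -> True
    print(is_prefix_code(["0", "01", "011", "0111"]))  # Code 4 -> False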
Kolmogorov Complexity
The Kolmogorov complexity K(x) of a sequence x is the size of the smallest program needed to generate x.
The size includes all the inputs needed by the program.
If x was a sequence of all ones, a highly compressible sequence, the program would simply be a print statement in a loop.
At the other extreme, if x were a random sequence with no structure, then the only program that could generate it would contain the sequence itself, and the size of the program would be slightly larger than the sequence. Thus there is a clear correspondence between the size of the smallest program that can generate a sequence and the amount of compression that can be obtained. However, this lower bound cannot be computed in general, so it is not used in practice.
Huffman Coding
The Huffman coding algorithm was developed by David Huffman as part of a class assignment in an information theory course taught by Robert Fano at MIT. These codes are prefix codes and are optimum for a given model (set of probabilities).
The Huffman procedure is based on two observations regarding optimum prefix codes.
1. In an optimum code, symbols that occur more frequently (have a higher probability of occurrence) will have shorter codewords than symbols that occur less frequently. 2. In an optimum code, the two symbols that occur least frequently will have codewords of the same length.
Let us design a Huffman code for a source that puts out letters from an alphabet A = {a1, a2, a3, a4, a5} with P(a1) = P(a3) = 0.2, P(a2) = 0.4, and P(a4) = P(a5) = 0.1.
First find the first-order entropy: H = -Σ P(ai) log2 P(ai) = 2.122 bits/symbol.
Step 1: Sort the letters in descending order of probability.
Average length l = 0.4 x 1 + 0.2 x 2 + 0.2 x 3 + 0.1 x 4 + 0.1 x 4 = 2.2 bits/symbol.
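A minimal Huffman construction in Python for the five-letter source above. Ties between equal probabilities may be broken differently from the hand-worked design, so the individual codewords can differ, but the average length still comes out to 2.2 bits/symbol:

    # Huffman code construction: repeatedly merge the two least probable symbols.
    import heapq

    def huffman(probs):
        # heap items: (probability, tie-breaking counter, {symbol: partial codeword})
        heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
        heapq.heapify(heap)
        counter = len(heap)
        while len(heap) > 1:
            p1, _, c1 = heapq.heappop(heap)
            p2, _, c2 = heapq.heappop(heap)
            merged = {s: "0" + cw for s, cw in c1.items()}
            merged.update({s: "1" + cw for s, cw in c2.items()})
            heapq.heappush(heap, (p1 + p2, counter, merged))
            counter += 1
        return heap[0][2]

    probs = {"a1": 0.2, "a2": 0.4, "a3": 0.2, "a4": 0.1, "a5": 0.1}
    code = huffman(probs)
    avg = sum(probs[s] * len(cw) for s, cw in code.items())
    print(code)
    print(f"average length = {avg:.1f} bits/symbol")    # 2.2 bits/symbol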
Example 2
Suppose the source generates 10,000 symbols per second and the channel can carry 22,000 bits per second (matching the average rate of 2.2 bits/symbol). Two Huffman codes can have the same average length but different variances of codeword length. With the first code (codeword lengths 1, 2, 3, 4, 4), a long string of a4's and a5's would generate 40,000 bits per second, so the buffer would have to absorb 18,000 bits for every such second. However, the buffer has to be of finite size, and the greater the variance in the codeword lengths, the more difficult the buffer design becomes.
On the other hand, if we use the second code (codeword lengths 2, 2, 2, 3, 3), we would be generating 30,000 bits per second, and the buffer would have to store only 8,000 bits for every second. If instead we have a string of a2's rather than a4's and a5's, the first code would generate only 10,000 bits per second, a deficit of 12,000 bits per second relative to the channel.
The second code would lead to a deficit of only 2,000 bits per second. So which code do we select? The code with the smaller variance of codeword lengths (the second code) puts less strain on the buffer and is generally preferred.
Audio Compression
Monochrome Image
Each pixel takes a value in the range 0-255 (8 bits/pixel).
Monochrome Images
The original (uncompressed) test images are represented using 8 bits/pixel. Each image consists of 256 rows of 256 pixels, so the uncompressed representation uses 65,536 bytes.
Image Compression
From a visual inspection of the test images, we can clearly see that the pixels in an image are heavily correlated with their neighbours. We could capture this structure with the crude model Xn = Xn-1; the residual would then be the difference between neighbouring pixels (see the sketch below).
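A small sketch of this idea on a synthetic scan line (a stand-in for a real test image, which would be loaded instead): compare the first-order entropy of the pixel values with that of the neighbouring-pixel differences.

    # Crude image model Xn = Xn-1: encode pixel differences instead of pixels.
    from collections import Counter
    from math import log2

    def entropy(samples):
        counts = Counter(samples)
        n = len(samples)
        return -sum((c / n) * log2(c / n) for c in counts.values())

    # Synthetic, smoothly varying "scan line" standing in for real image data.
    row = [100 + i // 4 + i % 3 for i in range(256)]
    residual = [b - a for a, b in zip(row, row[1:])]

    print(f"pixels:    {entropy(row):.2f} bits/pixel")
    print(f"residuals: {entropy(residual):.2f} bits/pixel")   # noticeably smaller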
We encoded the earlier version of this chapter using Huffman codes that were created using the probabilities of occurrence obtained from the chapter. The file size dropped from about 70,000 bytes to about 43,000 bytes with Huffman coding.
Audio Compression