Elective: Data Compression and Encryption V Extc ECCDLO 5014

This document provides an overview of data compression techniques. It begins by explaining why data compression is useful for reducing the size of files during transmission and storage. It then describes different types of compression including lossless compression for retaining all original data and lossy compression which can tolerate some loss of data. Key aspects covered include exploiting redundancies in data and properties of human perception. Specific examples of compressing text, images, audio and video are given to illustrate compression techniques.

Elective: Data Compression and Encryption
V EXTC
ECCDLO 5014

Anita Jadhav
Course Outcome
C308.1 (CO1): Analyze techniques of text compression in order to solve numericals related to statistical/dictionary-based text compression techniques.
C308.2 (CO2): Explain audio/image/video compression standards.
C308.3 (CO3): Describe goals of cryptography and standards in private-key cryptosystems.
C308.4 (CO4): Analyze number-theoretic techniques and solve numericals related to public-key cryptography.
C308.5 (CO5): Comprehend societal issues related to network security and describe their solutions.
What is data compression?
• Compression is an “art and science” of reducing the size of data without compromising its utility.
• Why ‘art’?
• It requires judgment to identify what can be retained and what can be thrown away.
• Why ‘science’?
• There are well-designed mathematical methods for deciding what to retain and what to throw away.
Why Compress?

• Downloading a digital color photograph over a 33.6 kbps modem:
• Uncompressed image (TIFF file) = 600 kbytes, 142 seconds
• Lossless compression (GIF file) = 300 kbytes, 71 seconds
• Lossy compression (JPEG file) = 50 kbytes, 12 seconds
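The download times above follow from dividing the file size in bits by the link rate. A minimal sketch of that arithmetic, assuming 1 kbyte = 1000 bytes and no protocol overhead (the function name `transfer_seconds` is only illustrative):

```python
def transfer_seconds(size_kbytes: float, link_kbps: float) -> float:
    """Seconds needed to send size_kbytes over a link of link_kbps.

    Assumes 1 kbyte = 1000 bytes and ignores protocol overhead.
    """
    bits = size_kbytes * 1000 * 8
    return bits / (link_kbps * 1000)


# File sizes taken from the slide; the modem rate is 33.6 kbps.
for name, size in [("TIFF", 600), ("GIF", 300), ("JPEG", 50)]:
    print(f"{name}: {transfer_seconds(size, 33.6):.1f} s")
```

The results come out at roughly 142.9 s, 71.4 s and 11.9 s, which match the slide's figures after truncation or rounding.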
Why Compress?
• To reduce the volume of data to be transmitted (text, fax, images)
• To reduce the bandwidth required for transmission and to reduce storage requirements (speech, audio, video)
Philosophy
• Compression is possible by exploiting redundancies in the data and properties of human perception.
• Digital audio is a series of sample values; an image is a rectangular array of pixel values; video is a sequence of images played out at a certain rate.
• Neighboring sample values are correlated.
Redundancy
• Adjacent audio samples are similar (predictive encoding); samples corresponding to silence can be dropped (silence removal)
• In a digital image, neighboring samples on a scanning line are normally similar (spatial redundancy)
• In digital video, in addition to spatial redundancy, neighboring images in a video sequence may be similar (temporal redundancy)
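One simple way to exploit the similarity of adjacent samples is to predict each sample as equal to the previous one and store only the difference. The sketch below is an illustration of that idea (often called delta encoding), not a method taken from the slides; the sample values are made up.

```python
def delta_encode(samples):
    """Keep the first sample, then store each sample as its difference
    from the previous one (the prediction is 'same as previous')."""
    if not samples:
        return []
    out = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        out.append(cur - prev)
    return out


def delta_decode(deltas):
    """Rebuild the samples by accumulating the differences."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


samples = [100, 102, 103, 103, 101, 100]   # hypothetical audio samples
deltas = delta_encode(samples)             # [100, 2, 1, 0, -2, -1]
assert delta_decode(deltas) == samples
```

When neighboring values are correlated, the differences are small numbers clustered around zero, which a later coding stage can represent with fewer bits than the raw samples.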
Redundancies contd..
• Text files
Frequently used characters or groups of characters

• Image files
Adjacent pixels in an image (spatial redundancy)

• Audio Files
Silence (silence removal)
Neighboring samples (predictive encoding)

• Video Files
Similar neighboring images (temporal redundancy)
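Silence removal for audio can be sketched as replacing each long run of near-zero samples with a single marker that records the run length. The threshold, the minimum run length and the `("silence", n)` marker below are assumptions chosen for illustration, not values from the slides.

```python
def remove_silence(samples, threshold=2, min_run=3):
    """Replace runs of at least min_run quiet samples (|s| <= threshold)
    with a ("silence", run_length) marker; keep everything else."""
    out = []
    i = 0
    n = len(samples)
    while i < n:
        j = i
        while j < n and abs(samples[j]) <= threshold:
            j += 1
        run = j - i
        if run >= min_run:
            out.append(("silence", run))   # long quiet run: store its length only
            i = j
        elif run > 0:
            out.extend(samples[i:j])       # short quiet run: keep the samples
            i = j
        else:
            out.append(samples[i])         # loud sample: keep as is
            i += 1
    return out


samples = [40, -35, 1, 0, -1, 0, 2, 50, 1, 60]   # hypothetical audio samples
print(remove_silence(samples))
```

With a nonzero threshold the exact quiet values are lost, so this variant is lossy; with `threshold=0` only exact zeros are collapsed and the original can be rebuilt exactly.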
Human Perception factors
• The compressed version of digital audio, image, or video need not represent the original information exactly, and in practice we are fine with that.
• Perceptual sensitivity differs for different signal patterns, e.g. we do not hear all frequencies in the 20 Hz to 20 kHz range equally well (exploited by MP3).
• The human eye is less sensitive to higher spatial frequency components than to lower ones (exploited by transform coding).
• https://www.youtube.com/watch?time_continue=110&v=bh_9XFzbWV8
Model for compression

Courtesy: www.scienceblogs.com
Classification
• Lossless compression
– used for legal and medical documents, computer programs
– exploits only data redundancy
• Lossy compression
– used for digital audio, image, video, where some errors or loss can be tolerated
– exploits both data redundancy and human perception properties
• Constant bit rate versus variable bit rate coding
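The difference between the two classes can be seen in a small experiment. The sketch below uses Python's standard `zlib` module as one example of a lossless coder, and plain quantization with an arbitrary step size as a stand-in for a lossy step; both choices are illustrative assumptions, not techniques named on the slide.

```python
import zlib

# Lossless: the decompressed bytes are identical to the original.
text = b"compression " * 50
packed = zlib.compress(text)
assert zlib.decompress(packed) == text
print(len(text), "bytes ->", len(packed), "bytes")

# Lossy: quantization discards detail that cannot be recovered.
samples = [12, 13, 15, 200, 201, 203]      # hypothetical pixel values
step = 4                                   # assumed quantization step
quantized = [s // step for s in samples]   # fewer distinct values to encode
restored = [q * step for q in quantized]   # [12, 12, 12, 200, 200, 200]
print(restored)
```

The restored values are close to the originals but not equal to them, which is acceptable for images or audio and unacceptable for a program or a legal document.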
Classification
• Logical vs. Physical Compression
Physical compression acts directly on the data: it re-encodes redundant data from one bit pattern into another.
Logical compression, on the other hand, is carried out by logical reasoning, substituting the information with equivalent information.
E.g. voice box parameters

• Symmetric vs. Asymmetric

In symmetric compression, the same method is used to compress and to decompress the data, so each operation needs the same amount of work. In asymmetric compression, one of the two operations (usually compression) needs more work than the other.
Classification
• Certain compression algorithms are based on dictionaries built for a specific type of data: these are non-adaptive encoders. The frequency of letters in a text file, for example, depends on the language in which it is written.
• An adaptive encoder adapts to the data it has to compress; it does not start out with an already prepared dictionary for a given type of data.
• A semi-adaptive encoder builds a dictionary according to the data to be compressed: it builds the dictionary by going through the file once and then compresses the file.
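The two passes of a semi-adaptive encoder can be sketched as follows. This is a toy word-level illustration under stated assumptions: the function names, the dictionary size and the use of small integers as codes are all hypothetical, and a real encoder would also have to store the dictionary alongside the output.

```python
from collections import Counter


def build_dictionary(words, size=2):
    """First pass: give the `size` most frequent words a short integer code."""
    common = Counter(words).most_common(size)
    return {word: code for code, (word, _) in enumerate(common, start=1)}


def encode(words, dictionary):
    """Second pass: replace dictionary words by their code, keep the rest."""
    return [dictionary.get(w, w) for w in words]


def decode(tokens, dictionary):
    """Map integer codes back to words; other tokens pass through."""
    reverse = {code: word for word, code in dictionary.items()}
    return [reverse.get(t, t) for t in tokens]


words = "to be or not to be".split()       # hypothetical input
dictionary = build_dictionary(words)
tokens = encode(words, dictionary)
assert decode(tokens, dictionary) == words
```

A non-adaptive encoder would skip the first pass and use a dictionary fixed in advance, while an adaptive one would grow the dictionary during a single pass over the data.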
Data Compression
• Let’s look at an example.
• Let’s imagine we had to send the following
message:

The rain in Spain lies mainly in the plain

Data Compression
• If we had to send this as it is down a wire:

The rain in Spain lies mainly in the plain

Data Compression

• That is a total of 42 characters (including 8 spaces).

The rain in Spain lies mainly in the plain

Data Compression

• Let's replace the word “the” with the number 1.

The rain in Spain lies mainly in the plain

the =1
Data Compression

• Let's replace the word “the” with the number 1.

• We've reduced the number of characters to 38.

1 rain in Spain lies mainly in 1 plain

the = 1
Data Compression

• Let's replace the letters “ain” with the number 2.

• We've reduced the number of characters to 30.

the = 1
ain = 2
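The substitution steps in this example can be reproduced with a few lines of code. This is a minimal sketch: the helper names `substitute` and `expand` are illustrative, the match on “the” ignores case (so the capital in “The” is not restored on expansion), and expansion only works because the message itself contains no digits.

```python
import re

message = "The rain in Spain lies mainly in the plain"

# Dictionary from the slides: code -> text it stands for.
dictionary = {"1": "the", "2": "ain"}


def substitute(text, dictionary):
    """Replace every occurrence of each phrase by its code (case-insensitive)."""
    for code, phrase in dictionary.items():
        text = re.sub(re.escape(phrase), code, text, flags=re.IGNORECASE)
    return text


def expand(text, dictionary):
    """Replace each code by its phrase; assumes the message has no digits."""
    for code, phrase in dictionary.items():
        text = text.replace(code, phrase)
    return text


step1 = substitute(message, {"1": "the"})
step2 = substitute(message, dictionary)

print(len(message), message)   # 42 characters
print(len(step1), step1)       # 38 characters
print(len(step2), step2)       # 30 characters
```

The character counts leave out the cost of sending the dictionary itself, which the receiver needs in order to expand the message.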