Multimedia data compression reduces the size of multimedia files to save storage space and transmission bandwidth. It does this by removing statistical and perceptual redundancies. There are two main types of compression: lossless, which allows exact reconstruction of the original data, and lossy, which sacrifices some quality for higher compression ratios. Common lossless techniques include run-length encoding, while lossy methods are used for audio and images. Compression algorithms involve trade-offs between compression ratio, quality loss, and computational requirements.


Multimedia Information Systems (MMIS)

Daniel T. (@DanTeklu)
INSY4111 [email protected]

Chapter 3

Multimedia Data Compression

INSY4111
The Need for Compression

⚫ Take, for example, a video signal with resolution 320x240 pixels, 256 colors (8 bits per pixel), and 30 frames per second.
– Raw bit rate = 320 x 240 x 8 x 30
= 18,432,000 bits per second
= 2,304,000 bytes ≈ 2.3 MB per second
– A 90-minute movie would take 2.3 x 60 x 90 MB ≈ 12.44 GB
3 MMIS (Comp. by Daniel T.)
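The arithmetic above can be checked with a short calculation (a sketch; the variable names are illustrative):

```python
# Raw (uncompressed) video data rate for 320x240 pixels, 8 bits/pixel, 30 fps.
width, height = 320, 240
bits_per_pixel = 8          # 256 colors
frames_per_second = 30

raw_bits_per_sec = width * height * bits_per_pixel * frames_per_second
raw_bytes_per_sec = raw_bits_per_sec // 8

# A 90-minute movie at this raw rate:
movie_bytes = raw_bytes_per_sec * 90 * 60

print(raw_bits_per_sec)        # 18432000 bits per second
print(raw_bytes_per_sec)       # 2304000 bytes (~2.3 MB) per second
print(movie_bytes / 10**9)     # ~12.44 (GB)
```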
Multimedia Data Compression

⚫ Data compression is about finding ways to reduce the number of bits or bytes used to store or transmit the content of multimedia data.
– It is the process of encoding information using fewer bits.
– For example, the ZIP file format, which provides compression, also acts as an archiver, storing many source files in a single destination output file.



Is compression useful?

⚫ It helps reduce the consumption of expensive resources, such as hard disk space or transmission bandwidth.
– Saves storage space: handy for storing files, as they take up less room.
– Speeds up document transmission: convenient for transferring files across the Internet, as smaller files transfer faster.


Trade-offs in Data Compression

⚫ The degree of compression
– To what extent should the data be compressed?
⚫ The amount of distortion introduced
– To what extent is quality loss tolerated?
⚫ The computational resources required to compress and decompress the data
– Do we have enough memory for compressing and decompressing the data?


Types of Compression

⚫ Lossless: M → compress → m → uncompress → M (the original data is recovered exactly)
⚫ Lossy: M → compress → m → uncompress → M' (an approximation M' of M is recovered)

M = multimedia data transmitted
Types of Compression

⚫ Lossless Compression
– Lossless compression recovers the exact original data after decompression.
– It is used mainly for compressing database records, spreadsheets, texts, executable programs, etc., where exact replication of the original data is essential and changing even a single bit cannot be tolerated.
– Examples: Run-Length Encoding, Lempel-Ziv (LZ), Huffman Coding.


Cont'd…

⚫ Lossless compression does not lose any data in the compression process.
– Lossless compression is possible because most real-world data has statistical redundancy.
– It packs data into a smaller file size by using a kind of internal shorthand to signify redundant data.
– This technique can typically reduce a file to about half of its original size.
– WinZip uses lossless compression. For this reason, zip software is popular for compressing program and data files.
Cont'd…

⚫ Lossless compression has advantages and disadvantages.
– The advantage: the compressed file will decompress to an exact duplicate of the original file, mirroring its quality.
– The disadvantage: the compression ratio is not all that high, precisely because no data is lost.


Cont'd…

⚫ Lossy Compression
– Results in a certain loss of accuracy in exchange for a substantial increase in compression.
– For visual and audio data, some loss of quality can be tolerated without losing the essential nature of the data, as long as the losses fall outside visual or aural perception.
⚫ By taking advantage of the limitations of the human sensory system, a great deal of space can be saved while producing an output which is nearly indistinguishable from the original.
⚫ In audio compression, for instance, non-audible (or less audible) components of the signal are removed.


Cont'd…

⚫ Lossy compression is used for:
– Image compression in digital cameras, to increase storage capacity with minimal degradation of picture quality.
– Audio compression for Internet telephony and CD ripping, decoded by audio players.
– Video compression in DVDs with the MPEG format.
⚫ To get a higher compression ratio (i.e., to reduce a file significantly beyond 50%), you must use lossy compression.


Cont'd…

⚫ Lossy Compression
– A sound file in WAV format, converted to an MP3 file, will lose much data.
– MP3 employs lossy compression, resulting in a much smaller file, so that several dozen MP3 files can fit on a single storage device, vs. a handful of WAV files.
– However, the sound quality of the MP3 file will be slightly lower than the original WAV.


Cont'd…

⚫ An example of lossless vs. lossy compression is the following string:
– 25.888888888
⚫ This string can be compressed losslessly as: 25.[9]8
⚫ Interpreted as "twenty-five point, nine eights", the original string is perfectly recreated, just written in a smaller form.
– In a lossy system it can be compressed as: 26
⚫ In which case the original data is lost, at the benefit of a smaller file size.
– The two simplest compression techniques are zero-length suppression and run-length encoding.


RLE compression technique

⚫ In run-length encoding, long runs of consecutive identical data values are replaced by a simple code giving the data value and the length of the run, i.e.
(dataValue, lengthOfTheRun)
⚫ This encoding scheme tallies each data value Xi along with its run length, i.e. (Xi, Length_of_Xi)


RLE

⚫ It compresses data by storing runs of data (that is, sequences in which the same data value occurs in many consecutive data elements) as a single data value and count.
⚫ This method is useful on data that contains many such runs. It is not recommended for files that don't have many runs, as it could potentially double the file size.
⚫ Run-length encoding performs lossless data compression.
RLE

⚫ Example:
WWWWWWWWWWBWWWWWWWWWBBBWWWWWWWWWWWW
⚫ If we apply the run-length encoding (RLE) data compression algorithm, the compressed code is:
– 10W1B9W3B12W (interpreted as ten W's, one B, nine W's, three B's, twelve W's)
⚫ RLE is used in fax machines (combined with Modified Huffman coding).
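The scheme above can be sketched in a few lines (an illustrative implementation; the function names are my own, not from the slides):

```python
def rle_encode(data: str) -> str:
    """Replace each run of identical characters with: run length, then the value."""
    out = []
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1                       # scan to the end of the current run
        out.append(f"{j - i}{data[i]}")  # e.g. 10 consecutive W's -> "10W"
        i = j
    return "".join(out)

def rle_decode(code: str) -> str:
    """Invert rle_encode: read a decimal count, then the repeated character."""
    out, count = [], ""
    for ch in code:
        if ch.isdigit():
            count += ch
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)

message = "W" * 10 + "B" + "W" * 9 + "B" * 3 + "W" * 12
print(rle_encode(message))   # 10W1B9W3B12W
```

Note that this textual form only works when the data values themselves are not digits; binary RLE formats store (value, count) pairs instead.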


How do you change "MMIS" into bits?

ASCII Code

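As a concrete illustration (a sketch; `ord` returns the ASCII code point of each character):

```python
# Encode each character of "MMIS" as its 8-bit ASCII code.
for ch in "MMIS":
    print(ch, ord(ch), format(ord(ch), "08b"))
# M 77 01001101
# M 77 01001101
# I 73 01001001
# S 83 01010011
```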
How do you write "Æ ¥" on your keyboard?

Extended ASCII Code


👉 Long Story Short

⚫ Differences
– Lossless compression schemes are reversible, so that the original data can be reconstructed.
– Lossy schemes accept some loss of data in order to achieve higher compression.
⚫ Lossy data compression methods typically offer a three-way trade-off between:
– Computer resource requirements (compression speed, memory consumption)
– Compressed data size, and
– Quality loss.
Common compression methods

⚫ Statistical methods:
– Require prior information about the occurrence of symbols.
– Estimate probabilities of symbols,
⚫ code one symbol at a time,
⚫ assigning shorter codes to symbols with higher probabilities.
– Require two passes:
⚫ One pass to compute probabilities (or frequencies) and determine the mapping,
⚫ A second pass to encode.
– E.g. Huffman coding and Shannon-Fano coding


Common compression methods

⚫ Dictionary-based coding
– Does not require prior information to compress strings; rather, it replaces symbols with a pointer to dictionary entries.
– Doesn't require a first pass over the data to calculate a probability model.
– All of the adaptive methods are one-pass methods; only one scan of the message is required.
– Examples: Lempel-Ziv (LZ) and Adaptive Huffman Coding compression techniques


Compression Model

⚫ Almost all data compression methods involve the use of a model, a prediction of the composition of the data.
– When the data matches the prediction made by the model, the encoder can usually transmit the content of the data at a lower information cost, by making reference to the model.
– In most methods the model is separate, and because both the encoder and the decoder need to use the model, it must be transmitted with the data.


Compression Model

⚫ In dictionary coding, the encoder and decoder are instead equipped with identical rules about how they will alter their models in response to the actual content of the data.
– Both start with a blank slate, meaning that no initial model needs to be transmitted.
⚫ As the data is transmitted, both encoder and decoder adapt their models, so that unless the character of the data changes radically, the model becomes better adapted to the data it's handling and compresses it more efficiently.
Huffman coding

⚫ Developed in the 1950s by David Huffman; widely used for text compression, multimedia codecs, and message transmission.
⚫ Given a set of n symbols and their weights (or frequencies), construct a tree structure (a binary tree for a binary code) with the objective of reducing memory space and decoding time per symbol.
⚫ For instance, Huffman coding is constructed based on the frequency of occurrence of letters in text documents.
Huffman coding

• Fixed-length vs. variable-length codes
• Example of a variable-length code tree (0 = left branch, 1 = right branch):
– D1 = 000, D2 = 001, D3 = 01, D4 = 1
– From the root, branch 1 leads directly to the leaf D4; branch 0 leads to a subtree where branch 1 gives D3 and branch 0 leads to the pair D1 (0) and D2 (1).
Huffman coding

• The model could determine the raw probability of each symbol occurring anywhere in the input stream:

p_i = (# of occurrences of S_i) / (total # of symbols)

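This probability model can be computed directly (a sketch using the standard library; the example string is mine, not from the slides):

```python
from collections import Counter

def symbol_probabilities(stream: str) -> dict:
    """p_i = (# of occurrences of S_i) / (total # of symbols)."""
    counts = Counter(stream)
    total = len(stream)
    return {sym: n / total for sym, n in counts.items()}

probs = symbol_probabilities("MISSISSIPPI")
print(probs)   # e.g. 'S' -> 4/11, 'M' -> 1/11
```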


Huffman coding

⚫ The output of the Huffman encoder is determined by the model (probabilities).
– The higher the probability of occurrence of a symbol, the shorter the code assigned to that symbol, and vice versa.
– This makes the most frequently occurring symbols in the data cheap to represent and also reduces the time taken to decode each symbol.


How to construct Huffman coding

⚫ Step 1: Create a forest of trees, one for each symbol: t1, t2, … tn
⚫ Step 2: Sort the forest of trees according to increasing probabilities of symbol occurrence
⚫ Step 3: WHILE more than one tree exists DO
– Merge the two trees t1 and t2 with the least probabilities p1 and p2
– Label their root with the sum p1 + p2
– Associate binary codes: 1 with the right branch and 0 with the left branch


How to construct Huffman coding

⚫ Step 4: Create a unique codeword for each symbol by traversing the tree from the root to the leaf.
– Concatenate all 0s and 1s encountered during the traversal.
⚫ The resulting tree has a probability of 1 in its root and the symbols in its leaf nodes.


Example

⚫ Consider the following string and construct the Huffman coding:
VWVWWXYZYZXYZYZYXZYW

Symbol:     V   W   X   Y   Z
Frequency:  2   4   3   6   5

• The Huffman encoding algorithm picks, at each step, the two symbols (or subtrees) with the smallest frequencies to combine.
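The construction steps above can be sketched with Python's `heapq` (an illustrative implementation; tie-breaking can yield different but equally optimal trees, so individual codes may differ from a hand-drawn solution while the total encoded length stays the same):

```python
import heapq

def huffman_codes(freqs: dict) -> dict:
    """Build a Huffman code from {symbol: frequency}; returns {symbol: bitstring}."""
    # The heap holds (weight, tiebreak, {symbol: partial_code}) trees.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)      # the two least-probable trees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}        # 0 = left branch
        merged.update({s: "1" + c for s, c in right.items()})  # 1 = right branch
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

freqs = {"V": 2, "W": 4, "X": 3, "Y": 6, "Z": 5}   # from VWVWWXYZYZXYZYZYXZYW
codes = huffman_codes(freqs)
total_bits = sum(freqs[s] * len(codes[s]) for s in freqs)
print(codes, total_bits)   # total is 45 bits for the 20-symbol string
```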
Word level Exercise

⚫ Given the texts:
– ABRACADABRA
– MISSISSIPPI
– Construct the variable-length Huffman coding for each.


The Shannon-Fano Encoding Algorithm

1. Calculate the frequency of each of the symbols in the list.
2. Sort the list in decreasing order of frequency.
3. Divide the list into two halves, with the total frequency counts of each half being as close as possible to each other.
4. The first (upper) half is assigned the code 0 and the second (lower) half the code 1.


The Shannon-Fano Encoding Algorithm

5. Recursively apply steps 3 and 4 to each of the halves, until each symbol has become a corresponding code leaf on the tree.
✓ That is, treat each split as a list and apply splitting and code assignment until you are left with lists of single elements.
6. Generate the codeword for each symbol.
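Steps 1–6 can be sketched recursively (an illustrative implementation; it reproduces the five-symbol A–E example that follows):

```python
def shannon_fano(items, prefix=""):
    """items: [(symbol, freq), ...] sorted by decreasing frequency.
    Returns {symbol: bitstring code}."""
    if len(items) == 1:
        return {items[0][0]: prefix or "0"}
    total = sum(f for _, f in items)
    # Find the split point where the two halves' frequency totals are closest.
    best_i, best_diff, acc = 1, float("inf"), 0
    for i in range(1, len(items)):
        acc += items[i - 1][1]
        diff = abs(2 * acc - total)
        if diff < best_diff:
            best_i, best_diff = i, diff
    codes = shannon_fano(items[:best_i], prefix + "0")        # upper half gets 0
    codes.update(shannon_fano(items[best_i:], prefix + "1"))  # lower half gets 1
    return codes

symbols = [("A", 15), ("B", 7), ("C", 6), ("D", 6), ("E", 5)]
codes = shannon_fano(symbols)
print(codes)   # {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
print(sum(f * len(codes[s]) for s, f in symbols))   # 89 bits
```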


Examples

⚫ Example: Given five symbols A to E with frequencies 15, 7, 6, 6 & 5, encode them using Shannon-Fano encoding.

⚫ Solution:

Symbol    A    B    C    D    E
Count     15   7    6    6    5
1st bit   0    0    1    1    1
2nd bit   0    1    0    1    1
3rd bit                  0    1


Cont'd…

Symbol   Count   Code   Number of Bits
A        15      00     30
B        7       01     14
C        6       10     12
D        6       110    18
E        5       111    15
Total:                  89


Exercise

⚫ Given the following symbols and their corresponding frequencies of occurrence, find an optimal binary code for compression:

Character:  a    b    c    d    e    t
Frequency:  16   5    12   17   10   25

A. Using the Shannon-Fano coding scheme
B. Using the Huffman algorithm/coding
Lempel-Ziv (LZ) compression

⚫ It does not rely on previous knowledge about the data.
⚫ Rather, it builds this knowledge in the course of data transmission/data storage.
⚫ The Lempel-Ziv algorithm uses a table of codewords created during data transmission;
– Each time, it replaces strings of characters with a reference to a previous occurrence of the string.
⚫ The multi-symbol patterns are of the form C0C1 . . . Cn-1Cn. The prefix of a pattern consists of all the pattern symbols except the last: C0C1 . . . Cn-1
Lempel-Ziv (LZ) compression

⚫ Output: there are three options in assigning a code to each pattern in the list
– If a one-symbol pattern is not in the dictionary, assign (0, symbol)
– If a multi-symbol pattern is not in the dictionary, assign (dictionaryPrefixIndex, lastPatternSymbol)
– If the last pattern of the input is already in the dictionary, assign (dictionaryPrefixIndex, )


Lempel-Ziv (LZ) compression

⚫ Encode (i.e., compress) the string ABBCBCABABCAABCAAB using the LZ algorithm.

⚫ The compressed message is:
(0,A)(0,B)(2,C)(3,A)(2,A)(4,A)(6,B)
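The encoding above can be reproduced with a short LZ78-style sketch (an illustrative implementation; the empty-symbol case at the end of the input follows the third output rule on the previous slide):

```python
def lz_encode(data: str):
    """LZ78-style encoding: emit (prefix_index, symbol) pairs, growing a dictionary."""
    dictionary = {}   # pattern -> index (1-based); index 0 means "empty prefix"
    codewords = []
    pattern = ""
    for ch in data:
        if pattern + ch in dictionary:
            pattern += ch                    # keep extending the current match
        else:
            codewords.append((dictionary.get(pattern, 0), ch))
            dictionary[pattern + ch] = len(dictionary) + 1
            pattern = ""
    if pattern:                              # input ended on a known pattern
        codewords.append((dictionary[pattern], ""))
    return codewords

print(lz_encode("ABBCBCABABCAABCAAB"))
# [(0, 'A'), (0, 'B'), (2, 'C'), (3, 'A'), (2, 'A'), (4, 'A'), (6, 'B')]
```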
Example: Compute Number of bits
transmitted
⚫ Consider the string ABBCBCABABCAABCAAB from the previous slide and compute the number of bits transmitted:
– Uncompressed: number of bits = total no. of characters x 8 = 18 x 8 = 144 bits
⚫ The compressed string consists of codewords and the corresponding codeword indices, as shown below:
– Codewords: (0, A) (0, B) (2, C) (3, A) (2, A) (4, A) (6, B)
– Codeword index: 1 2 3 4 5 6 7
– The actual compressed message is: 0A 0B 10C 11A 10A 100A 110B, where each character is replaced by its 8-bit binary ASCII code. (See the next slide.)
Example: Compute Number of bits
transmitted

⚫ Each codeword consists of a character and an integer:
– The character is represented by 8 bits
– The number of bits n required to represent the integer part p of a codeword is n = ⌊log₂ p⌋ + 1 for p ≥ 1, and 1 bit for p = 0 (just enough bits to write p in binary)

CW:    (0, A)  (0, B)  (2, C)  (3, A)  (2, A)  (4, A)  (6, B)
Index:   1       2       3       4       5       6       7
Bits:  (1 + 8) + (1 + 8) + (2 + 8) + (2 + 8) + (2 + 8) + (3 + 8) + (3 + 8) = 70 bits
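The 70-bit total can be checked programmatically. The sketch below assumes the bit-count convention that is consistent with the numbers above: 8 bits per character plus just enough bits to write the prefix index p in binary (at least 1), matching the 0/10/11/100/110 index encodings on the previous slide:

```python
codewords = [(0, "A"), (0, "B"), (2, "C"), (3, "A"), (2, "A"), (4, "A"), (6, "B")]

def index_bits(p: int) -> int:
    """Bits needed to write the prefix index p in binary (at least 1)."""
    return max(1, p.bit_length())

total = sum(index_bits(p) + 8 for p, _ in codewords)  # 8 bits per ASCII character
print(total)   # 70 bits, vs. 144 bits uncompressed
```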


Example: Decompression

⚫ Decode (i.e., decompress) the sequence (0, A) (0, B) (2, C) (3, A) (2, A) (4, A) (6, B)

⚫ The decompressed message is:
ABBCBCABABCAABCAAB
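Decompression mirrors encoding: the decoder rebuilds the same dictionary from the codewords themselves (an illustrative sketch using the (prefix_index, symbol) convention from the slides):

```python
def lz_decode(codewords) -> str:
    """Rebuild the text from (prefix_index, symbol) pairs; index 0 = empty prefix."""
    dictionary = {0: ""}
    pieces = []
    for prefix_index, symbol in codewords:
        pattern = dictionary[prefix_index] + symbol
        dictionary[len(dictionary)] = pattern   # store under the next free index
        pieces.append(pattern)
    return "".join(pieces)

msg = lz_decode([(0, "A"), (0, "B"), (2, "C"), (3, "A"),
                 (2, "A"), (4, "A"), (6, "B")])
print(msg)   # ABBCBCABABCAABCAAB
```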
Exercise

⚫ Encode (i.e., compress) the following strings using the Lempel-Ziv algorithm.
– Aaababbbaaabaaaaaaabaabb
– SATATASACITASA.

