
Chapter 2: Entropy Coding
Coding and compression: II) Entropy coding

3) Entropy coding
Entropy coding is a type of lossless source coding, also called variable-length statistical coding. The name comes from the fact that the coding process exploits the statistical properties of the source: it assigns the shortest code words to the most frequent symbols.

Encoders’ evaluation criteria


When choosing or evaluating a source encoder, we consider various criteria that assess its
effectiveness in representing the information. Here, we cite some of the most important ones:
a) The code rate: also known as the average code length, it is the average number of bits per symbol (code word). It is defined as R = Σ_i p_i n_i, where n_i is the length of the code word associated with the symbol x_i and p_i is its probability.
b) Code efficiency: the efficiency of a code C for a source S of entropy H(S) is defined as η = H(S) / R.


c) Redundancy of a code: it is the difference between the code rate and the entropy of the source, ρ = R − H(X). It can also be expressed in relative form, ρ' = 1 − η, which represents the proportion of additional bits compared to an optimal code.
d) Variance of a code: it is calculated as σ² = Σ_i p_i (n_i − R)².
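As an illustration, the following Python sketch computes these four criteria for a small source; the probabilities and code-word lengths are assumed for the example and are not taken from the course.

# Assumed 4-symbol source and an assumed binary code of lengths 1, 2, 3, 3.
import math

p = [0.4, 0.3, 0.2, 0.1]   # symbol probabilities p_i (assumed)
n = [1, 2, 3, 3]           # code-word lengths n_i (assumed)

H = -sum(pi * math.log2(pi) for pi in p)               # source entropy H(X)
R = sum(pi * ni for pi, ni in zip(p, n))               # code rate (average length)
eta = H / R                                            # efficiency
rho = R - H                                            # redundancy
var = sum(pi * (ni - R) ** 2 for pi, ni in zip(p, n))  # variance of the code

print(f"H = {H:.3f} bits, R = {R:.2f} bits, efficiency = {eta:.3f}")
print(f"redundancy = {rho:.3f} bits, variance = {var:.2f}")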

Some types of encoders

1) Shannon-Fano encoder: follows an entropy coding process that satisfies the prefix condition. Compression/decompression is achieved according to a tree built as follows:


• Rank the symbols (the source states) by probability in descending order.
• Divide the ranked symbols into two subsets, ensuring that the difference in total probability between them is as small as possible.
• Assign "0" to the symbols of the first subset and "1" to those of the second subset.
• Repeat the division step on each subset, concatenating "0" or "1" to the symbols' codes, until each subset contains only one symbol (a Python sketch of this procedure follows below).
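A minimal Python sketch of this procedure, given as an illustration only (the example probabilities are assumed):

def shannon_fano(probs):
    """probs: dict {symbol: probability}. Returns dict {symbol: binary code string}."""
    # Rank the symbols by probability in descending order.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    codes = {s: "" for s, _ in ranked}

    def split(group):
        if len(group) <= 1:
            return
        # Find the split point that makes the two subsets' total probabilities as close as possible.
        total = sum(p for _, p in group)
        acc, best_i, best_diff = 0.0, 1, float("inf")
        for i in range(1, len(group)):
            acc += group[i - 1][1]
            diff = abs(total - 2 * acc)
            if diff < best_diff:
                best_diff, best_i = diff, i
        first, second = group[:best_i], group[best_i:]
        for s, _ in first:
            codes[s] += "0"   # first subset receives "0"
        for s, _ in second:
            codes[s] += "1"   # second subset receives "1"
        split(first)
        split(second)

    split(ranked)
    return codes

print(shannon_fano({"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}))
# {'a': '0', 'b': '10', 'c': '110', 'd': '111'}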

2) Huffman encoder:
As in the previous coding process, the probabilities of the symbols are sorted in descending order, then a tree structure is built. The principle consists in grouping the 2 symbols of lowest probability (weight) and creating a new node whose weight equals the sum of the grouped weights. This step is repeated until all symbols are grouped under a single root. At the end, each right branch receives "1" and each left branch receives "0"; the code of each symbol is then read from the root (top) down to its leaf (a sketch follows below).
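A minimal Python sketch of Huffman coding using a priority queue, given as an illustration only (the example probabilities are assumed):

import heapq

def huffman(probs):
    """probs: dict {symbol: probability}. Returns dict {symbol: binary code string}."""
    # Each heap entry is (weight, tie-breaker, subtree); a subtree is a symbol or a (left, right) pair.
    heap = [(p, i, s) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Group the two nodes of lowest weight under a new node of summed weight.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, counter, (left, right)))
        counter += 1

    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node
            walk(node[0], prefix + "0")      # left branch gets "0"
            walk(node[1], prefix + "1")      # right branch gets "1"
        else:                                # leaf: a source symbol
            codes[node] = prefix or "0"      # single-symbol source edge case
    walk(heap[0][2], "")
    return codes

print(huffman({"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}))
# e.g. {'a': '0', 'b': '10', 'd': '110', 'c': '111'}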


The problem with the Shannon-Fano and Huffman encoders lies in the allocation of an integer number of bits to each code word, which is not always optimal. For example, if the probability of a symbol is 0.9, the optimal number of bits to encode this character is about 0.15 bits (n_i = −log2(P_i) = −log2(0.9) ≈ 0.15). A Huffman encoder will assign either 1 or 2 bits to this symbol, which is much longer than the theoretical value. This is why other coding processes have been proposed:

3) Arithmetic coding
The arithmetic coding process does not replace each symbol with a specific code, as the Huffman encoder does; instead, it replaces a whole stream of symbols with a single floating-point number. The output of this coding process is a number in [0, 1[.


With this method, we work with partitions of an interval [a, b] (initially [0, 1[); the size of each sub-interval is proportional to the frequency of the symbol it corresponds to. Each symbol to be compressed reduces the current interval [a, b] to its corresponding sub-interval [a', b'], which is itself partitioned and then undergoes the same processing as [a, b]. We finally obtain a very small interval [A, B[ that contains the code value.
The bounds of each interval are computed as follows:
Coding:
    Low = 0.0; High = 1.0;
    While (C = next character)
    Begin
        Range = High - Low;
        High  = Low + Range * High_Range(C);
        Low   = Low + Range * Low_Range(C);
    End

Decoding:
    Number = input code
    Repeat
        Symbol = Find_symbol(the sub-interval that contains Number);
        Range  = High_Range(Symbol) - Low_Range(Symbol);
        Number = (Number - Low_Range(Symbol)) / Range;
    Until the whole message is decoded


Example: encode then decode the message "BILL GATES".

This message will be coded by a number that belongs to the interval [0.2572167752, 0.2572167756[.
See the following partitioning, encoding and decoding tables.

Character   Probability   Interval
Space       1/10          0.0 ≤ x < 0.1
A           1/10          0.1 ≤ x < 0.2
B           1/10          0.2 ≤ x < 0.3
E           1/10          0.3 ≤ x < 0.4
G           1/10          0.4 ≤ x < 0.5
I           1/10          0.5 ≤ x < 0.6
L           2/10          0.6 ≤ x < 0.8
S           1/10          0.8 ≤ x < 0.9
T           1/10          0.9 ≤ x < 1.0


Coding and decoding tables: (the step-by-step tables for this example are given on the original slides)
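To connect the pseudocode with this example, here is a minimal floating-point Python sketch, given as an illustration only; a practical coder uses integer arithmetic to avoid precision limits, and here the message length is passed to the decoder instead of an end-of-message symbol.

# Cumulative intervals from the "BILL GATES" partitioning table above.
intervals = {
    " ": (0.0, 0.1), "A": (0.1, 0.2), "B": (0.2, 0.3), "E": (0.3, 0.4),
    "G": (0.4, 0.5), "I": (0.5, 0.6), "L": (0.6, 0.8), "S": (0.8, 0.9),
    "T": (0.9, 1.0),
}

def encode(message):
    low, high = 0.0, 1.0
    for c in message:
        rng = high - low
        lo_c, hi_c = intervals[c]
        high = low + rng * hi_c   # shrink [low, high[ to the sub-interval of c
        low = low + rng * lo_c
    return (low + high) / 2       # any value inside the final interval is a valid code

def decode(number, length):
    out = []
    for _ in range(length):
        for sym, (lo_c, hi_c) in intervals.items():
            if lo_c <= number < hi_c:                     # which sub-interval contains the code?
                out.append(sym)
                number = (number - lo_c) / (hi_c - lo_c)  # rescale and continue
                break
    return "".join(out)

code = encode("BILL GATES")
print(code)                              # a value in [0.2572167752, 0.2572167756[
print(decode(code, len("BILL GATES")))   # BILL GATES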


Adaptive Codes
The compression methods seen so far use a statistical model to encode individual symbols. They perform the compression by encoding the symbols into bit strings that use fewer bits than the original symbols. The quality of the compression increases or decreases with the program's ability to build a good model. Moreover, the model must accurately predict the probabilities of the symbols, which is not always feasible.
Adaptive codes are more desirable for data streaming, as they adapt to localized changes in the symbol statistics. They start with a minimal dictionary of symbol codes, or with none at all, and update it for each new character.
1) Adaptive Huffman coding: see the additional document.


2) LZW

An improvement of the LZ78 code (itself a successor of LZ77), created in 1984 by Terry Welch. This algorithm assumes the existence of an initial dictionary comprising all the unit symbols of the message.
So, it starts with a dictionary of all single characters, each assigned its own index. It then updates the dictionary as it processes the text: each time a new string is encountered, a new code is generated. The compression algorithm follows the steps below (a Python sketch is given after the worked example):

1) Read the longest string of consecutive symbols "x" present in the dictionary.
2) Write the index of "x" to the output file.
3) Read the symbol "i" that follows "x".
4) Add "xi" to the dictionary.
5) Repeat these steps until "i" is empty (end of the message).


Example: encode the message un_ver_vert_va_vers_un_verre_vert
Initial dictionary:
Symbol   _   a   e   n   r   s   t   v   u
Index    0   1   2   3   4   5   6   7   8
Coding (each row gives the new dictionary entry, its index, and the emitted code):
New entry   Index   Emitted code
un          9       8
n_          10      3
_v          11      0
ve          12      7
er          13      2
r_          14      4
_ve         15      11
ert         16      13
t_          17      6
_va         18      11
a_          19      1
_ver        20      15
rs          21      4
s_          22      5
_u          23      0
un_         24      9
_verr       25      20
re          26      4
e_          27      2
_vert       28      20
(the final string "t" is emitted as code 6 and adds no new entry)

Final code: 8 3 0 7 2 4 11 13 6 11 1 15 4 5 0 9 20 4 2 20 6
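As an illustrative sketch (not the course's reference implementation), the following Python function applies the compression steps above and reproduces the final code of this example:

def lzw_compress(message, initial_dictionary):
    dictionary = dict(initial_dictionary)        # string -> index
    output = []
    x = ""
    for i in message:                            # "i" is the next symbol
        if x + i in dictionary:                  # keep extending the longest known string "x"
            x = x + i
        else:
            output.append(dictionary[x])         # write the index of "x"
            dictionary[x + i] = len(dictionary)  # add "xi" to the dictionary
            x = i                                # "i" starts the next string
    if x:
        output.append(dictionary[x])             # flush the last string
    return output

init = {"_": 0, "a": 1, "e": 2, "n": 3, "r": 4, "s": 5, "t": 6, "v": 7, "u": 8}
print(lzw_compress("un_ver_vert_va_vers_un_verre_vert", init))
# [8, 3, 0, 7, 2, 4, 11, 13, 6, 11, 1, 15, 4, 5, 0, 9, 20, 4, 2, 20, 6]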

In the decoding process, each received index is decoded using the dictionary, and the dictionary is updated with the string corresponding to the preceding code followed by the first symbol of the current string (a decoding sketch is given below).
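A matching decoding sketch, again an illustration only, which rebuilds the dictionary from "preceding string + first symbol of the current string":

def lzw_decompress(codes, initial_dictionary):
    dictionary = {i: s for s, i in initial_dictionary.items()}  # index -> string
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            current = dictionary[code]
        else:
            # Special case: the code refers to the entry currently being built.
            current = previous + previous[0]
        output.append(current)
        # New entry: preceding string + first symbol of the current string.
        dictionary[len(dictionary)] = previous + current[0]
        previous = current
    return "".join(output)

init = {"_": 0, "a": 1, "e": 2, "n": 3, "r": 4, "s": 5, "t": 6, "v": 7, "u": 8}
codes = [8, 3, 0, 7, 2, 4, 11, 13, 6, 11, 1, 15, 4, 5, 0, 9, 20, 4, 2, 20, 6]
print(lzw_decompress(codes, init))   # un_ver_vert_va_vers_un_verre_vert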



