Arithmetic Coding

Arithmetic coding is one of the most efficient methods of coding symbols according to the
probability of their occurrence. Its average code length approaches the theoretical minimum
given by information theory; the overhead caused by the bit-resolution of binary code trees
does not arise.

Compared with a binary Huffman code tree, arithmetic coding offers a clearly better
compression rate; on the other hand, its implementation is more complex.

In arithmetic coding, a message is encoded as a real number in the interval [0, 1).
Arithmetic coding typically achieves a better compression ratio than Huffman coding because it
produces a single codeword for the whole message rather than several separate codewords.

Arithmetic coding differs from other forms of entropy encoding such as Huffman coding in
that, rather than separating the input into component symbols and replacing each with a code,
it encodes the entire message into a single number, a fraction n where 0.0 ≤ n < 1.0.

Arithmetic coding is a lossless coding technique, but it has a few disadvantages. One is that
the whole codeword must be received before decoding of the symbols can start, and a single
corrupt bit in the codeword can corrupt the entire message. Another is that the limited
precision of the number being encoded restricts the number of symbols that can be encoded
within one codeword. There have also been many patents on arithmetic coding, so the use of
some of the algorithms may require royalty fees.

Arithmetic coding is part of the JPEG data format, where it can be used as an alternative to
Huffman coding for the final entropy-coding stage. Despite its lower efficiency, Huffman
coding remains the standard because of the legal restrictions mentioned above.

Arithmetic Coding Algorithm:
Unlike the Huffman algorithm, which builds its code tree from the leaves up to the root, the
arithmetic coding algorithm works directly on a shrinking interval (a short Python sketch of
these steps follows the list).
1. Start with an interval [0, 1), divided into subintervals of all possible symbols to appear
within a message. Make the size of each subinterval proportional to the frequency at
which it appears in the message.
2. When encoding a symbol, "zoom" into the current interval, and divide it into
subintervals like in step one with the new range.
3. Repeat the process until the maximum precision of the machine is reached, or all
symbols are encoded.
4. Transmit some number within the latest interval to send the codeword. The number
of symbols encoded will be stated in the protocol of the image format.
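
To make these steps concrete, here is a minimal Python sketch of the encoder. The function
names and data layout are our own choices for illustration, not part of any standard:

```python
from typing import Dict, List, Tuple

def build_intervals(probs: Dict[str, float]) -> Dict[str, Tuple[float, float]]:
    """Step 1: divide [0, 1) into one subinterval per symbol,
    each sized in proportion to the symbol's probability."""
    intervals, cumulative = {}, 0.0
    for symbol, p in probs.items():
        intervals[symbol] = (cumulative, cumulative + p)
        cumulative += p
    return intervals

def arithmetic_encode(message: List[str],
                      probs: Dict[str, float]) -> Tuple[float, float]:
    """Steps 2-3: zoom into the subinterval of each symbol in turn.
    Any number in the returned interval encodes the message (step 4)."""
    intervals = build_intervals(probs)
    low, high = 0.0, 1.0
    for symbol in message:
        width = high - low
        s_low, s_high = intervals[symbol]
        high = low + width * s_high   # shrink the current interval
        low = low + width * s_low     # to the symbol's subinterval
    return low, high

probs = {"A0": 0.4, "A1": 0.3, "A2": 0.2, "A3": 0.1}
print(arithmetic_encode(["A0", "A0", "A3", "A1", "A2"], probs))
# -> approximately (0.15376, 0.15472), up to floating-point rounding;
#    this matches the worked example below
```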
Example 1:
The source of information A generates the symbols {A0, A1, A2 and A3} with the
corresponding probabilities {0.4, 0.3, 0.2 and 0.1}. Encoding the source symbols using a
Huffman encoder gives:
Source Symbol   Pi          Binary Code   Huffman
A0              0.4         00            0
A1              0.3         01            10
A2              0.2         10            110
A3              0.1         11            111
Lavg            H = 1.846   2             1.9

The entropy of the source is

H = - Σ Pi log2(Pi)
  = -(0.4 log2 0.4 + 0.3 log2 0.3 + 0.2 log2 0.2 + 0.1 log2 0.1)
  = 1.846 bits/symbol

Since we have 4 symbols (4 = 2^2), we need at least 2 bits to represent each symbol in binary
(fixed-length code). Hence the average length of the binary code is

Lavg = 2 bits/symbol

Thus the efficiency of the binary code is

η = H / Lavg = 1.846 / 2 = 92.3%

The average length of the Huffman code is

Lavg = Σ Pi Li = 0.4(1) + 0.3(2) + 0.2(3) + 0.1(3) = 1.9 bits/symbol

Thus the efficiency of the Huffman code is

η = H / Lavg = 1.846 / 1.9 = 97.2%

The Huffman encoder has the efficiency closest to the entropy that can be obtained using a
prefix code. Even higher efficiency can be achieved with arithmetic coding.
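
These figures can be verified with a few lines of Python, assuming the probabilities and code
lengths from the table above:

```python
import math

probs = [0.4, 0.3, 0.2, 0.1]

# Entropy: the theoretical minimum average code length per symbol.
H = -sum(p * math.log2(p) for p in probs)

L_binary = 2.0                                               # fixed-length code
L_huffman = sum(p * l for p, l in zip(probs, [1, 2, 3, 3]))  # Huffman code lengths

print(f"H = {H:.3f} bits/symbol")                  # H = 1.846 bits/symbol
print(f"binary efficiency  = {H / L_binary:.1%}")  # 92.3%
print(f"Huffman efficiency = {H / L_huffman:.1%}") # 97.2%
```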

Dividing into Intervals
Based on a known alphabet, the probability of every symbol has to be determined and
converted into an interval. The size of an interval depends linearly on the symbol's
probability: if this is 50%, for example, the associated subinterval covers half of the
current interval. Usually the initial interval for encoding the first symbol is [0; 1).
Source Symbol   Pi    Sub-interval
A0 0.4 [0.0;0.4)
A1 0.3 [0.4;0.7)
A2 0.2 [0.7;0.9)
A3 0.1 [0.9;1.0)
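
This table can be reproduced with the build_intervals helper from the encoder sketch above:

```python
probs = {"A0": 0.4, "A1": 0.3, "A2": 0.2, "A3": 0.1}
for symbol, (low, high) in build_intervals(probs).items():
    print(f"{symbol}: [{low:.1f}; {high:.1f})")
# A0: [0.0; 0.4)   A1: [0.4; 0.7)   A2: [0.7; 0.9)   A3: [0.9; 1.0)
```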

Assume that the message to be encoded is A0A0A3A1A2. The first symbol to be encoded is
A0. We "zoom" into the interval corresponding to "A0", and divide up that interval into
smaller subintervals like before. We now use this new interval as the basis of the next
symbol encoding step.

Source Symbol New A0 Interval
A0 [0.0;0.16)
A1 [0.16;0.28)
A2 [0.28;0.36)
A3 [0.36;0.4)

To encode the next character "A0", we use the "A0" interval created before, and zoom into
the subinterval "A0", and use that for the next step. This produces

Source Symbol New A0 Interval
A0 [0.0;0.064)
A1 [0.064;0.112)
A2 [0.112;0.144)
A3 [0.144;0.16)

To encode the next character "A3", we use the "A0" interval created before, and zoom into
the subinterval "A3", and use that for the next step. This produces

Source Symbol New A3 Interval
A0 [0.144;0.1504)
A1 [0.1504;0.1552)
A2 [0.1552;0.1584)
A3 [0.1584;0.16)

To encode the fourth symbol "A1", we zoom into its subinterval [0.1504; 0.1552) and divide it
up once more. This produces

Source Symbol New A1 Interval
A0 [0.1504;0.15232)
A1 [0.15232;0.15376)
A2 [0.15376;0.15472)
A3 [0.15472;0.1552)

The final symbol "A2" selects the subinterval [0.15376; 0.15472). To send the codeword,
transmit some number within this latest interval; since the number of symbols encoded will be
stated in the protocol of the image format, any number within [0.15376; 0.15472) will be
acceptable.

Let's choose the number 0.1543. The binary representation of this number is 0.001001111, so
we need 10 bits to encode the message (9 fraction bits plus the point). The minimum number of
bits needed to fully encode the message is

H × N = 1.846 × 5 = 9.23 bits
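
The binary expansion of a fraction such as 0.1543 can be obtained by repeated doubling, as in
this small sketch (our own helper, for illustration):

```python
def fraction_to_bits(x: float, n_bits: int) -> str:
    """Expand a fraction in [0, 1) into its first n_bits binary digits."""
    bits = []
    for _ in range(n_bits):
        x *= 2
        bit = int(x)          # 1 if doubling pushed x past 1.0, else 0
        bits.append(str(bit))
        x -= bit
    return "0." + "".join(bits)

print(fraction_to_bits(0.1543, 9))   # -> 0.001001111
```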

Using the Huffman code, the message is encoded as 0 0 111 10 110, which also needs 10 bits.
The larger the number of symbols in the message, the wider the efficiency gap in favour of
arithmetic coding becomes.

Decoding is the reverse process. Let's assume the number 0.1543 has been received at the
decoder. Since 0.1543 lies in the interval [0; 0.4), the first symbol of the message is A0.

Then, we "zoom" into the interval corresponding to "A0", and divide up that interval into
smaller subintervals. We now use this new interval as the basis of the next symbol decoding
step.

Source Symbol New A0 Interval
A0 [0.0;0.16)
A1 [0.16;0.28)
A2 [0.28;0.36)
A3 [0.36;0.4)

Since 0.1543 lies in the interval [0; 0.16), the second symbol of the message is A0. Zoom into
the subinterval "A0", and use that for the next step. This produces

Source Symbol New A0 Interval
A0 [0.0;0.064)
A1 [0.064;0.112)
A2 [0.112;0.144)
A3 [0.144;0.16)

Since 0.1543 lies in the interval [0.144; 0.16), the third symbol of the message is A3. Zoom
into the subinterval "A3", and use that for the next step. This produces

Source Symbol New A3 Interval
A0 [0.144;0.1504)
A1 [0.1504;0.1552)
A2 [0.1552;0.1584)
A3 [0.1584;0.16)

Since 0.1543 lies in the interval [0.1504; 0.1552), the fourth symbol of the message is A1.
Zoom into the subinterval "A1", and use that for the next step. This produces

Source Symbol New A1 Interval
A0 [0.1504;0.15232)
A1 [0.15232;0.15376)
A2 [0.15376;0.15472)
A3 [0.15472;0.1552)

Since 0.1543 lies in the interval [0.15376; 0.15472), the last symbol of the message is A2.
The decoder stops here because the number of symbols encoded is stated in the protocol of the
message.
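
The whole decoding loop can be sketched in Python as follows, reusing build_intervals and the
imports from the encoder sketch (again an illustrative sketch, not a production
implementation):

```python
def arithmetic_decode(value: float, probs: Dict[str, float],
                      n_symbols: int) -> List[str]:
    """Recover n_symbols symbols by repeatedly locating the
    subinterval that the received value falls into."""
    intervals = build_intervals(probs)
    low, high = 0.0, 1.0
    message = []
    for _ in range(n_symbols):
        width = high - low
        # Rescale value into [0, 1) relative to the current interval.
        target = (value - low) / width
        for symbol, (s_low, s_high) in intervals.items():
            if s_low <= target < s_high:
                message.append(symbol)
                high = low + width * s_high
                low = low + width * s_low
                break
    return message

probs = {"A0": 0.4, "A1": 0.3, "A2": 0.2, "A3": 0.1}
print(arithmetic_decode(0.1543, probs, 5))
# -> ['A0', 'A0', 'A3', 'A1', 'A2']
```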

Exercise 1:
The source of information A generates the symbols {A0, A1, A2, A3 and A4} with the
probabilities shown in the table below. Encode the source symbols using an arithmetic encoder
and a Huffman encoder. The message is A4A1A0A3A2.
Source Symbol   Pi
A0 0.4
A1 0.4
A2 0.12
A3 0.06
A4 0.02

Compare the efficiency of both codes and comment on the results.

Exercise 2:
The source of information A generates the symbols {A0, A1, A2 and A3} with the probabilities
shown in the table below. Encode the source symbols using an arithmetic encoder if the
message is A2A0A2A3.
Source Symbol   Pi
A0 0.5
A1 0.3
A2 0.2

Compare the efficiency of both codes and comment on the results.
