4.4. Arithmetic Coding
Arithmetic coding
Advantages:
Reaches the entropy (within computing precision)
Superior to Huffman coding for small alphabets and
skewed distributions
Clean separation of modelling and coding
Well suited to adaptive one-pass compression
Computationally efficient
History:
Original ideas by Shannon and Elias
Actually discovered in 1976 (Pasco; Rissanen)
Basic ideas:
Message is represented by a (small) interval in [0, 1)
Each successive symbol reduces the interval size
Interval size = product of symbol probabilities
Prefix-free messages result in disjoint intervals
Final code = any value from the interval
Decoding computes the same sequence of intervals
[Figure: the interval is narrowed once per symbol while coding
“ABCDABCABA”; the subdivision into A/B/C/D strips repeats inside
each new interval. After the six symbols shown, the interval is
[0.516784, 0.517072), so range = 0.517072 − 0.516784 = 0.000288,
log2(1/range) ≈ 11.76 bits, and the code length is
⌈11.76⌉ + 1 = 13 bits.]
“ABCDABCABA”
Precise probabilities:
P(A) = 0.4, P(B) = 0.3, P(C) = 0.2, P(D) = 0.1
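With these probabilities, the interval construction can be sketched as follows (a minimal floating-point sketch; the layout of the A–D subintervals inside [0, 1) is an assumption, so the endpoints need not match the figure, but the range and code length do). Practical coders use the integer arithmetic discussed later in this section.

```python
from math import ceil, log2

# P(A)=0.4, P(B)=0.3, P(C)=0.2, P(D)=0.1, laid out alphabetically
# from 0 upwards (an assumed layout; any fixed layout works).
probs = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}
cum, acc = {}, 0.0
for s, p in probs.items():
    cum[s] = acc
    acc += p

def encode_interval(message):
    """Each symbol narrows [low, low + rng); rng ends up being the
    product of the symbol probabilities."""
    low, rng = 0.0, 1.0
    for s in message:
        low += rng * cum[s]
        rng *= probs[s]
    return low, rng

def decode(value, n):
    """Decoding recomputes the same sequence of intervals."""
    out, low, rng = [], 0.0, 1.0
    for _ in range(n):
        t = (value - low) / rng          # position inside current interval
        for s in probs:                  # find the symbol whose strip holds t
            if cum[s] <= t < cum[s] + probs[s]:
                break
        out.append(s)
        low += rng * cum[s]
        rng *= probs[s]
    return "".join(out)

# First six symbols of "ABCDABCABA": range = 0.4*0.3*0.2*0.1*0.4*0.3
low, rng = encode_interval("ABCDAB")
print(rng)                               # ≈ 0.000288
print(ceil(log2(1 / rng)) + 1)           # 13 bits, as in Theorem 4.2
```

Any value inside the final interval decodes back to the message, which is why the midpoint truncated to 13 bits suffices as the code.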
Theorem 4.2.
Let range = upper − lower be the final probability interval in
Algorithm 4.8. The binary representation of
mid = (upper + lower) / 2, truncated to
l(x) = ⌈log2(1/range)⌉ + 1 bits, is a uniquely decodable code for
message x among prefix-free messages.
Proof: Skipped.
Suggestion:
Represent total_freq with at most 14 bits and range with 16 bits
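One way to read this suggestion (a hedged sketch; `narrow` and its parameter names are illustrative, not from the text): with range below 2^16 and total_freq below 2^14, every product in the interval-narrowing step stays below 2^30, so plain 32-bit integer arithmetic cannot overflow.

```python
RANGE_BITS, FREQ_BITS = 16, 14   # per the suggestion above

def narrow(low, rng, cum_lo, cum_hi, total_freq):
    """One integer coding step: the symbol occupies the cumulative
    frequency slice [cum_lo, cum_hi) out of total_freq.  With the
    bit widths above, rng * cum_hi < 2**30, so the intermediate
    products fit comfortably in 32-bit arithmetic."""
    assert rng < (1 << RANGE_BITS) and total_freq < (1 << FREQ_BITS)
    new_low = low + (rng * cum_lo) // total_freq
    new_rng = (rng * cum_hi) // total_freq - (rng * cum_lo) // total_freq
    return new_low, new_rng

# Example: a symbol with cumulative slice [0, 4) out of 10
print(narrow(0, 65535, 0, 4, 10))
```

Integer division in place of exact real arithmetic introduces a small amount of redundancy, which is the price paid for fixed-precision registers.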
Alphabet: {A, B, C, D}
Message to be coded: “AABAAB …”
[Figure: the subdivision of [0, 1) into D/C/B/A intervals, repeated
within each chosen subinterval while coding “AABAAB …”.]
Biggest problem:
Maintaining the cumulative frequencies: a simple vector
implementation needs O(q) time per update for an alphabet of q symbols.
General solution:
Maintain partial sums in an explicit or implicit binary tree
structure.
Complexity is O(log2 q) for both search and update
Example: partial-sum tree for frequencies (A..H) =
(54, 13, 22, 32, 60, 21, 15, 47); each internal node stores the sum
of its subtree:

                264
         121          143
      67     54     81     62
     54 13  22 32  60 21  15 47
      A  B   C  D   E  F   G  H

(symbols numbered 1–8; leaf positions 9–16 in the implicit array)
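The implicit-tree solution can be sketched with a Fenwick (binary indexed) tree, which keeps the partial sums in a flat array; the class and method names here are illustrative, not from the text:

```python
class Fenwick:
    """Implicit binary tree of partial sums.  Both cumulative-frequency
    queries and frequency updates cost O(log q) for q symbols."""

    def __init__(self, q):
        self.tree = [0] * (q + 1)        # 1-based implicit tree

    def update(self, i, delta):
        """Add delta to the frequency of symbol i (1-based)."""
        while i < len(self.tree):
            self.tree[i] += delta
            i += i & -i                  # step to the next covering node

    def cumfreq(self, i):
        """Sum of the frequencies of symbols 1..i."""
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & -i                  # strip the lowest set bit
        return s

# Frequencies from the example tree: A..H = 54, 13, 22, 32, 60, 21, 15, 47
ft = Fenwick(8)
for i, f in enumerate([54, 13, 22, 32, 60, 21, 15, 47], start=1):
    ft.update(i, f)
print(ft.cumfreq(8))                     # total frequency = 264
```

An adaptive coder calls `update(i, 1)` after coding symbol i, so both the model update and the cumulative-frequency lookup stay logarithmic.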
Observations:
Arithmetic coding works equally well for any alphabet size,
unlike Huffman coding.
A binary alphabet is especially easy: no cumulative
probability table is needed.
Applications:
Compression of black-and-white images
Any source, interpreted bitwise
Speed enhancement:
Avoid multiplications
Approximations cause additional redundancy
Note:
Scaling operations need only multiplication by two,
implemented as shift-left.
The multiplications that appear when narrowing the intervals are
the problem.
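The scaling step of the note above can be sketched as follows (a simplified renormalization with 16-bit registers; the underflow case where the interval straddles the midpoint is omitted). Each doubling is a shift-left, so no multiplication occurs:

```python
HALF, MASK = 1 << 15, (1 << 16) - 1      # 16-bit interval registers

def scale(low, high, out_bits):
    """While the interval lies entirely in one half of [0, 1), the
    leading code bit is already determined: emit it and double the
    interval.  Doubling is a shift-left; modular masking implements
    the subtraction of 1/2 for the upper half."""
    while high < HALF or low >= HALF:
        out_bits.append(1 if low >= HALF else 0)
        low = (low << 1) & MASK           # 2*low      (mod 2^16)
        high = ((high << 1) & MASK) | 1   # 2*high + 1 (mod 2^16)
    return low, high

# Example: an interval inside the lower quarter scales up three times
bits = []
print(scale(0x2000, 0x3FFF, bits), bits)
```

After scaling, the interval again spans at least half of the register range, so the precision available for the next narrowing step is restored.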
Convention:
MPS = More Probable Symbol
LPS = Less Probable Symbol
The correspondence between MPS/LPS and the actual symbols may
change locally during coding.