4.4. Arithmetic Coding
Arithmetic coding
Advantages:
Reaches the entropy (within computing precision)
Superior to Huffman coding for small alphabets and
skewed distributions
Clean separation of modelling and coding
Well suited to adaptive one-pass compression
Computationally efficient
History:
Original ideas by Shannon and Elias
Actually discovered in 1976 (Pasco; Rissanen)
Basic ideas:
Message is represented by a (small) interval in [0, 1)
Each successive symbol reduces the interval size
Interval size = product of symbol probabilities
Prefix-free messages result in disjoint intervals
Final code = any value from the interval
Decoding computes the same sequence of intervals
[Figure: the interval is narrowed once per symbol while coding
“ABCDABCABA”; the subdivision into A/B/C/D strips repeats inside
each new interval. After the six symbols shown, the interval is
[0.516784, 0.517072), so range = 0.517072 − 0.516784 = 0.000288,
log2(1/range) ≈ 11.76 bits, and the code length is
⌈11.76⌉ + 1 = 13 bits.]
“ABCDABCABA”
Precise probabilities:
P(A) = 0.4, P(B) = 0.3, P(C) = 0.2, P(D) = 0.1
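With these probabilities, the interval construction can be sketched as follows (a minimal floating-point sketch; the layout of the A–D subintervals inside [0, 1) is an assumption, so the endpoints need not match the figure, but the range and code length do). Practical coders use the integer arithmetic discussed later in this section.

```python
from math import ceil, log2

# P(A)=0.4, P(B)=0.3, P(C)=0.2, P(D)=0.1, laid out alphabetically
# from 0 upwards (an assumed layout; any fixed layout works).
probs = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}
cum, acc = {}, 0.0
for s, p in probs.items():
    cum[s] = acc
    acc += p

def encode_interval(message):
    """Each symbol narrows [low, low + rng); rng ends up being the
    product of the symbol probabilities."""
    low, rng = 0.0, 1.0
    for s in message:
        low += rng * cum[s]
        rng *= probs[s]
    return low, rng

def decode(value, n):
    """Decoding recomputes the same sequence of intervals."""
    out, low, rng = [], 0.0, 1.0
    for _ in range(n):
        t = (value - low) / rng          # position inside current interval
        for s in probs:                  # find the symbol whose strip holds t
            if cum[s] <= t < cum[s] + probs[s]:
                break
        out.append(s)
        low += rng * cum[s]
        rng *= probs[s]
    return "".join(out)

# First six symbols of "ABCDABCABA": range = 0.4*0.3*0.2*0.1*0.4*0.3
low, rng = encode_interval("ABCDAB")
print(rng)                               # ≈ 0.000288
print(ceil(log2(1 / rng)) + 1)           # 13 bits, as in Theorem 4.2
```

Any value inside the final interval decodes back to the message, which is why the midpoint truncated to 13 bits suffices as the code.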
Theorem 4.2.
Let range = upper − lower be the final probability interval in
Algorithm 4.8. The binary representation of
mid = (upper + lower) / 2, truncated to
l(x) = ⌈log2(1/range)⌉ + 1 bits, is a uniquely decodable code for
message x among prefix-free messages.
Proof: Skipped.
Suggestion:
Represent total_freq with at most 14 bits and range with 16 bits
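One way to read this suggestion (a hedged sketch; `narrow` and its parameter names are illustrative, not from the text): with range below 2^16 and total_freq below 2^14, every product in the interval-narrowing step stays below 2^30, so plain 32-bit integer arithmetic cannot overflow.

```python
RANGE_BITS, FREQ_BITS = 16, 14   # per the suggestion above

def narrow(low, rng, cum_lo, cum_hi, total_freq):
    """One integer coding step: the symbol occupies the cumulative
    frequency slice [cum_lo, cum_hi) out of total_freq.  With the
    bit widths above, rng * cum_hi < 2**30, so the intermediate
    products fit comfortably in 32-bit arithmetic."""
    assert rng < (1 << RANGE_BITS) and total_freq < (1 << FREQ_BITS)
    new_low = low + (rng * cum_lo) // total_freq
    new_rng = (rng * cum_hi) // total_freq - (rng * cum_lo) // total_freq
    return new_low, new_rng

# Example: a symbol with cumulative slice [0, 4) out of 10
print(narrow(0, 65535, 0, 4, 10))
```

Integer division in place of exact real arithmetic introduces a small amount of redundancy, which is the price paid for fixed-precision registers.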
Alphabet: {A, B, C, D}
Message to be coded: “AABAAB …”
[Figure: the subdivision of [0, 1) into D/C/B/A intervals, repeated
within each chosen subinterval while coding “AABAAB …”.]
Biggest problem:
Maintaining the cumulative frequencies: a simple vector
implementation needs O(q) time per update for an alphabet of q symbols.
General solution:
Maintain partial sums in an explicit or implicit binary tree
structure.
Complexity is O(log2 q) for both search and update
Example: partial-sum tree for frequencies (A..H) =
(54, 13, 22, 32, 60, 21, 15, 47); each internal node stores the sum
of its subtree:

                264
         121          143
      67     54     81     62
     54 13  22 32  60 21  15 47
      A  B   C  D   E  F   G  H

(symbols numbered 1–8; leaf positions 9–16 in the implicit array)
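The implicit-tree solution can be sketched with a Fenwick (binary indexed) tree, which keeps the partial sums in a flat array; the class and method names here are illustrative, not from the text:

```python
class Fenwick:
    """Implicit binary tree of partial sums.  Both cumulative-frequency
    queries and frequency updates cost O(log q) for q symbols."""

    def __init__(self, q):
        self.tree = [0] * (q + 1)        # 1-based implicit tree

    def update(self, i, delta):
        """Add delta to the frequency of symbol i (1-based)."""
        while i < len(self.tree):
            self.tree[i] += delta
            i += i & -i                  # step to the next covering node

    def cumfreq(self, i):
        """Sum of the frequencies of symbols 1..i."""
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & -i                  # strip the lowest set bit
        return s

# Frequencies from the example tree: A..H = 54, 13, 22, 32, 60, 21, 15, 47
ft = Fenwick(8)
for i, f in enumerate([54, 13, 22, 32, 60, 21, 15, 47], start=1):
    ft.update(i, f)
print(ft.cumfreq(8))                     # total frequency = 264
```

An adaptive coder calls `update(i, 1)` after coding symbol i, so both the model update and the cumulative-frequency lookup stay logarithmic.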
Observations:
Arithmetic coding works equally well for any alphabet size,
unlike Huffman coding.
A binary alphabet is especially easy: no cumulative
probability table is needed.
Applications:
Compression of black-and-white images
Any source, interpreted bitwise
Speed enhancement:
Avoid multiplications
Approximations cause additional redundancy
Note:
Scaling operations need only multiplication by two,
implemented as shift-left.
The multiplications that appear when narrowing the intervals are
the problem.
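The scaling step of the note above can be sketched as follows (a simplified renormalization with 16-bit registers; the underflow case where the interval straddles the midpoint is omitted). Each doubling is a shift-left, so no multiplication occurs:

```python
HALF, MASK = 1 << 15, (1 << 16) - 1      # 16-bit interval registers

def scale(low, high, out_bits):
    """While the interval lies entirely in one half of [0, 1), the
    leading code bit is already determined: emit it and double the
    interval.  Doubling is a shift-left; modular masking implements
    the subtraction of 1/2 for the upper half."""
    while high < HALF or low >= HALF:
        out_bits.append(1 if low >= HALF else 0)
        low = (low << 1) & MASK           # 2*low      (mod 2^16)
        high = ((high << 1) & MASK) | 1   # 2*high + 1 (mod 2^16)
    return low, high

# Example: an interval inside the lower quarter scales up three times
bits = []
print(scale(0x2000, 0x3FFF, bits), bits)
```

After scaling, the interval again spans at least half of the register range, so the precision available for the next narrowing step is restored.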
Convention:
MPS = More Probable Symbol
LPS = Less Probable Symbol
The correspondence between MPS/LPS and the actual symbols may
change locally during coding.