lec05 arithmetic coding II
Zhu Li
Dept of CSEE, UMKC
http://l.web.umkc.edu/lizhu
Office: FH560E, Email: [email protected], Ph: x 2346.
Course web: http://sites.google.com/view/ece5578
slides created with WPS Office Linux and EqualX LaTex equation editor
Arithmetic Coding
Replace the entire input sequence with a single floating-point number
Does not need the source probability distribution in advance (the model can be adapted while coding)
Adaptive coding is very easy
No need to keep and send codeword table
Fractional codeword length per symbol
Z. Li, ECE/CS 5578 Multimedia Communication p.8
Introduction
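Symbol   Prob.   Interval
1        0.8     [0, 0.8)
2        0.02    [0.8, 0.82)
3        0.18    [0.82, 1.0)

[Figure: successive interval subdivisions, each split among symbols 1, 2, 3]
Range 1.0:       0 | 0.8 | 0.82 | 1.0
Range 0.144:     0.656 | 0.7712 | 0.77408 | 0.8
Range 0.00288:   0.7712 | 0.773504 | 0.7735616 | 0.77408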
Termination: Encode the lower end or the midpoint of the final interval to signal the end.
Difficulties:
1. The shrinking interval requires very high precision for long sequences.
2. No output is generated until the entire sequence has been processed.
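The first difficulty can be seen directly in double-precision arithmetic. The sketch below is only an illustration: the function name `symbols_until_collapse` is hypothetical, and it assumes the same symbol (symbol 3 of the example, occupying [0.82, 1.0)) is encoded repeatedly until the interval becomes empty.

```python
def symbols_until_collapse(cdf_lo, cdf_hi, max_symbols=1000):
    """Count how many identical symbols can be encoded in float64
    before the interval [low, high) becomes empty (high <= low)."""
    low, high = 0.0, 1.0
    for count in range(1, max_symbols + 1):
        rng = high - low
        # Same subdivision rule as in the example above; the right-hand
        # side is evaluated with the old value of low.
        low, high = low + rng * cdf_lo, low + rng * cdf_hi
        if high <= low:
            return count
    return None  # no collapse within max_symbols


# Symbol 3 of the example occupies [0.82, 1.0).
n = symbols_until_collapse(0.82, 1.0)
print("interval collapses after", n, "symbols")
```

With only a few dozen symbols the interval can no longer be represented, which is why a practical coder needs some form of rescaling.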
Encoder Pseudo Code
Cumulative Distribution Function (CDF)
For continuous distribution:
$F_X(x) = P(X \le x) = \int_{-\infty}^{x} p(t)\,dt$
For discrete distribution:
$F_X(i) = P(X \le i) = \sum_{k=-\infty}^{i} P(X = k)$
$P(X = i) = F_X(i) - F_X(i-1).$
[Figure: Probability Mass Function with values 0.2, 0.2, 0.4, 0.2 at X = 1, 2, 3, 4, and the corresponding CDF with steps 0.2, 0.4, 0.8, 1.0]
Properties:
Non-decreasing
Piece-wise constant
Each segment is closed at the lower end.
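As a small illustration of the discrete case, the sketch below builds the CDF from the PMF shown in the figure (0.2, 0.2, 0.4, 0.2 for X = 1..4) and recovers each probability as a CDF difference. The names `pmf`, `cdf`, and `recovered_pmf` are chosen here for illustration only.

```python
from itertools import accumulate

# PMF from the figure: P(X = 1..4) = 0.2, 0.2, 0.4, 0.2
pmf = {1: 0.2, 2: 0.2, 3: 0.4, 4: 0.2}
symbols = sorted(pmf)

# F_X(i) is the running sum of P(X = k) for k <= i.
cdf = dict(zip(symbols, accumulate(pmf[s] for s in symbols)))


def recovered_pmf(cdf, i):
    """P(X = i) = F_X(i) - F_X(i - 1), with F_X = 0 below the first symbol."""
    return cdf[i] - cdf.get(i - 1, 0.0)


for s in symbols:
    print(s, round(cdf[s], 6), round(recovered_pmf(cdf, s), 6))
```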
Encoder Pseudo Code
Keep track of LOW, HIGH, RANGE. Any two are sufficient, e.g., LOW and RANGE.
LOW=0.0, HIGH=1.0;
while (not EOF) {
    n = ReadSymbol();
    RANGE = HIGH - LOW;
    HIGH = LOW + RANGE * CDF(n);
    LOW = LOW + RANGE * CDF(n-1);
}
output LOW;
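A direct Python transcription of this loop might look as follows. It is a sketch under assumptions: the CDF is stored as a list indexed by symbol with `CDF[0] = 0`, plain floats are used (so it only works for short sequences), and the input is the sequence 1, 3, 2, 1 implied by the intervals in the introduction.

```python
def encode(symbols, cdf):
    """Floating-point arithmetic encoder following the pseudo code:
    narrow [low, high) once per symbol and return the final interval."""
    low, high = 0.0, 1.0
    for n in symbols:
        rng = high - low
        high = low + rng * cdf[n]      # uses the old value of low
        low = low + rng * cdf[n - 1]
    return low, high


# CDF of the example source: P(1) = 0.8, P(2) = 0.02, P(3) = 0.18
CDF = [0.0, 0.8, 0.82, 1.0]
low, high = encode([1, 3, 2, 1], CDF)
print(low, high)  # approximately 0.7712 and 0.773504
```

Note that `high` is updated before `low`, as in the pseudo code, because both updates need the old value of `low`.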
[Figure: decoding example. In [0.656, 0.8), split at 0.7712 and 0.77408 among symbols 1, 2, 3: decode 2. In [0.7712, 0.77408), split at 0.773504 and 0.7735616: decode 1.]
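The decoding steps in the figure can be sketched as a search for the sub-interval that contains the received value. This is a minimal sketch, not the lecture's decoder: it assumes the decoder is told how many symbols to decode (`num_symbols` is a hypothetical parameter) and uses exact fractions to avoid rounding at interval boundaries.

```python
from fractions import Fraction

# CDF of the example source: P(1) = 0.8, P(2) = 0.02, P(3) = 0.18
CDF = [Fraction(0), Fraction(4, 5), Fraction(41, 50), Fraction(1)]


def decode(value, num_symbols, cdf=CDF):
    """Recover symbols by locating `value` in successive sub-intervals."""
    low, high = Fraction(0), Fraction(1)
    decoded = []
    for _ in range(num_symbols):
        rng = high - low
        for n in range(1, len(cdf)):
            sub_low = low + rng * cdf[n - 1]
            sub_high = low + rng * cdf[n]
            if sub_low <= value < sub_high:   # closed at the lower end
                decoded.append(n)
                low, high = sub_low, sub_high
                break
    return decoded


print(decode(Fraction("0.7712"), 4))  # [1, 3, 2, 1]
```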
2). $2^{-l(X)} = 2^{-\left(\left\lceil \log_2 \frac{1}{p(X)} \right\rceil + 1\right)} \le 2^{-\left(\log_2 \frac{1}{p(X)} + 1\right)} = \frac{p(X)}{2}$
By def,
$T(X) = F(X-1) + \frac{1}{2} p(X)$, $p(X) > 0$.
$T(X) - F(X-1) = \frac{1}{2} p(X) \ge 2^{-l(X)}$
Together with $T(X) - \lfloor T(X) \rfloor_{l(X)} \le 2^{-l(X)}$:
$\lfloor T(X) \rfloor_{l(X)} \ge F(X-1).$
[Figure: number line showing F(X-1), $\lfloor T(X) \rfloor_{l(X)}$, T(X), F(X) in order, with gaps annotated $\le 2^{-l(X)}$ and $\ge 2^{-l(X)}$]
Thus $F(X-1) \le \lfloor T(X) \rfloor_{l(X)} < F(X)$
So the truncated code is still in the interval. This proves the uniqueness.
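The bound can be checked numerically for the three-symbol source used earlier (probabilities 0.8, 0.02, 0.18). The sketch below is illustrative only; `truncated_tags` is a hypothetical helper that computes T(X), the length l(X), and the tag truncated to l(X) bits.

```python
from fractions import Fraction
from math import ceil, floor, log2

# Example source from the introduction: P(1) = 0.8, P(2) = 0.02, P(3) = 0.18
pmf = [Fraction(4, 5), Fraction(1, 50), Fraction(9, 50)]


def truncated_tags(pmf):
    """For each symbol return (F(X-1), truncated tag, F(X), l(X))."""
    results = []
    f_prev = Fraction(0)
    for p in pmf:
        f_curr = f_prev + p
        tag = f_prev + p / 2                      # T(X) = F(X-1) + p(X)/2
        length = ceil(log2(float(1 / p))) + 1     # l(X)
        # Keep only the first l(X) binary digits of T(X).
        truncated = Fraction(floor(tag * 2**length), 2**length)
        results.append((f_prev, truncated, f_curr, length))
        f_prev = f_curr
    return results


for f_lo, t, f_hi, length in truncated_tags(pmf):
    print(float(f_lo), float(t), float(f_hi), length, f_lo <= t < f_hi)
```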
Uniqueness and Efficiency
Efficiency of arithmetic code:
For a sequence $X_1^m : \{x_1, \ldots, x_m\}$:
$l(X_1^m) = \left\lceil \log_2 \frac{1}{p(X_1^m)} \right\rceil + 1$ bits.
$L = E\{ l(X_1^m) \} = \sum P(X_1^m) \left( \left\lceil \log_2 \frac{1}{p(X_1^m)} \right\rceil + 1 \right)$
$\le \sum P(X_1^m) \left( \log_2 \frac{1}{p(X_1^m)} + 1 + 1 \right) = H(X_1^m) + 2$
$H(X) \le L^* \le H(X) + 1$
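The bound on L can be verified by brute force for short blocks. The sketch below assumes an i.i.d. source with the example probabilities 0.8, 0.02, 0.18 (the slides do not state independence for this bound; it is assumed here only to make p(X_1^m) easy to compute), and `average_length_and_entropy` is a hypothetical helper name.

```python
from itertools import product
from math import ceil, log2, prod

# Assumed i.i.d. source with the example probabilities.
pmf = [0.8, 0.02, 0.18]


def average_length_and_entropy(pmf, m):
    """Return (L, H) for all length-m sequences of an i.i.d. source,
    where l = ceil(log2(1/p)) + 1 bits per sequence."""
    avg_len = 0.0
    entropy = 0.0
    for seq in product(pmf, repeat=m):
        p = prod(seq)                        # probability of this sequence
        avg_len += p * (ceil(log2(1 / p)) + 1)
        entropy += p * log2(1 / p)
    return avg_len, entropy


for m in range(1, 5):
    L, H = average_length_and_entropy(pmf, m)
    print(m, round(H, 4), round(L, 4), H <= L <= H + 2)
```

Since the overhead is at most 2 bits for the whole sequence, the overhead per symbol is at most 2/m bits, which shrinks as the block length m grows.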
[Figure: unit interval from 0 to 1 with the tag x marked; a second axis marked at 0, 0.6, 1]
Key Observation:
As the RANGE shrinks, many MSBs of LOW and HIGH become identical.
Example: binary forms of 0.7712 and 0.773504:
0.1100010..., 0.1100011...
We can output the identical MSBs and re-scale the rest:
Incremental encoding
This achieves the effect of infinite precision using finite-precision integers.
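The shared leading bits in this example can be computed directly. The sketch below uses exact fractions so that the binary digits are not affected by floating-point rounding; `binary_digits` and `common_prefix` are hypothetical helper names.

```python
from fractions import Fraction


def binary_digits(x, count):
    """First `count` binary digits after the point of x, for 0 <= x < 1."""
    digits = []
    for _ in range(count):
        x *= 2
        bit = int(x >= 1)
        digits.append(str(bit))
        x -= bit
    return "".join(digits)


def common_prefix(a, b):
    """Longest common leading substring of two digit strings."""
    prefix = []
    for bit_a, bit_b in zip(a, b):
        if bit_a != bit_b:
            break
        prefix.append(bit_a)
    return "".join(prefix)


low_bits = binary_digits(Fraction("0.7712"), 12)
high_bits = binary_digits(Fraction("0.773504"), 12)
print(common_prefix(low_bits, high_bits))  # 110001
```

These six bits are already determined by the interval and could be sent before the rest of the sequence is encoded.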
Three scaling scenarios: E1, E2, E3
E1: [0, 0.5) → [0, 1): E1(x) = 2x. Output 0, then shift left by 1 bit.
[Figure: unit interval marked at 0, 0.5, 1.0]
Example: [0.0848, 0.09632) → E1: 2x, output 0 → [0.1696, 0.19264) → E1: output 0
EncodeSymbol(n) {
//Update variables
RANGE = HIGH - LOW;
HIGH = LOW + RANGE * CDF(n);
LOW = LOW + RANGE * CDF(n-1);
[Figure: scaling example with panels labeled "With E3" and "Without E3", both starting from [0.312, 0.6). Without E3: input 2 gives [0.5424, 0.54816); E2: output 1 gives [0.0848, 0.09632); E1: output 0 gives [0.1696, 0.19264).]
[Flowchart: scaling checks after each symbol update]
Need E1 scaling
Need E2 scaling
Need E3 scaling
No scaling is required. Continue to encode/decode the next symbol.
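The scaling checks can be put together into an incremental encoder. The sketch below is one possible arrangement, not the lecture's reference code. Only E1(x) = 2x with output 0 and "E2: output 1" appear in the text above; the mappings E2(x) = 2(x - 0.5) on [0.5, 1) and E3(x) = 2(x - 0.25) on [0.25, 0.75), the deferred-bit counter `pending`, and the two-bit termination rule are common conventions assumed here. Exact fractions stand in for the finite-precision integers mentioned earlier.

```python
from fractions import Fraction

# CDF of the example source: P(1) = 0.8, P(2) = 0.02, P(3) = 0.18
CDF = [Fraction(0), Fraction(4, 5), Fraction(41, 50), Fraction(1)]
HALF, QUARTER = Fraction(1, 2), Fraction(1, 4)


def encode_incremental(symbols, cdf=CDF):
    low, high = Fraction(0), Fraction(1)
    pending = 0   # E3 scalings whose bit is not yet known (assumed scheme)
    bits = []

    def emit(bit):
        nonlocal pending
        bits.append(bit)
        bits.extend([1 - bit] * pending)   # resolve deferred E3 bits
        pending = 0

    for n in symbols:
        rng = high - low
        high = low + rng * cdf[n]
        low = low + rng * cdf[n - 1]
        while True:
            if high <= HALF:                                # need E1
                emit(0)
                low, high = 2 * low, 2 * high
            elif low >= HALF:                               # need E2
                emit(1)
                low, high = 2 * (low - HALF), 2 * (high - HALF)
            elif low >= QUARTER and high <= 3 * QUARTER:    # need E3
                pending += 1
                low, high = 2 * (low - QUARTER), 2 * (high - QUARTER)
            else:
                break                                       # no scaling
    # Assumed termination: two more bits select a quarter inside [low, high).
    pending += 1
    emit(0 if low < QUARTER else 1)
    return "".join(map(str, bits))


code = encode_incremental([1, 3, 2, 1])
print(code)  # 110001011
```

For the sequence 1, 3, 2, 1 the output starts with 110001, the MSBs shared by 0.7712 and 0.773504, and the full bit string read as a binary fraction lies inside [0.7712, 0.773504).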