A High Performance Lossless Image Coder

ABSTRACT

This paper proposes a lossless image compression scheme that integrates modern predictors with the latest version of the Minimum Rate Predictor (MRP). The latest version of the Minimum Rate Predictor, abbreviated as MRP in this paper, is regarded as the most successful method for lossless grayscale image compression to date. In the proposed method, the linear predictor is designed as a combination of causal neighbors and modern predictors (GAP, MED, and MMSE) to improve the coding rate. To further reduce the residual entropy, we also change the calculation of the context quantization and the disposition of neighboring pixels. These modifications are crucial to improving the compression performance. Experimental results demonstrate that the coding rate of the proposed method is lower than that of MRP on most of the test images. In addition, the residual entropy of the proposed scheme in the first iteration is lower than that of MRP and is closer to the final residual entropy than in MRP. This property allows the proposed scheme to be terminated after fewer iterations while maintaining good compression performance.

Keywords: Lossless image compression, Minimum Rate Predictor, adaptive predictor, residual entropy

1. INTRODUCTION

Lossless image compression has been an attractive subject of study for decades because of its importance in many applications. In recent years, many lossless image compression schemes have been proposed based on the combination of adaptive prediction and adaptive entropy coding [1][2][3][4]. Quite a few adaptive predictors have been proposed, including the well-known median edge detector (MED) of the JPEG-LS standard [1] and the gradient-adjusted prediction (GAP) of CALIC [2]. Furthermore, the best single predictor is generally devised in a minimum mean square error (MMSE) sense: MMSE-based predictors [3][4] are obtained through linear regression on causal neighbors and provide an edge-directed property that adapts well in edge regions [4].

However, the best predictor is not necessarily optimal in terms of compression ratio. From this point of view, Matsuda, Mori, and Itoh [5][6] developed a novel scheme called the Minimum Rate Predictor (MRP), which increases coding efficiency by minimizing the coding rates: it adaptively selects the probability models for each block and performs block classification iteratively. The success of MRP shows that designing a scheme for minimum coding rates is more appropriate than seeking a single superior predictor.

(This work is supported by MOEA under grant 93-EC-17-A-02-S1-032.)
Although MRP generally outperforms single-predictor-based coders, the excellence of modern predictors for lossless image compression remains attractive. To synthesize the excellence of modern predictors and the superiority of MRP, this paper proposes a lossless image coding method that integrates modern predictors with block-adaptive prediction. In our proposed method, the linear predictor is designed as a combination of causal neighbors and three predictors, MED, GAP, and MMSE, to reduce the coding rate. To further reduce the residual entropy obtained in this integration, we also change the calculation of the context quantization and the disposition of neighboring pixels. Experimental results verify that the proposed scheme indeed reduces the coding rates for most of the test images. Moreover, the residual entropy of the proposed scheme in the first iteration (the initial residual entropy) is lower than that of MRP and is closer to the final residual entropy than in MRP. This property allows the proposed scheme to be terminated after fewer iterations while maintaining good compression performance.

The rest of this paper is organized as follows. Section 2 introduces the three adaptive predictors, MED, GAP, and MMSE. Section 3 gives a system overview, providing a brief review of the block-adaptive prediction scheme and describing the modifications made in our approach. Experimental results are presented in Section 4 to demonstrate the validity and effectiveness of the proposed approach. Finally, concluding remarks are given in Section 5.

2. ADAPTIVE PREDICTOR

A brief review of the MED, GAP, and MMSE predictors is given in this section. First, let us introduce some basic notation and concepts used in this paper. As shown in Fig 1, P0 is the current pixel in raster-scan order and Pi (0 < i < ∞) is P0's i-th neighbor. In our work, we change the disposition of pixels, as illustrated in Fig 1, from that of MRP [6]. The reason is that the MMSE predictor usually performs better with this new disposition, which leads to better overall compression performance, as verified in the experiments.

Fig 1. Disposition of P0's neighboring pixels.

Next, some notation used in the paper is defined. A prediction residual ε produced by a single predictor is defined as

    ε = S(P0) − Ŝ(P0),    (1)

where S(Pk) is the image signal value at pixel Pk, and Ŝ(P0) is the predicted value, estimated from a causal neighborhood of available past data S(Pk). On the other hand, if ε is produced by a linear predictor, then

    ε = S(P0) − Σ_{k=1}^{K} a_k S(Pk),    (2)

where the a_k are prediction coefficients.

In addition, the algorithm determines the context in which P0 occurs. The context is defined as a function of a causal template. A probabilistic model for the prediction residual ε, conditioned on the context of P0, is then chosen for the subsequent arithmetic coding.
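For illustration, the residual of Eq. (2) is just the difference between S(P0) and an inner product over the causal neighbors. The sketch below is ours, not part of the paper, and the coefficient and pixel values are made up:

```python
# Sketch (not from the paper): the linear prediction residual of
# Eq. (2) for one pixel, given fixed prediction coefficients.
def linear_residual(s_p0, neighbors, coeffs):
    """s_p0: S(P0); neighbors: [S(P1), ..., S(PK)]; coeffs: [a_1, ..., a_K]."""
    prediction = sum(a * s for a, s in zip(coeffs, neighbors))
    return s_p0 - prediction

# Illustrative (made-up) coefficients and sample values:
print(linear_residual(100, [98, 102, 97], [0.5, 0.4, 0.1]))
```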
2.1 MED

MED is a simple and efficient predictor that has been standardized in JPEG-LS [1]. The MED predictor uses only three neighbors: P1, P2, and P3. Let a = S(P1), b = S(P2), and c = S(P3). The prediction of P0 is defined as:

    Ŝ(P0) = min(a, b),   if c ≥ max(a, b)
    Ŝ(P0) = max(a, b),   if c ≤ min(a, b)
    Ŝ(P0) = a + b − c,   otherwise

According to [1], the MED predictor tends to choose a (the west neighbor) when there is a horizontal edge, and b (the north neighbor) when there is a vertical edge.
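The three-case rule above is equivalent to taking the median of a, b, and the planar estimate a + b − c. A minimal sketch (ours, not JPEG-LS reference code):

```python
def med_predict(a, b, c):
    """MED predictor: a = west S(P1), b = north S(P2), c = northwest S(P3)."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c

# Horizontal edge above (b and c bright, a dark): prediction follows a.
print(med_predict(10, 200, 205))   # -> 10
# Smooth region: planar prediction a + b - c.
print(med_predict(100, 104, 102))  # -> 102
```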
2.2 GAP

Let n = S(P2), w = S(P1), ne = S(P4), nw = S(P3), nn = S(P6), ww = S(P5), and nne = S(P9), which denote the north, west, northeast, northwest, north-north, west-west, and north-northeast neighbors of P0, respectively. Two gradient estimates (horizontal and vertical) are computed as

    d_h = |w − ww| + |n − nw| + |n − ne|,
    d_v = |w − nw| + |n − nn| + |ne − nne|.

The core algorithm of the GAP predictor is as follows:

    IF (d_v − d_h > 80)        Ŝ(P0) = w
    ELSE IF (d_v − d_h < −80)  Ŝ(P0) = n
    ELSE {
        Ŝ(P0) = (w + n)/2 + (ne − nw)/4
        IF (d_v − d_h > 32)        Ŝ(P0) = (Ŝ(P0) + w)/2
        ELSE IF (d_v − d_h > 8)    Ŝ(P0) = (3·Ŝ(P0) + w)/4
        ELSE IF (d_v − d_h < −32)  Ŝ(P0) = (Ŝ(P0) + n)/2
        ELSE IF (d_v − d_h < −8)   Ŝ(P0) = (3·Ŝ(P0) + n)/4
    }
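The GAP rule can be transcribed directly; a sketch of ours following the threshold cascade above:

```python
def gap_predict(n, w, ne, nw, nn, ww, nne):
    """Gradient-adjusted prediction (GAP), as used in CALIC."""
    d_h = abs(w - ww) + abs(n - nw) + abs(n - ne)    # horizontal gradient
    d_v = abs(w - nw) + abs(n - nn) + abs(ne - nne)  # vertical gradient
    d = d_v - d_h
    if d > 80:        # sharp horizontal edge: copy the west neighbor
        return w
    if d < -80:       # sharp vertical edge: copy the north neighbor
        return n
    pred = (w + n) / 2 + (ne - nw) / 4
    if d > 32:
        pred = (pred + w) / 2
    elif d > 8:
        pred = (3 * pred + w) / 4
    elif d < -32:
        pred = (pred + n) / 2
    elif d < -8:
        pred = (3 * pred + n) / 4
    return pred

print(gap_predict(100, 100, 100, 100, 100, 100, 100))  # flat area -> 100.0
```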
2.3 MMSE

The purpose of MMSE-based predictors [3][4] is to find an optimal linear predictor for a fixed number T̃ of causal neighbors by minimizing the mean square error, that is, by minimizing

    Σ_{i=1}^{T̃} ε_i² = Σ_{i=1}^{T̃} ( S(P_i) − Σ_{k=1}^{Ñ} a_k S(P_ik) )²,    (3)

where Ñ is the prediction order (we take Ñ = 12 here), and P_ik denotes the k-th neighbor of pixel P_i. Equation (3) can be rewritten in matrix form:

    ‖ε‖² = ‖y − C a‖²,    (4)

where C is the T̃ × Ñ matrix with entries S(P_ik), ε = (ε_1, …, ε_T̃)ᵀ, y = (S(P_1), …, S(P_T̃))ᵀ, and a = (a_1, …, a_Ñ)ᵀ. The prediction coefficients can then be solved through the well-known least-squares solution:

    a = (CᵀC)⁻¹ Cᵀ y.    (5)

The use of two rectangular training windows effectively reduces the computational complexity of the MMSE predictor; the details can be found in [3][4].
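Eq. (5) is ordinary least squares. A sketch using NumPy on made-up training data (the training-window construction of [3][4] is omitted here):

```python
import numpy as np

def mmse_coefficients(C, y):
    """Solve a = (C^T C)^(-1) C^T y of Eq. (5) via least squares.
    C: T x N matrix of neighbor values S(P_ik); y: T-vector of S(P_i)."""
    a, *_ = np.linalg.lstsq(C, y, rcond=None)
    return a

# Toy training set: targets are exactly 0.5 * (first + second neighbor),
# so least squares should recover the coefficients [0.5, 0.5].
C = np.array([[100., 102.], [90., 94.], [120., 116.], [80., 84.]])
y = C @ np.array([0.5, 0.5])
a = mmse_coefficients(C, y)
print(np.round(a, 3))
```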
3. THE APPROACH

This section presents a system overview: a brief review of the block-adaptive prediction scheme together with the modifications made in our approach. We emphasize only the modifications made relative to MRP and omit details that have already been stated in the literature [6]. As the experimental results show, these modifications are crucial in improving the compression performance.

There are four major steps in the block-adaptive prediction scheme. The task performed in the first step is initialization. The algorithm partitions an input image into blocks of 8 × 8 pixels and initially classifies all the blocks into M classes. The algorithm always assigns a single linear predictor to all blocks in a class; hence, an initial predictor is needed in the first step.

The linear predictor of a class is given by

    ε = S(P0) − Σ_{k=1}^{K} a_m(k) S(Pk),    (7)

where the a_m(k) are the coefficients of the m-th (m = 1, 2, …, M) class. In our approach, we change the design of the predictor: the modern predictors of Section 2 are utilized as part of the observation values, i.e.,

    ε = S(P0) − ( Σ_{k=1}^{K−3} a_m(k) S(Pk) + Σ_{j=0}^{2} a_m(K − j) R_j(P0) ),    (8)

where R0(P0), R1(P0), and R2(P0) are the prediction values of P0 given by the MMSE, GAP, and MED predictors, respectively.
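Eq. (8) simply appends the three predictor outputs to the causal-neighbor observations before the inner product is taken. A sketch with hypothetical values (the coefficients would come from the optimization described next):

```python
def class_residual(s_p0, neighbors, coeffs, r_mmse, r_gap, r_med):
    """Residual of Eq. (8): `neighbors` holds K-3 causal samples, and the
    MMSE/GAP/MED predictions R0, R1, R2 fill the last three slots."""
    observations = list(neighbors) + [r_mmse, r_gap, r_med]
    assert len(observations) == len(coeffs)  # K terms in total
    prediction = sum(a * s for a, s in zip(coeffs, observations))
    return s_p0 - prediction

# Made-up example: K = 5, two causal neighbors plus the three predictors.
print(class_residual(101, [100, 102], [0.2] * 5, 101, 99, 100))  # ~0.6
```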
The next step is the prediction step, which optimizes the coefficients in each class, i.e., minimizes the total coding length of the prediction errors. After prediction, every prediction error ε is mapped to a non-negative integer E0, called an error index hereafter: values of the error index E0 are allocated to the possible values of ε in increasing order of |ε|. The algorithm also defines N contexts based on the quantization of a parameter U. In our method, the calculation of U differs from that in MRP and is defined as

    U(P0) = λ · (1/6) Σ_{k=1}^{6} E_k² + 0.5,    (9)

where λ is a scaling factor and E_k is the error index of the neighboring pixel Pk. The algorithm assigns N − 1 thresholds { Th_m(1), Th_m(2), …, Th_m(N − 1) } to the m-th class in non-decreasing order (with Th_m(0) = 0 and Th_m(N) = ∞). A pixel P0 in this class is assigned to the n-th context if Th_m(n − 1) ≤ U(P0) < Th_m(n). The algorithm then adopts a generalized Gaussian function to model the probability distribution of the error index E0. The generalized Gaussian function is formulated as

    P_n(E0) = α_n · exp{ − [ ( Γ(3/c_n) / Γ(1/c_n) )^{1/2} · E0 / σ_n ]^{c_n} },    (10)

where P_n(E0) is the probability of E0 in the n-th context, Γ(·) is the gamma function, α_n is a normalizing factor, c_n is the shape parameter, and σ_n is the standard-deviation parameter. Here, we use the same settings as in MRP.

The coding length of an error index E0 is −log₂(P_n(E0)), so the total coding length of the m-th class is

    L_m = − Σ_{P0 ∈ m-th class} log₂( P_n(E0) ).    (11)

In the prediction step, coefficient optimization is carried out to minimize every L_m through several partial optimizations in each class; the details can be found in [6]. In MRP, the optimization of the coding length of the block labels is also considered.

In the third step, dynamic programming is performed to determine the thresholds { Th_m(1), Th_m(2), …, Th_m(N − 1) } such that L_m is minimized for each m. Next, the optimum value of the shape parameter c_n in each context is determined by minimizing the cost function J_n for the n-th context:

    J_n = − Σ_{P0 ∈ n-th context} log₂( P_n(E0) ).    (12)

The task performed in the last step is to reclassify the blocks by selecting the optimum class for each, and the algorithm is then repeated from the prediction step until it converges or reaches a maximum number of steps. An arithmetic coder is applied thereafter to the prediction errors.
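The per-symbol cost −log₂ P_n(E0) of Eq. (11) can be evaluated directly from the model of Eq. (10). The sketch below is ours: the shape and scale values are illustrative, and the normalizing factor α_n is replaced by an empirical normalization over a finite alphabet (the paper follows MRP's settings, which are not reproduced here):

```python
import math

def gg_weights(c, sigma, max_index):
    """Unnormalized generalized-Gaussian weights of Eq. (10)
    for error indexes 0..max_index."""
    eta = math.sqrt(math.gamma(3.0 / c) / math.gamma(1.0 / c)) / sigma
    return [math.exp(-((eta * e) ** c)) for e in range(max_index + 1)]

def coding_length_bits(error_indexes, c=1.0, sigma=4.0, max_index=255):
    """Total coding length of Eq. (11): sum of -log2 P(E0), with the
    normalizing factor chosen so the probabilities sum to 1."""
    w = gg_weights(c, sigma, max_index)
    total = sum(w)
    return sum(-math.log2(w[e] / total) for e in error_indexes)

# Smaller error indexes are cheaper to code:
print(coding_length_bits([0, 0, 1]) < coding_length_bits([5, 6, 7]))  # True
```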
Fig 2. Test image set (8-bit grayscale): (a) camera, (b) couple, (c) noisesquare, (d) airplane, (e) baboon, (f) lena, (g) lennagrey, (h) peppers, (i) shapes, (j) balloon, (k) barb, (l) barb2, (m) goldhill.

4. EXPERIMENTAL RESULTS

In this section, experimental results are presented to demonstrate the validity and effectiveness of the proposed scheme. First, the coding rates of the proposed scheme, MRP, TMW [7], CALIC, and JPEG-LS on common grayscale test images (see Fig 2) are tabulated in Table 1. A special case among these test images is "shapes", which is a computer-generated image, while the others are natural images. From Table 1, we find that the performance (coding rate) of MRP is always the best among MRP, TMW, CALIC, and JPEG-LS, except on the test image "shapes". Compared with the coding rates of MRP, the proposed scheme further reduces the coding rate by 0.01 to 0.02 bpp on most of the test images.

Table 1. Coding rates (bpp) of MRP, TMW, CALIC, JPEG-LS, and the proposed scheme for the test images. (table entries not recovered)

In the second part of the experiments, the residual entropies of the proposed method and of MRP in the first ten iterations are compared. Fig 3 illustrates the decreasing trend of the residual entropy over the first ten iterations of predictor initialization for both the proposed scheme and MRP. The proposed scheme always exhibits a lower residual entropy at the starting iteration, even for the image "balloon" (Fig 3(d)). For "balloon", although the final residual entropy of the proposed scheme is slightly inferior to that of MRP after the default number of iterations (or after convergence), the initial residual entropy (in the first iteration) of the proposed scheme is […] a limited number of iterations, while preserving low coding rates. For example, the initial residual entropy almost catches up with the final residual entropy for the test image "baboon" (Fig 3(b)). Hence, satisfactory […]

Fig 3. Coding entropy (bpp) versus iteration (1 to 10) for the proposed scheme and the latest MRP. (curves not recovered)

(Caption not recovered.) Image | MRP | Proposed scheme
airplane | 98.55% | 97.87%
baboon | 99.25% | 97.68%
balloon | 98.30% | 96.21%
barb | 97.46% | 92.99%
barb2 | 98.04% | 96.08%

6. REFERENCES