Lossless Data Hiding Using Histogram Shifting Method Based On Integer Wavelets
Guorong Xuan1, Qiuming Yao1, Chengyun Yang1, Jianjiong Gao1, Peiqi Chai1
Yun Q. Shi2, Zhicheng Ni2
1Dept. of Computer Science, Tongji University, Shanghai, P. R. China
[email protected]
2Dept. of Electrical & Computer Engineering, New Jersey Institute of Technology
Abstract. This paper¹ proposes a histogram shifting method for lossless image
data hiding in the integer wavelet transform domain. The algorithm hides data in
the wavelet coefficients of the high frequency subbands. It shifts a part of the
histogram of the high frequency wavelet subbands and embeds data by using
the histogram zero-point thus created. This shifting process may be carried out
sequentially if necessary. A histogram modification technique is applied to prevent
overflow and underflow. The performance of the proposed technique, in terms
of data embedding payload versus visual quality of the marked images, is
compared with that of existing lossless data hiding methods implemented in
the spatial domain, the integer cosine transform domain, and the integer wavelet
transform domain. The experimental results demonstrate the superiority
of the proposed method over the existing methods: it achieves a larger
embedding payload at the same visual quality (measured by PSNR, the peak
signal-to-noise ratio) or a higher PSNR at the same payload.
1 Introduction
This paper focuses on lossless image data hiding, which requires not only correct
retrieval of the hidden data but also recovery of the original cover image from the
marked image without any distortion.
Recently, Ni et al. [1,2] proposed a lossless image data hiding algorithm using
pairs of zero-points and peak-points, in which a part of the image histogram is
shifted to embed data. Independently, Leest et al. [3] proposed a similar method.
However, both methods are implemented in the spatial domain. It is
well known that the histogram distribution varies dramatically from image to image.
1 This research is supported partly by National Natural Science Foundation of China (NSFC)
on the project “The Research of Theory and Key Technology of Lossless Data Hiding
(90304017)”.
Consequently, it is hard for these two methods to achieve a high data embedding
payload (often referred to as capacity) together with reasonably high visual quality
(often measured by PSNR, the peak signal-to-noise ratio). Since the wavelet
coefficients of high frequency subbands have a Laplacian-like distribution, meaning
that the histogram has a high peak around zero and small magnitudes on both sides,
we propose to apply the histogram shifting technique in the wavelet domain. Because
of the losslessness requirement, we chose to work in the integer wavelet transform
domain.
During the shifting of histograms of high-frequency integer wavelet subbands, the
overflow (e.g., the pixel grayscale value exceeding 255 for an 8-bit image) and/or
underflow (e.g., the pixel grayscale value below 0 for an 8-bit image) may take place,
thus violating the losslessness requirement. In order to overcome overflow and/or
underflow, the histogram modification technique is adopted, which has been used in
our previous works on image lossless data hiding using integer wavelet transform [4,
5, 6, 7, 8].
Experimental works have been conducted to compare the performance of this
proposed new technique with that of the existing techniques [1, 2, 7, 9, 10], showing
the superiority of the proposed technique.
The rest of the paper is organized as follows. The integer wavelet transform and
histogram modification are introduced in Section 2. In Section 3, the algorithm of
wavelet histogram shifting is presented. Some experimental results are reported in
Section 4. Conclusions are drawn in Section 5.
Since it is required to reconstruct the original image with no distortion, we use the
integer lifting scheme wavelet transform in this framework. Specifically, we adopt the
CDF(2,2) and similar series used in JPEG2000 standard [11]. Table 1 below lists the
forward and inverse transform of CDF(2,2) integer wavelet transform.
Table 1. CDF(2,2) integer wavelet transform.
Forward transform
  Splitting:              s_i ← x_{2i};  d_i ← x_{2i+1}
  Dual lifting:           d_i ← d_i - {(s_i + s_{i+1})/2}
  Primal lifting:         s_i ← s_i + {(d_{i-1} + d_i)/4}
Inverse transform
  Inverse primal lifting: s_i ← s_i - {(d_{i-1} + d_i)/4}
  Inverse dual lifting:   d_i ← d_i + {(s_i + s_{i+1})/2}
  Merging:                x_{2i} ← s_i;  x_{2i+1} ← d_i
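The lifting steps of Table 1 can be illustrated with a minimal one-dimensional sketch. It rests on assumptions that the table does not state: the braces are read as rounding down (floor), the signal has an even number of samples, and the missing neighbours at the two ends are replaced by repeating the edge sample. The function names cdf22_forward and cdf22_inverse are hypothetical. Because the inverse subtracts exactly the integer quantities that the forward transform added, the round trip is exact whichever rounding rule is chosen.

```python
import numpy as np

def cdf22_forward(x):
    """One level of 1-D integer lifting following Table 1 (floor rounding assumed)."""
    x = np.asarray(x, dtype=np.int64)
    if x.size % 2:
        raise ValueError("this sketch expects an even number of samples")
    s, d = x[0::2].copy(), x[1::2].copy()      # splitting
    s_next = np.append(s[1:], s[-1])           # s_{i+1}; edge sample repeated (assumption)
    d -= (s + s_next) // 2                     # dual lifting
    d_prev = np.insert(d[:-1], 0, d[0])        # d_{i-1}; edge sample repeated (assumption)
    s += (d_prev + d) // 4                     # primal lifting
    return s, d

def cdf22_inverse(s, d):
    """Undo cdf22_forward step by step, in reverse order."""
    s = np.array(s, dtype=np.int64)
    d = np.array(d, dtype=np.int64)
    d_prev = np.insert(d[:-1], 0, d[0])
    s -= (d_prev + d) // 4                     # inverse primal lifting
    s_next = np.append(s[1:], s[-1])
    d += (s + s_next) // 2                     # inverse dual lifting
    x = np.empty(s.size + d.size, dtype=np.int64)
    x[0::2], x[1::2] = s, d                    # merging
    return x
```

Applying the same pair of functions along the rows and then the columns of an image would give a two-dimensional decomposition; that extension is not shown here.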
After the integer wavelet transform, the image is decomposed into four subbands. We
embed the information into the three high frequency subbands.
For a given image, after data embedding in some IWT coefficients, it is possible to
cause overflow/underflow, which means that after inverse wavelet transform the
grayscale values of some pixels in the marked image may exceed the upper bound
(255 for an eight-bit grayscale image) and/or the lower bound (0 for an eight-bit
grayscale image). In order to prevent the overflow/underflow, we adopt histogram
modification, which narrows the histogram from both sides as shown in Figure 1.
Please refer to [8] for the detailed algorithm. The bookkeeping information will be
embedded into the cover media together with the information data.
[Figure 1. Histogram modification over the grayscale range 0 to 255: (a) parts to be merged; parts after merge.]
After integer wavelet transform, the histograms of high frequency subbands, referred
to as wavelet histogram in the rest of this paper, are calculated. There the horizontal
axis represents the wavelet coefficients’ value and the vertical axis the occurrence
numbers of the corresponding wavelet coefficients. As mentioned, Ni et al. [1, 2]
proposed the histogram shifting method in the spatial domain, while independently
Leest et al. [3] proposed the histogram gap function method in the spatial domain.
In the following discussion, we consider a simple example shown in Figure 2 to
demonstrate the principle of data embedding using histogram shifting. There, Figure 2
(a) is the original histogram of an integer wavelet high-frequency subband. In Figure
2 (b), a zero-point (a value that no coefficient in this subband assumes) has been
created next to the value Z. That is, we shift the part of the histogram with values
larger than Z towards the right-hand side by one unit, so that the original Z+1 value
becomes Z+2, the original Z+2 becomes Z+3, and so on, leaving the value Z+1
unoccupied. The part of the histogram with values less than or equal to Z remains
unchanged.
(a) (b)
Fig.2. An example showing how a zero point is generated: (a) original histogram (b) histogram
after a zero point is created.
coefficients having value “Z” in the histogram. Note that the sequence in which the
wavelet coefficients are encountered in data embedding can be controlled by using a
key in order to make the hidden data secure. If the number of to-be-embedded bits is
large, multiple zero-points and the corresponding shifts are usually needed to
accommodate the payload.
The process of data embedding and data extraction illustrated above is
summarized below. We first shift the part of the histogram shown in Figure 2 starting
from value “Z+1” towards the right-hand side by one unit, leaving the value “Z+1”
empty, i.e., creating a zero-point at “Z+1” in the histogram. Then, according to the
to-be-embedded bit sequence, we either keep a coefficient having value “Z”
unchanged (if embedding a bit “0”) or change it from value “Z” to value “Z+1” (if
embedding a bit “1”). During data retrieval, we extract a bit “0” from each coefficient
having value “Z” and a bit “1” from each coefficient having value “Z+1”, and we
reduce the value of the latter from “Z+1” back to “Z”. After all the hidden bits have
been extracted, we shift the part of the histogram larger than “Z+1” towards the
left-hand side by one unit to recover the original histogram.
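The single-shift procedure just described can be sketched as follows. This is only an illustration under stated assumptions: the coefficients are given as a one-dimensional integer sequence, they are visited in plain scan order rather than in a key-controlled order, and the extractor is told how many bits were embedded. The names embed_at and extract_at are hypothetical.

```python
import numpy as np

def embed_at(coeffs, bits, Z):
    """Create a zero-point at Z+1 and embed bits into the coefficients equal to Z."""
    c = np.array(coeffs, dtype=np.int64)
    c[c > Z] += 1                                  # shift right: Z+1 is now empty
    carriers = np.flatnonzero(c == Z)              # scan order (assumption: no key)
    if len(bits) > carriers.size:
        raise ValueError("not enough coefficients with value Z")
    # bit 0 keeps the value Z, bit 1 moves the coefficient to Z+1
    c[carriers[:len(bits)]] += np.asarray(bits, dtype=np.int64)
    return c

def extract_at(marked, Z, n_bits):
    """Read the bits back and restore the original coefficients."""
    c = np.array(marked, dtype=np.int64)
    carriers = np.flatnonzero((c == Z) | (c == Z + 1))[:n_bits]
    bits = (c[carriers] == Z + 1).astype(int).tolist()
    c[c == Z + 1] = Z                              # backfill the zero-point
    c[c > Z + 1] -= 1                              # shift the rest back to the left
    return bits, c

marked = embed_at([0, 1, 0, 2, 0, -1, 3], [1, 0, 1], Z=0)
bits, restored = extract_at(marked, Z=0, n_bits=3)
```

In this small example the marked sequence is [1, 2, 0, 3, 1, -1, 4], and extraction returns the bits [1, 0, 1] together with the original sequence.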
Since the histogram of the IWT high frequency subbands follows a Laplacian-like
distribution, the algorithm can embed data on both sides of the histogram alternately
until all the to-be-embedded bits are embedded. The proposed data embedding and
data extraction algorithms are presented below in detail.
Assume there are M bits which are supposed to be embedded into a high frequency
subband of IWT. We embed the data in the following way, as shown in Figure 3.
[Figure 3. Flowchart of data embedding: choose T; Peak = T; generate a zero-point and embed data; if finished, S = Peak; otherwise, if Peak > 0 then Peak = -Peak, else Peak = -Peak-1, and repeat.]
(1) Set a threshold T>0 such that the number of high frequency wavelet
coefficients in [-T,T] is greater than M, and set Peak=T.
(2) In the wavelet histogram, move the part of the histogram with values greater
than Peak to the right-hand side by one unit to leave a zero-point at the value
Peak+1. Then embed data using this zero-point.
(3) If there are to-be-embedded data remaining, let Peak = (-Peak), and move
the part of the histogram with values less than Peak to the left-hand side by one unit
to leave a zero-point at the value (-|Peak|-1). Then embed data using this zero-point.
(4) If all the data are embedded, stop and record the Peak value as the stop
peak value, S. Otherwise, let Peak = (-Peak-1) and go back to (2) to embed the
remaining to-be-embedded data.
Data extraction is the reverse process of data embedding. Assume the stop peak value
is S, the threshold is T. Figure 4 is the data extraction diagram.
[Figure 4. Flowchart of data extraction: Peak = S; extract data and backfill the zero-point; if not finished, then if Peak > 0 set Peak = -Peak-1, else set Peak = -Peak, and repeat.]
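The alternating embedding steps (1) to (4) and the reverse extraction can be sketched together as below. The sketch makes several assumptions that are not part of the paper: the subband is a one-dimensional integer sequence visited in scan order, no overflow/underflow bookkeeping is included, and the payload length M is passed to the extractor so that unused carriers at the stop peak are not read as extra bits. The names peak_sequence, embed and extract are hypothetical.

```python
import numpy as np

def peak_sequence(T):
    """Peaks visited by the embedder: T, -T, T-1, -(T-1), ..., 1, -1, 0."""
    peak = T
    while True:
        yield peak
        if peak == 0:
            return
        peak = -peak if peak > 0 else -peak - 1

def embed(coeffs, bits, T):
    """Embed bits peak by peak; return the marked coefficients and the stop peak S."""
    c = np.array(coeffs, dtype=np.int64)
    bits = np.asarray(bits, dtype=np.int64)
    pos = 0
    for peak in peak_sequence(T):
        step = 1 if peak >= 0 else -1              # shift direction for this peak
        outer = c > peak if step > 0 else c < peak
        c[outer] += step                           # zero-point at peak + step
        carriers = np.flatnonzero(c == peak)
        chunk = bits[pos:pos + carriers.size]
        c[carriers[:chunk.size]] += step * chunk   # bit 1 moves into the zero-point
        pos += chunk.size
        if pos == bits.size:
            return c, peak
    raise ValueError("payload does not fit in [-T, T]; increase T")

def extract(marked, S, T, M):
    """Walk the peaks backwards from S to T, reading bits and undoing each shift."""
    c = np.array(marked, dtype=np.int64)
    peaks = []
    for peak in peak_sequence(T):
        peaks.append(peak)
        if peak == S:
            break
    chunks = []
    for peak in reversed(peaks):
        step = 1 if peak >= 0 else -1
        carriers = (c == peak) | (c == peak + step)
        chunks.append((c[carriers] != peak).astype(int).tolist())
        c[c == peak + step] = peak                 # backfill the zero-point
        outer = c > peak if step > 0 else c < peak
        c[outer] -= step                           # undo the shift
    bits = [b for chunk in reversed(chunks) for b in chunk]
    return bits[:M], c
```

A call such as marked, S = embed(subband, payload, T=3) followed by extract(marked, S, 3, len(payload)) returns the payload and the original subband values. The stop peak S and the threshold T would have to reach the extractor as side information.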
The payload carried at a given Peak equals the number of IWT coefficients assuming
the Peak value. Once the threshold T is set, the algorithm embeds the
data in the range [-T, T]. Hence, the total payload is the total number of
coefficients assuming values in the range [-T, T]. For instance, if T=5, the
possible zero-points are 6, -6, 5, -5, 4, -4, …, ending with the one created for S,
where S is the stop peak value. From the threshold T and the stop value S, we can
calculate the total number of zero-points: 2*(T-|S|+1)-u(S), where u(.) is the unit
step function. If there are to-be-embedded data left when S=0, the range [-T, T] is
not wide enough and the threshold T should be increased. If the payload can be
accommodated, different values of T lead to different stop values S and different
PSNR of the marked image versus the original cover image. Table 2 shows that if
we embed 0.02 bpp (bits per pixel) of data into the Lena image, the highest PSNR is
achieved when T=3. Hence we should choose the T that gives the highest PSNR.
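The zero-point count 2*(T-|S|+1)-u(S) can be checked against a direct enumeration of the peaks visited between T and S. The sketch below assumes the convention u(S)=1 for S≥0 and u(S)=0 otherwise, which is the reading that agrees with the peak order T, -T, T-1, -(T-1), …, 0 of the embedding steps; the function names are hypothetical.

```python
def zero_point_count(T, S):
    """Closed form from the text, with u(S) = 1 for S >= 0 (assumed convention)."""
    u = 1 if S >= 0 else 0
    return 2 * (T - abs(S) + 1) - u

def peaks_visited(T, S):
    """Enumerate the peaks T, -T, T-1, ... up to and including the stop peak S."""
    peaks, peak = [], T
    while True:
        peaks.append(peak)
        if peak == S:
            return peaks
        if peak == 0:
            raise ValueError("S is not reachable from T")
        peak = -peak if peak > 0 else -peak - 1

# One zero-point is created per visited peak, so the two counts should agree.
for T in range(1, 7):
    for S in range(-T, T + 1):
        assert len(peaks_visited(T, S)) == zero_point_count(T, S)
```

For T=5 and S=-4, for example, the visited peaks are 5, -5, 4, -4, which gives four zero-points, matching 2*(5-4+1)-0.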
Experiments on six frequently used images, i.e., Lena, Baboon, Airplane, House,
Peppers, Sailboat are reported here. From Figure 5, we can find that the visual quality
of the marked images is still acceptable when 131k bits are embedded into these
grayscale images of 512x512x8, i.e., the embedding payload is 0.5 bpp.
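The payload and quality figures used in the experiments follow from two standard definitions, sketched below as a convenience. The helper names payload_bits and psnr are hypothetical, and the PSNR is the usual definition for 8-bit images, 10*log10(255^2/MSE), which is assumed here to be the one behind the reported values.

```python
import numpy as np

def payload_bits(bpp, height, width):
    """Number of embedded bits for a given payload in bits per pixel."""
    return round(bpp * height * width)

def psnr(original, marked, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = np.asarray(original, dtype=np.float64) - np.asarray(marked, dtype=np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")                        # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))

bits = payload_bits(0.5, 512, 512)                 # 131072 bits, i.e. about 131k
```

With these definitions, a 512x512 image at 0.5 bpp carries 131072 bits, which is the 131k figure quoted above.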
Fig.5. PSNR of marked images with a payload of 0.5bpp: (a) Lena: 41.07 dB,
(b) Baboon: 31.18 dB, (c) Airplane: 42.71 dB, (d) House: 40.90 dB, (e) Peppers:
39.71 dB, (f) Sailboat: 36.91 dB.
Tables 3 and 4 show the PSNR for different payloads in the images Lena and
Baboon. They show that an increase of the threshold T does not always lead to an
increase of the payload. When the payload is smaller, we can choose a larger threshold
T. Hence fewer coefficients are changed during data embedding and the resultant
PSNR is higher. When the payload is larger, the total number of needed zero-points
and the threshold T also need to be larger.
Figure 6 compares the performance of our method with that of several other
state-of-the-art lossless data hiding methods, including Ni et al.’s method [1,2],
Tian’s Difference Expansion [9], Companding on integer DCT [10], and Threshold
Embedding [7]. Note that the performance in terms of PSNR versus data embedding
payload achieved by threshold embedding [7] is superior to that achieved by the
methods reported in [4, 5, 6]. This indicates that the method proposed in this paper
has the best performance in terms of PSNR versus payload among these prior
methods.
[Figure 6 plot: PSNR (dB) versus payload size (bpp); curves: the proposed method, Difference Expansion, Companding on DCT, Threshold Embedding, Ni's method.]
Fig.6. Performance comparison of the histogram shifting method vs. several other
state-of-the-art lossless data hiding methods on the Lena image.
Tables 5 and 6 give the detailed comparison with Ni et al.’s method. Table 5
shows that our method has a higher PSNR at the same payload. Table 6 shows
that the payload of our method is about four times that of Ni et al.’s method at the
same PSNR. Hence, our method has better performance than Ni et al.’s method.
5 Summary
This paper proposed a novel lossless data hiding method based on histogram
shifting, integer wavelet transform and histogram modification. The experimental
results and theoretical analysis show that the proposed method performs better
than similar methods in the spatial domain, the integer DCT domain and the integer
wavelet domain. The proposed method has a larger payload at the same PSNR.
In particular, it achieves a very high PSNR when the payload is small.
References
1. Z. Ni, Y. Q. Shi, N. Ansari and W. Su: Reversible Data Hiding. IEEE International
Symposium on Circuits and Systems (ISCAS03), May 2003, Bangkok, Thailand.
2. Z. Ni, Y. Q. Shi, N. Ansari and W. Su: Reversible data hiding. IEEE Transactions on
Circuits and Systems for Video Technology, vol. 16, no. 3, pp. 354-362, March 2006.
3. A. Leest, M. Veen, and F. Bruekers: Reversible Image Watermarking. IEEE Proceedings
of ICIP’03, vol.2, pp.731-734, September 2003.
4. G. Xuan, J. Zhu, J. Chen, Y. Q. Shi, Z. Ni and W. Su: Distortionless Data Hiding Based on
Integer Wavelet Transform. IEE Electronics Letters, vol. 38, no. 25, pp. 1646-1648,
December 2002
5. G. Xuan, Y. Q. Shi, Z. Ni: Lossless Data Hiding Using Integer Wavelet Transform and
Spread Spectrum. IEEE International Workshop on Multimedia Signal Processing
(MMSP04), Siena, Italy, September 2004.
6. G. Xuan, Y. Q. Shi, Z. Ni: Reversible Data Hiding Using Integer Wavelet Transform and
Companding Technique. Proceedings of International Workshop on Digital Watermarking
(IWDW04), Korea, October 2004
7. G. Xuan, Y. Q. Shi, C. Yang, Y. Zheng, D. Zou, P. Chai: Lossless data hiding using
integer wavelet transform and threshold embedding technique. IEEE International
Conference on Multimedia and Expo (ICME05), Amsterdam, Netherlands, July 2005.
8. G. Xuan, C. Yang, Y. Q. Shi and Z. Ni: High Capacity Lossless Data Hiding Algorithms.
IEEE International Symposium on Circuits and Systems (ISCAS04), Vancouver, Canada,
May 2004.
9. J. Tian: Reversible Data Embedding Using a Difference Expansion. IEEE Transactions on
Circuits and Systems for Video Technology, Aug. 2003, 890-896.
10. B. Yang, M. Schmucker, W. Funk, C. Busch, S. Sun: Integer DCT-based Reversible
Watermarking for Images Using Companding Technique. Proceedings of SPIE, Security
and watermarking of Multimedia Content, Electronic Imaging, San Jose (USA), 2004
11. M. Rabbani and R. Joshi: “An Overview of the JPEG2000 Still Image Compression
Standard”, Signal Processing: Image Communication 17 (2002) 3–48.