

This article has been accepted for publication in IEEE Transactions on Multimedia. This is the author's version, which has not been fully edited; content may change prior to final publication. Citation information: DOI 10.1109/TMM.2022.3194990


Robust Coverless Image Steganography Based on Neglected Coverless Image Dataset Construction

Liming Zou, Jing Li, Wenbo Wan, Q. M. Jonathan Wu, Senior Member, IEEE, and Jiande Sun, Member, IEEE

Abstract—Most of the existing image selection-based coverless requires constructing an image dataset, and then the cover
image steganography methods mainly focus on improving the image is directly selected from the image dataset via the
capacity and robustness under the assumption that the mapping rule according to the secret message. For the sake of
corresponding dataset is available. But they ignore how to brevity, the constructed dataset used for CIS is called the
successfully construct the coverless image dataset, which is the
coverless image dataset (CID). In addition, the secret message
foundation of such methods and has a critical impact on the
capacity. In this paper, a coverless image steganography is is used to represent the bits that an image can hide, which is
proposed that considers how to efficiently construct the coverless different from the secret information mentioned below in this
image dataset. In the proposed method, the CNN-based deep hash paper. The secret information represents the bits that the sender
is extracted from the image and a specific mapping rule is designed intends to hide. In such a selection process, the cover image has
to map the high-dimensional deep hash to the low-dimensional not been modified, so image selection-based CIS can
secret message. In addition, an unsupervised clustering algorithm fundamentally resist the analysis of steganalysis tools. Zhou et
is adopted to construct the coverless image dataset, which makes al. [21] proposed image selection-based CIS via pixels of image
the construction of the coverless image dataset efficient and blocks, which is one of the pioneering methods in this field.
improves the robustness of the proposed steganography method.
Later, many image selection-based CIS methods have been
To our best knowledge, this is the first attempt to improve the
construction efficiency of the coverless image dataset in the field of proposed, such as CIS based on SIFT features [22], CIS based
coverless image steganography. Experimental results show that on SIFT and Bag-of-Features [23], CIS based on the average
the construction of a large coverless image dataset is feasible and pixel value of sub-images [24], and others [25-26]. Most of the
reliable, and the proposed method has better robustness and above-mentioned methods focused on the capacity, while the
higher dataset utilization rate compared with the state-of-the-art robustness is less discussed. However, with the extension of
methods. image selection-based CIS in practical applications, how to
Index Terms—Coverless image steganography, coverless image improve the robustness has attracted increasing attention.
dataset, efficient construction, high robustness. Compared with the secret channel, the image may be subjected
to a variety of processing in the public channel. For example,
I. INTRODUCTION the image may be compressed or scaled to improve the
N recent years, with the development of digital multimedia efficiency of image transmission via social Apps, such as
I and computer network security, more attention has been paid
to information hiding. Traditional information hiding
WeChat. Furthermore, the accuracy of the secret message
extraction will be decreased due to such image processing.
technology embeds data into the carrier for covert Therefore, some researchers are shifted to improve the
communication [1]. The carrier of information hiding can be robustness recently [27-31]. The representative works include
multimedia data, including text [2], image [3], audio [4], video LDA_DCT [29], DenseNet_DWT [30], and CI_CIS [31].
[5], etc. Since the image is the most widely used one today, These three methods can significantly improve the robustness
more researchers focus on the field of image steganography [6]. as compared with the previous methods.
Most of the traditional image steganography methods are based Most of the image selection-based CIS mentioned above only
on the spatial domain [7-14] or frequency domain [15-20], and focus on capacity and robustness and ignore the successful
the secret message is embedded into the image. Thus, the image construction of the CID. In fact, constructing the corresponding
will be modified and there are some modification traces left by CID is a challenging issue for the above CIS methods. In the
such embedding operations. Furthermore, these modification mapping processes of such methods, the lengths of the image
traces left in the image may be detected out successfully via feature and the secret message are restricted to be equal. Thus,
some specific steganalysis tools. the numbers of different secret messages and image feature
To resist steganalysis detection fundamentally, image sequences are the same. To hide all secret messages, the
selection-based coverless image steganography (CIS) is constructed CID needs to contain all the feature sequences.
proposed in recent years. Image selection-based CIS, firstly, However, it is hard to construct a CID that contains all the

This work was supported in part by the Scientific Research Leader Studio of Jinan (No. 2021GXRC081), the Joint Project for Smart Computing of Shandong Natural Science Foundation (ZR2020LZH015), and the Taishan Scholar Project of Shandong, China (No. ts20190924). (Corresponding authors: Jing Li; Jiande Sun.)
L. Zou, W. Wan, and J. Sun are with the School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China (e-mail: [email protected]; [email protected]; [email protected]).
J. Li is with the School of Journalism and Communication, Shandong Normal University, Jinan 250358, China (e-mail: [email protected]).
Q. M. Jonathan Wu is with the Department of Electrical and Computer Engineering, University of Windsor, Windsor N9B3P4, Canada (e-mail: [email protected]).
© 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See https://www.ieee.org/publications/rights/index.html for more information.

corresponding image features, especially when the length of the secret message is large. For example, when the length of the secret message is 32, the number of different image features covered by the constructed CID must be 2^32, which is extremely hard to achieve in practical applications. In some cases, such methods even fail to hide a 16-bit secret message because there are not enough images to index and use to build the CID. In addition, due to the application of the camouflage image, the CI_CIS [31] method needs to share the CID from the sender to the receiver. This aggravates the communication pressure between the sender and the receiver, and may bring the potential risk of CID leakage during transmission, which can affect the security of covert communication.

To decrease the difficulty of constructing the CID, a novel coverless image steganography based on efficient coverless image dataset construction is proposed in this manuscript. Different from most of the existing CIS methods, the length of the image feature is unequal to the length of the secret message in the proposed method. Specifically, the convolutional neural network (CNN) provided in DCMH [32], which is called DCMH-CNN in this paper, is adopted to extract deep hashes as the high-dimensional image features. Then, a mapping rule is designed to connect the high-dimensional image feature with the low-dimensional secret message, which is conducive to successfully constructing a large CID. Finally, a construction algorithm based on unsupervised clustering is proposed, which makes the construction of a CID easier. With the help of the unsupervised clustering algorithm, the distances in the feature domain among the images of the CID are relatively large, which is conducive to improving the robustness of the proposed method. Experiments show the efficiency of constructing the CID and better robustness compared with other state-of-the-art methods. The main contributions of the proposed method are as follows.

⚫ A construction algorithm for the CID based on unsupervised clustering is proposed, and a dataset utilization rate for measuring the construction efficiency is defined in this paper. The proposed method can greatly improve the construction efficiency and the dataset utilization rate compared with state-of-the-art CIS methods. In addition, the construction of the CID can be duplicated entirely at the sender and receiver, which avoids the communication pressure of a large CID and the potential risk of information leakage during transmission.

⚫ Our method breaks the limitation of previous methods that the lengths of the image feature and the secret message must be equal, and designs a mapping rule to connect high-dimensional image features with low-dimensional secret messages. The proposed method provides great convenience for constructing a large CID and further improves the capacity.

⚫ An unsupervised clustering algorithm is adopted to construct the CID. Images with large distances in the feature domain can be selected to construct the CID, which can better resist image processing. This helps improve the robustness of the proposed CIS.

The rest of this paper is organized as follows: Related work is described in Section II. The proposed CIS is introduced in Section III. Experimental results and analysis are given in Section IV. Finally, Section V concludes this paper.

II. RELATED WORK

In general, the proposed CIS methods can be divided into two categories according to the way of obtaining the cover images: image generation-based CIS and image selection-based CIS.

A. Image generation-based CIS

In image generation-based CIS, the cover image is generated directly by a certain generation model to hide the secret message. The generated image usually does not exist in the real world.

Liu et al. [33] proposed a CIS based on the generative adversarial network (GAN), which is one of the pioneering methods in the field of image generation-based CIS. After that, some image generation-based CIS methods were proposed successively. For example, Duan et al. [34] proposed a CIS based on the Wasserstein GAN (WGAN). Yang et al. [35] proposed a CIS based on the autoregressive model. Duan et al. [36] proposed a CIS based on the improved Wasserstein GAN (WGAN-GP). Cao et al. [37] proposed a CIS based on the generation of anime characters. However, there is great room for improvement in the robustness of the above methods. To improve the robustness of image generation-based CIS, Li et al. [38] proposed an encrypted CIS based on Cycle-GAN, and Chen et al. [39] proposed a CIS based on StarGAN. The irreversibility of some generation models with respect to secret information extraction is a challenging problem. To avoid this challenge, the above-mentioned methods usually build an attribute-secret data mapping table; these attributes are then used to guide the generation of cover images and to map to secret messages, which limits the capacity of the above-mentioned methods. To address the irreversibility problem and improve the capacity, Peng et al. [40] proposed a CIS based on WGAN-GP and gradient descent approximation. Zhou et al. [41] proposed a secret-to-image reversible transformation scheme based on the Glow model.

Though image generation-based CIS has been a promising technique to resist steganalysis detection, it cannot resist steganalysis detection fundamentally according to the above-related literature. In addition, since the generated images usually do not exist in the real world, they may be identified by state-of-the-art deepfake-detection technologies. Therefore, image generation-based CIS methods are not the focus of this paper.

B. Image selection-based CIS

In image selection-based CIS, the secret message is represented directly by the image selected from an existing dataset via a mapping rule, rather than embedded into the image. Thus, there is no modification of the image.

Zhou et al. [21] proposed an image selection-based CIS method using a mapping rule in 2015. They constructed a CID by collecting many natural images, and then hash sequences are generated from the image block pixels based on a hash algorithm. Next, the hash sequences and secret messages are used to establish the mapping rule. Finally, the cover images can be indexed from the CID by the mapping rule. Since then, many image selection-based CIS methods have been proposed. For example, Zheng et al. [22] proposed a CIS method based on image hashing. The orientation information of the SIFT feature points is first adopted to calculate the image hash, which is used to guide the construction of the CID. Then, the


image hash and secret message are used to establish the mapping rule. Finally, the cover image can be indexed from the CID by the mapping rule. Yuan et al. [23] proposed a CIS method based on Bag-of-Features and SIFT. Firstly, Bag-of-Features is used to cluster images, and then the SIFT features of the image are extracted to obtain a hash sequence and to guide the construction of the CID. Finally, the mapping rule between hash sequence and secret message is established to index the cover images from the CID. To further improve the capacity, Zou et al. [24] proposed a CIS method based on the average pixel values of sub-images. The relationship between the average pixel values of the sub-images is transformed into the hash sequence, which guides the construction of the CID. Then, the hash sequence is used to map the secret message. Finally, the cover image can be indexed from the CID by the mapping rule. These image selection-based CIS methods make a certain contribution to avoiding image modification and achieve a certain hiding capacity. However, the robustness of these methods is limited, and the successful construction of the CID is neglected. Besides, the length of the image features used in these methods is equal to that of the secret message, which makes it more difficult to construct a large CID. Considering practical applications, many robust image selection-based CIS methods have been proposed. At present, LDA_DCT [29], DenseNet_DWT [30], and CI_CIS [31] are some of the most representative works.

LDA_DCT: Zhang et al. proposed a robust image selection-based CIS via latent Dirichlet allocation (LDA) topic classification and the discrete cosine transform (DCT). In this method, the LDA topic model is adopted to classify the image dataset based on Bag-of-Features. Then, the images belonging to one topic are selected as candidate images. Finally, DCT is performed on these candidate images to construct a CID, and the DCT coefficients are utilized as features for mapping secret messages. Different from previous CIS methods, which construct the CID directly from the original image dataset, Zhang et al. creatively divided CIS into two processes: selecting the candidate images from the original image dataset, and constructing the CID from the candidate images to map the secret messages. Zhang's work is groundbreaking and achieves high robustness. However, as in the previous methods, the lengths of the image feature and the secret message are limited to be equal. Besides, to ensure that the images belonging to one category can hide all secret messages, the number of candidate images in each category needs to be large enough, which aggravates the difficulty of constructing the CID.

DenseNet_DWT: Inspired by LDA_DCT, Liu et al. proposed an image selection-based CIS according to DenseNet features and the discrete wavelet transform (DWT). In Liu's method, DenseNet is used to retrieve the candidate images, which belong to one category. Then, DWT is performed on these candidate images to construct the CID. Based on the DenseNet model and DWT features, Liu et al. further improved the robustness of CIS. However, because the idea is similar to that of LDA_DCT, it is still extremely difficult to construct the CID.

CI_CIS: Different from the previous methods that directly transmit the cover image to the receiver, Liu et al. provided the idea of transmitting a similar camouflage image instead of the cover image. This method avoids relying on robust mapping rules and provides a new research direction for further improving robustness. However, because of the application of the camouflage image, the CI_CIS method needs to share the CID from the sender to the receiver, which aggravates the communication pressure and brings the potential risk of CID leakage during transmission. Besides, due to the relationship between the camouflage image and the cover image, this method also needs to build a multi-class CID and to ensure that the images in each category can hide all secret messages. As in LDA_DCT and DenseNet_DWT, the lengths of the image feature and the secret message are also limited to be equal. Thus, the difficulty of constructing the CID still exists.

To sum up, in the existing image selection-based CIS methods, how to reduce the difficulty of constructing the CID still needs to be solved. In this paper, we propose an image selection-based CIS that constructs the CID efficiently. Therefore, the CIS methods discussed in the following parts of this paper are the image selection-based ones.

III. THE PROPOSED METHOD

A. Overview

The framework of the proposed method is shown in Fig. 1. In Fig. 1, the part above the black dotted line is the process of secret information hiding, and the part below the black dotted line is the process of secret information extraction. The framework includes five modules, i.e., deep hash extraction, CID construction, mapping, secret information hiding, and secret information extraction.

Firstly, for a given original image dataset, the deep hashes are extracted. Then a clustering algorithm, i.e., the K-Means algorithm, is used to cluster these deep hashes into 𝐾 clusters. The images whose deep hashes are the same as the 𝐾 clustering centroids are indexed to construct the CID. Because the CID is also used to extract the secret information at the receiver side, a special way is designed that allows the sender and receiver to generate the same CID independently, instead of sharing the CID between the sender and receiver. Finally, a mapping rule is established to map the deep hashes to secret messages. At the sender side, the sender divides the secret information into several secret messages and first calculates the deep hashes corresponding to the secret messages according to the mapping rule. Then, the images corresponding to the deep hashes are indexed from the CID as cover images. Finally, the cover images are sent to the receiver. At the receiving end, the receiver gets the cover images and uses the same feature extractor to extract their deep hashes. Then, it calculates the distances between these cover images and the CID in the feature domain, and indexes from the CID the images with the greatest similarity to these cover images. Finally, the images indexed from the CID are mapped to secret messages according to the mapping rule, and these secret messages are concatenated to recover the secret information.

B. Deep Hash Extraction

In this paper, the DCMH-CNN is utilized as the feature extractor to extract the deep hashes of images. The detailed configuration of this CNN is shown in Table I. The first seven layers are the same as those in CNN-F [42]. For the eighth layer of this CNN structure, we set the dimension of the deep hash as 𝑙 = 4096.


Fig. 1. The framework of the proposed coverless image steganography.

That is to say, the final extracted deep hash is 4096-dimensional. In addition, unlike DCMH, we use different functions to map the 4096-dimensional deep feature extracted from the first seven layers to the 4096-dimensional hash output from the eighth layer.

TABLE I
CONFIGURATION OF THE DCMH-CNN

Layer   Configuration
Conv1   f. 64×11×11; st. 4×4, pad 0, LRN, ×2 pool
Conv2   f. 256×5×5; st. 1×1, pad 2, LRN, ×2 pool
Conv3   f. 256×3×3; st. 1×1, pad 1
Conv4   f. 256×3×3; st. 1×1, pad 1
Conv5   f. 256×3×3; st. 1×1, pad 1, ×2 pool
Full6   4096
Full7   4096
Full8   Hash code length 𝑙

For a given original dataset with 𝑚 images, denoted as

𝑋 = {𝑥1 , 𝑥2 , … , 𝑥𝑚 }, (1)

the deep feature of one image 𝑥𝑘 (1 ≤ 𝑘 ≤ 𝑚) is first extracted according to the first seven layers of DCMH-CNN and denoted as

𝐹𝑥𝑘 = {𝑓1 , 𝑓2 , … , 𝑓𝑙 }. (2)

Then, the deep feature 𝐹𝑥𝑘 is normalized to 𝐹𝑥′𝑘 via Min-Max normalization:

𝑓𝑡′ = (𝑓𝑡 − 𝑓𝑚𝑖𝑛 ) / (𝑓𝑚𝑎𝑥 − 𝑓𝑚𝑖𝑛 ), 𝑡 = 1, 2, … , 𝑙, (3)

where 𝑓𝑚𝑖𝑛 is the minimum value and 𝑓𝑚𝑎𝑥 is the maximum value over all dimensions. It is worth mentioning that "all dimensions" here refers to the feature values of all different dimensions of one image, i.e., {𝑓1 , 𝑓2 , … , 𝑓𝑙 }. After normalization, the normalized deep feature is obtained as

𝐹𝑥′𝑘 = {𝑓1′ , 𝑓2′ , … , 𝑓𝑙′ }. (4)

All the values in 𝐹𝑥′𝑘 are between 0 and 1. Finally, the normalized deep feature 𝐹𝑥′𝑘 is converted to the deep hash, which is denoted as

𝐻𝑥𝑘 = {ℎ1 , ℎ2 , … , ℎ𝑙 }, (5)

where

ℎ𝑡 = { 0, 𝑓𝑡′ ≤ 0.5; 1, 𝑓𝑡′ > 0.5 }, 1 ≤ 𝑡 ≤ 𝑙. (6)

The above process is repeated to extract the deep hashes of the entire original image dataset, which will be used to construct the CID later.

C. Coverless Image Dataset Construction

In Section III.B, the deep hashes of the original image dataset are extracted. In this section, the process of using these deep hashes to construct the CID is introduced. The K-Means algorithm is adopted to cluster all deep hashes of the original image dataset into 𝐾 clusters. Because the distances between the 𝐾 clustering centroids are relatively large, an image is selected to construct the CID if its deep hash is the same as a clustering centroid. Large distances between the images of the CID help bring out the good robustness of the corresponding steganography method.

Compared with K-Means, the K-Means++ algorithm designs a mechanism to select the initial clustering centroids. However, the first seed in this mechanism is selected randomly, so the initial clustering centroids selected by K-Means++ are random. This randomness cannot guarantee that the CIDs constructed by two separate K-Means++ runs are the same. As shown in Fig. 1, the CID is also used at the receiver. Therefore, it is necessary to ensure that the receiver can get the same CID. Under this condition, the K-Means++ algorithm is not suitable for our project. In this paper, we design a way to initialize the 𝐾 clustering centroids and define the same break conditions at the sender and receiver. For the original image dataset 𝑋, each image 𝑥𝑘 (1 ≤ 𝑘 ≤ 𝑚) has an 𝑙-dimensional deep hash 𝐻𝑥𝑘 .
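As a concrete illustration of the deep hash extraction in Section III.B, the following sketch applies Min-Max normalization (Eqs. (3)-(4)) and the 0.5-threshold binarization (Eqs. (5)-(6)). The trained DCMH-CNN itself is not reproduced here, so `feature` is an assumed toy input with 𝑙 = 8 rather than a real 4096-dimensional network output.

```python
import numpy as np

def deep_hash(deep_feature):
    """Min-Max normalize a deep feature vector (Eqs. 3-4), then
    binarize each dimension at 0.5 (Eqs. 5-6) to get the deep hash."""
    f_min, f_max = deep_feature.min(), deep_feature.max()
    normalized = (deep_feature - f_min) / (f_max - f_min)
    return (normalized > 0.5).astype(np.uint8)

# Stand-in feature vector (l = 8 for brevity).
feature = np.array([0.2, 1.5, -0.3, 0.9, 2.1, 0.0, 1.2, 0.4])
print(deep_hash(feature))  # -> [0 1 0 0 1 0 1 0]
```

Because the Min-Max rescaling is monotonic, each bit simply records whether that dimension lies in the upper half of the feature's own value range.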


Firstly, the original image dataset is sorted according to the values of the deep hashes. The sorted original image dataset becomes

𝑋′ = {𝑥1′ , 𝑥2′ , … , 𝑥𝑚′ }. (7)

The corresponding deep hashes 𝐻𝑥𝑘′ satisfy

𝑉(𝐻𝑥1′ ) < ⋯ < 𝑉(𝐻𝑥𝑘′ ) < ⋯ < 𝑉(𝐻𝑥𝑚′ ), (8)

where 𝑉(𝐻𝑥𝑘′ ) can be calculated by

𝑉(𝐻𝑥𝑘′ ) = ∑_{𝑡=1}^{𝑙} ℎ𝑡 × 2^{𝑙−𝑡}. (9)

Then, the original image dataset is divided into 𝐾 segments. Each segment has 𝑛𝑢𝑚𝑛 images, calculated by

𝑛𝑢𝑚𝑛 = { ⌊𝑚/𝐾⌋, 1 ≤ 𝑛 ≤ 𝐾 − 1; 𝑚 − (𝐾 − 1) × ⌊𝑚/𝐾⌋, 𝑛 = 𝐾 }, (10)

where ⌊∆⌋ returns the nearest integer smaller than or equal to ∆, and 𝑚 denotes the number of images in the original dataset. We assign ⌊𝑚/𝐾⌋ images to each of the first (𝐾 − 1) segments; the remaining images of the original image dataset are assigned to the 𝐾-th segment. If 𝑛𝑢𝑚𝑛 is an odd number, the deep hash of the middle image is selected as the initial clustering centroid; if 𝑛𝑢𝑚𝑛 is an even number, the deep hash of the image just before the middle is selected as the initial clustering centroid. In this way, 𝐾 initial clustering centroids are determined as

𝐶0 = {𝑐𝑗 }_{𝑗=1}^{𝐾} ∈ 𝑅^{𝐾×𝑙}. (11)

In addition, two break conditions of the K-Means algorithm are defined as follows:
a) Condition 1: The clustering centroids no longer change.
b) Condition 2: The number of iterations reaches the maximum.
Based on the two break conditions, the final 𝐾 clustering centroids can be determined as

𝐶0′ = {𝑐𝑗′ }_{𝑗=1}^{𝐾} ∈ 𝑅^{𝐾×𝑙}. (12)

Therefore, the sender and receiver can get the same 𝐾 clustering centroids 𝐶0′ and construct the same CID without sending the CID from sender to receiver, which avoids the risk of leakage in the transmission process.

D. Establishment of the Mapping Rule

In this section, a mapping rule is designed to relate the CID to the secret messages. As shown in Fig. 2, this mapping rule involves the CID, the corresponding 𝐾 deep hashes, and 𝐾 secret messages. The length 𝐿 of each secret message is calculated by

𝐿 = log₂ 𝐾. (13)

Fig. 2. The mapping rule.

Firstly, the CID and the corresponding deep hashes are sorted according to the values of the deep hashes. Then the 𝐾 secret messages are sorted according to their bit values; the sorting result is shown in the rightmost column of Fig. 2. Finally, the sorted hashes are mapped one-to-one to the secret messages. Based on this mapping rule, the sender can index the cover images from the CID according to the secret messages. These indexed cover images are denoted as

𝐶𝐼 = {𝑐𝑖1 , … , 𝑐𝑖𝑖 , … }, (14)

where 𝑐𝑖𝑖 denotes the 𝑖-th cover image. At the same time, the receiver can also recover the secret messages according to the extracted deep hashes of the cover images. As mentioned earlier, the CID can be constructed separately by both the sender and the receiver. Therefore, the mapping rule can also be established independently by the sender and the receiver, without sharing it from the sender to the receiver, thus reducing the risk of it being illegally obtained by a third party.

E. The Process of Secret Information Hiding

Secret information hiding is the procedure of selecting cover images from the CID based on the secret messages. The details of secret information hiding are as follows.
a). For the original image dataset, the sender extracts the deep hashes 𝐻𝑥𝑘 based on DCMH-CNN, as described in Section III.B.
b). The CID is constructed from the original image dataset based on K-Means, as described in Section III.C.
c). The mapping rule is established as described in Section III.D.
d). Suppose that there is secret information 𝑆 with length 𝑝. The secret information is first partitioned into 𝑔 secret messages, where 𝑔 is calculated by

𝑔 = { 𝑝/𝐿, if 𝑝 % 𝐿 = 0; ⌊𝑝/𝐿⌋ + 1, otherwise }. (15)

If the length 𝑝 is not divided exactly by 𝐿, several zeros are added at the end of 𝑆 to guarantee that the length of the last secret message is 𝐿, and these zeros are recorded.
e). For each secret message (𝑠1 , 𝑠2 , …), the corresponding cover image is indexed from the CID according to the mapping rule, as described in Section III.D.
f). These cover images, the zero-padding record, and the 𝐾 value are sent to the receiver in order.
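The partitioning in step d), i.e. Eq. (15) together with the zero padding, can be sketched as follows; the bitstring and the function name are illustrative only.

```python
def partition_secret(secret_info: str, L: int):
    """Split a bitstring of length p into g secret messages of
    length L (Eq. 15); zero-pad the last message and record how
    many zeros were added, as in step d)."""
    pad = (L - len(secret_info) % L) % L   # zeros needed for the tail
    padded = secret_info + "0" * pad       # `pad` is the zero-padding record
    messages = [padded[i:i + L] for i in range(0, len(padded), L)]
    return messages, pad

# p = 10, L = 4: g = floor(p/L) + 1 = 3 messages, with 2 recorded zeros.
print(partition_secret("1011001110", 4))  # (['1011', '0011', '1000'], 2)
```

The receiver later drops the final `pad` bits after concatenating the recovered messages, which is why the record must be transmitted (encrypted) alongside the cover images.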


that the zero-padding record and 𝐾 value are merged and encrypted by the stream cipher algorithm. The secret information hiding algorithm is described as Algorithm 1.

Algorithm 1: Secret information hiding.
Input: Original image dataset 𝑋 = {𝑥1, 𝑥2, …, 𝑥𝑚}; Secret information 𝑆.
Output: Cover images 𝐶𝐼 = {𝑐𝑖1, …, 𝑐𝑖𝑖, …}.
1. Extracting 𝐻𝑥𝑘 from 𝑋 based on DCMH-CNN.
2. Constructing the CID based on K-Means.
3. Establishing the mapping rule to relate the CID to the secret messages.
4. Adding several zeros at the end of 𝑆 to guarantee that 𝑆 can be divided exactly by 𝐿.
5. Partitioning 𝑆 into {𝑠𝑡}, 𝑡 = 1, …, 𝑔, with length 𝐿.
6. For 𝑖 = 1 … 𝑔, do
7.   Indexing 𝑐𝑖𝑖 via the mapping rule.
8. End
9. Encrypting the zero-padding record and 𝐾 value with the stream cipher algorithm as side information 𝑆𝐼.
10. Sending 𝐶𝐼 and 𝑆𝐼 to the receiver.

F. The Process of Secret Information Extraction

At the receiving end, the secret information 𝑆 can be extracted based on the cover images, the zero-padding record, and the 𝐾 value. The details of secret information extraction are as follows.
a). Decrypting the zero-padding record and the 𝐾 value according to the stream cipher algorithm.
b). The same CID as the sender's can be reconstructed based on the same original image dataset, DCMH-CNN, and K-Means.
c). The same mapping rule as the sender's can be established based on the same CID.
d). For each cover image, the receiver extracts its deep hash according to the DCMH-CNN.
e). The distances between the deep hashes of the cover images and those of the CID are calculated according to

𝑑 = ||𝐻𝑥𝑘 − 𝐶0′||. (16)

f). The images whose deep hashes are closest to those of the cover images are indexed from the CID to map the secret messages (𝑠1, 𝑠2, …) according to the mapping rule.
g). Connecting these secret messages in order, and then removing the redundant zeros according to the zero-padding record to get the correct secret information 𝑆.
The secret information extraction algorithm is described as Algorithm 2.

Algorithm 2: Secret information extraction.
Input: Original image dataset 𝑋 = {𝑥1, 𝑥2, …, 𝑥𝑚}; Cover images 𝐶𝐼 = {𝑐𝑖1, …, 𝑐𝑖𝑖, …}; Side information 𝑆𝐼.
Output: Secret information 𝑆.
1. Extracting the zero-padding record and 𝐾 value from 𝑆𝐼 by the stream cipher algorithm.
2. Reconstructing the same CID from 𝑋 via DCMH-CNN and K-Means.
3. For each 𝑐𝑖𝑖, do
4.   Extracting 𝐻𝑐𝑖𝑖 by DCMH-CNN.
5.   Indexing the image whose deep hash is closest to 𝐻𝑐𝑖𝑖 from the CID.
6.   Mapping this image to 𝑠𝑡.
7. End
8. Connecting these 𝑠𝑡 in order.
9. Removing the redundant zeros via the zero-padding record to get 𝑆.

IV. EXPERIMENTS

A. Datasets and Implementation Details

The experiments are conducted on four popular image datasets: INRIA Holidays [43], ImageNet [44], Caltech-256 [45], and NUS-WIDE [46]. The INRIA Holidays dataset includes 1491 images, all of which are used in the experiments. The ImageNet dataset includes more than 15 million images; a total of 2588 images are selected from the files "n01440764" and "n01514668" for the experiments. The Caltech-256 dataset contains 30608 images in 257 categories; in this paper, 903 images are selected from "001.ak47", "085.goat", "086.golden gate-bridge", "087.measuring", "089.goose", "090.gorilla", and "092.grapes". The NUS-WIDE dataset is a real-world dataset originally containing 269648 images; in this experiment, 120000 images are selected from the top 10 most frequent labels. The images selected from the above original datasets are consistent with the comparison methods. To avoid confusion and facilitate the subsequent description, the images selected from the ImageNet, Caltech-256, and NUS-WIDE datasets are recorded as ImageNet*, Caltech-256*, and NUS-WIDE*, respectively.

The dimension of the deep hash 𝑙 and the maximum number of K-Means iterations are set to 4096 and 100, respectively. In Sections IV.C.1) and IV.C.2), the capacity is set to 𝐿 = 8, and the corresponding cluster number is 𝐾 = 256. In Section IV.C.3), the capacity is set to 𝐿 ∈ {1, 2, …, 16}, and the corresponding cluster number is 𝐾 ∈ {2^1, 2^2, …, 2^16}. In the Holidays dataset, the image size is set to 512 × 512; in the ImageNet* and Caltech-256* datasets, it is set to 128 × 128; and in the NUS-WIDE* dataset, it is set to 256 × 256. These image sizes are consistent with the compared methods.

To verify the superiority of the proposed method, LDA_DCT [29], DenseNet_DWT [30], and CI_CIS [31] are adopted for comparison. For LDA_DCT and DenseNet_DWT, we reproduced their experiments to test the robustness instead of using their original data, due to the difference in image selection. For CI_CIS, however, the image selection process plays a vital role in improving its robustness, and it would be disadvantageous for CI_CIS to test its robustness using the same images selected by our method. Therefore, we adopt the original data reported for CI_CIS rather than reproducing its experiments. It is worth noting that the set of image processing operations tested by CI_CIS is limited. We use '-' to

represent the robustness under image processing operations not mentioned in CI_CIS.

TABLE II
COMPARISON ON THE CAPACITY AND THE NUMBER OF COVER IMAGES NEEDED FOR VARIOUS LENGTHS OF SECRET INFORMATION

Algorithm      𝑁𝑝: 1B   10B   100B   1KB    Capacity
LDA_DCT            2     7     55     548    15
DenseNet_DWT       2     7     55     548    15
CI_CIS             2     7     55     548    15
Our method         1     5     50     512    16

B. Capacity Analysis

The capacity of the existing CIS depends on the number of binary bits hidden in an image, that is, the length of the secret message mentioned in this paper. In our method, 𝐿 denotes the capacity, and it depends on the number of K-Means clusters 𝐾. For the NUS-WIDE* dataset, 𝐾 can be selected from {2^1, 2^2, …, 2^16}; accordingly, 𝐿 from 1 to 16 can be calculated by (13). It should be pointed out that if the original image dataset is larger and the computational power is stronger, 𝐿 can increase in theory. For the same length of secret information 𝑆, the larger 𝐿 is, the fewer cover images are needed. To hide secret information 𝑆 with length 𝑝, the number of cover images needed is

𝑁𝑝 = ⌈𝑝/𝐿⌉, (17)

where ⌈∆⌉ returns the nearest integer greater than or equal to ∆.

Different values can be set for 𝐿 in both the proposed method and the comparison methods. In this section, for convenience of presentation, the maximum value of 𝐿 is selected as the capacity for comparison, and the relationship between the number of cover images and the length of the secret information is also compared across methods. These comparison results are shown in Table II, where 'B' represents a byte (1B equals 8 bits). We give the comparison on the minimum numbers of cover images required when 1B, 10B, 100B, and 1KB of secret information need to be hidden under the corresponding capacities listed in the rightmost column. It is clear from Table II that our method improves the theoretical capacity by 6.67% compared with the other methods. Besides, an additional image is needed to represent the position information in these comparison methods. Therefore, our method needs fewer cover images than the comparison methods when hiding secret information of the same length. In addition to the theoretical capacity, we list a comparison of the actual capacities on the experimental image datasets (see Table VIII), which shows that the proposed method greatly improves the actual capacities on the corresponding image datasets compared with the state-of-the-art methods.

C. Robustness Analysis

When the cover images are sent to the receiver, they are likely to be attacked intentionally or unintentionally by some image processing algorithms, such as JPEG compression, Gaussian noise, etc. In CIS, how to resist such image processing, that is, how to improve the robustness, is an urgent problem to be solved. In this paper, the Holidays, ImageNet*, Caltech-256*, and NUS-WIDE* datasets are used for the robustness analysis. The image processing operations used in the robustness experiments are listed in Table III, which is consistent with the LDA_DCT, DenseNet_DWT, and CI_CIS methods. Some images after being attacked are shown in Fig. 3; the original image of Fig. 3 is selected from the Caltech-256* dataset.

In this experiment, the accuracy of secret message extraction is used to represent the robustness, which is consistent with the comparison methods. The accuracy of secret message extraction is defined as

𝑆𝐴𝑅 = (∑_{𝑠𝑔=1}^{𝑧} 𝑓𝑒(𝑏𝑠′𝑠𝑔) / 𝑧) × 100%,  𝑓𝑒(𝑏𝑠′𝑠𝑔) = { 1, if 𝑏𝑠′𝑠𝑔 = 𝑏𝑠𝑠𝑔; 0, otherwise }, (18)

where 𝑧 is the number of secret messages, 𝑏𝑠′𝑠𝑔 is the sg-th extracted secret message, and 𝑏𝑠𝑠𝑔 is the sg-th original secret message. When 𝑏𝑠′𝑠𝑔 = 𝑏𝑠𝑠𝑔, 𝑓𝑒(∙) is 1; otherwise, 𝑓𝑒(∙) is 0.

TABLE III
THE IMAGE PROCESSING USED IN THE COMPARISON EXPERIMENTS WITH LDA_DCT [29], DenseNet_DWT [30], AND CI_CIS [31]

Image processing              The specific parameters
JPEG compression              The quality factors 𝑄: 10%, 50%, 90%
Gauss noise                   The mean 𝜇: 0; the variance 𝜎: 0.001, 0.005
Salt and pepper noise         The mean 𝜇: 0; the variance 𝜎: 0.001, 0.005
Speckle noise                 The mean 𝜇: 0; the variance 𝜎: 0.01, 0.05
Gauss low-pass filtering      The window size: 3 × 3
Mean filtering                The window size: 3 × 3
Median filtering              The window size: 3 × 3
Centered cropping             Ratio: 20%, 50%
Edge cropping                 Ratio: 10%, 20%
Rotation                      Rotation angles: 10°, 30°, 50°
Translation                   In Holidays: (80, 50), (160, 100), (320, 200);
                              in ImageNet*, Caltech-256*, and NUS-WIDE*: (16, 10), (32, 20), (40, 25)
Scaling                       Ratio: 0.5, 0.75, 1.5, 3
Color histogram equalization  None
Gamma correction              Factor: 0.8

1) Robustness Comparison

The Holidays, ImageNet*, and Caltech-256* datasets are used to test the robustness in comparison with the LDA_DCT, DenseNet_DWT, and CI_CIS methods, consistent with the datasets used in those methods. All the images of the CID constructed from the Holidays, ImageNet*, and Caltech-256* datasets are used to hide secret messages. These images are then attacked by the image processing operations to test their robustness. The experimental results are shown in Tables IV to VI. It is worth mentioning that some of the image processing names in Tables IV to VI are abbreviated for convenience.

From Tables IV to VI, the proposed method achieves the best robustness on the three datasets against most image processing operations. Specifically, compared with the LDA_DCT and DenseNet_DWT methods, our method clearly achieves better robustness, especially against geometric processing such as cropping, rotation, and translation, to which the LDA_DCT and DenseNet_DWT methods are sensitive. Compared with low-level features such as DCT and DWT, the deep hash carries the semantic information of the image. The geometric processing

with smaller parameters will not destroy the semantic expression of the image (see (h), (i), (j), and (k) in Fig. 3); thus, the deep hash can achieve stronger robustness against geometric processing than the low-level features. Moreover, the clustering algorithm is adopted to construct the CID in this manuscript, which increases the distance between these images in the feature domain, so as to further improve the robustness against image processing. CI_CIS is one of the latest state-of-the-art steganography methods. With the help of camouflage image transmission, it can also achieve the same optimal robustness as our method against most processing. However, for noise processing, our method is clearly the better one. In addition, using the camouflage image in the CI_CIS method greatly increases the difficulty of constructing a CID, which will be compared in the following parts.

Fig. 3. Some samples of attacked images in Caltech-256*. (a) JPEG compression: Q = 90. (b) Gauss noise: μ = 0 and σ = 0.001. (c) Salt and pepper noise: μ = 0 and σ = 0.001. (d) Speckle noise: μ = 0 and σ = 0.01. (e) Gauss low-pass filtering: 3 × 3. (f) Mean filtering: 3 × 3. (g) Median filtering: 3 × 3. (h) Centered cropping: 20%. (i) Edge cropping: 10%. (j) Rotation: 10°. (k) Translation: (16, 10). (l) Scaling: 1.5. (m) Color histogram equalization. (n) Gamma correction: 0.8.

TABLE IV
ROBUSTNESS COMPARISON WITH LDA_DCT [29], DenseNet_DWT [30], AND CI_CIS [31] IN Holidays DATASET

Processing   Size        LDA_DCT  DenseNet_DWT  CI_CIS  Proposed
JPEG         𝑄(10)       93.0%    96.9%         100.0%  97.7%
             𝑄(50)       98.8%    98.8%         -       100.0%
             𝑄(90)       100.0%   100.0%        -       100.0%
Gauss-N      𝜎(0.001)    94.1%    96.1%         46.0%   99.2%
             𝜎(0.005)    91.4%    95.7%         -       96.1%
S&P-N        𝜎(0.001)    98.0%    99.2%         46.0%   99.6%
             𝜎(0.005)    92.2%    97.7%         -       98.8%
Speckle-N    𝜎(0.01)     93.0%    96.1%         46.0%   98.0%
             𝜎(0.05)     88.3%    91.0%         -       91.4%
Gauss-F      (3 × 3)     100.0%   100.0%        100.0%  100.0%
Mean-F       (3 × 3)     100.0%   100.0%        100.0%  100.0%
Median-F     (3 × 3)     100.0%   100.0%        100.0%  100.0%
Centered-C   20%         31.1%    19.5%         93.0%   98.0%
             50%         9.8%     5.9%          -       78.1%
Edge-C       10%         26.2%    46.1%         -       94.9%
             20%         18.0%    21.1%         100.0%  82.8%
Rotation     10°         39.8%    38.3%         92.0%   96.9%
             30°         4.7%     3.9%          -       68.0%
             50°         0.0%     0.0%          -       50.0%
Translation  (80, 50)    41.8%    37.5%         100.0%  99.2%
             (160, 100)  19.9%    10.9%         -       95.3%
             (320, 200)  4.7%     3.9%          -       92.2%
Scaling      0.5         100.0%   100.0%        -       100.0%
             0.75        100.0%   100.0%        -       100.0%
             1.5         100.0%   100.0%        -       100.0%
             3           100.0%   100.0%        100.0%  100.0%
C-H-E                    73.8%    85.9%         95.0%   91.0%
Gamma-C      0.8         94.1%    98.0%         100.0%  100.0%

TABLE V
ROBUSTNESS COMPARISON WITH LDA_DCT [29], DenseNet_DWT [30], AND CI_CIS [31] IN ImageNet* DATASET

Processing   Size        LDA_DCT  DenseNet_DWT  CI_CIS  Proposed
JPEG         𝑄(10)       91.4%    98.0%         100.0%  99.2%
             𝑄(50)       98.4%    98.0%         -       100.0%
             𝑄(90)       99.6%    100.0%        -       100.0%
Gauss-N      𝜎(0.001)    94.1%    98.8%         85.0%   100.0%
             𝜎(0.005)    94.1%    98.0%         -       98.4%
S&P-N        𝜎(0.001)    98.4%    100.0%        88.0%   100.0%
             𝜎(0.005)    94.1%    98.8%         -       100.0%
Speckle-N    𝜎(0.01)     99.6%    99.2%         85.0%   99.6%
             𝜎(0.05)     89.8%    91.0%         -       91.8%
Gauss-F      (3 × 3)     99.2%    100.0%        100.0%  100.0%
Mean-F       (3 × 3)     98.8%    100.0%        100.0%  100.0%
Median-F     (3 × 3)     94.5%    95.7%         100.0%  100.0%
Centered-C   20%         68.8%    20.3%         84.0%   98.4%
             50%         11.3%    3.5%          -       78.5%
Edge-C       10%         18.8%    55.5%         -       99.6%
             20%         6.3%     28.5%         100.0%  97.3%
Rotation     10°         66.4%    36.7%         100.0%  100.0%
             30°         8.2%     5.9%          -       93.8%
             50°         5.1%     2.3%          -       78.5%
Translation  (80, 50)    21.1%    36.7%         -       99.6%
             (160, 100)  8.6%     9.0%          -       99.2%
             (320, 200)  6.3%     5.9%          -       97.7%
Scaling      0.5         97.7%    92.2%         -       100.0%
             0.75        96.9%    94.9%         -       100.0%
             1.5         99.2%    98.8%         -       100.0%
             3           98.8%    100.0%        100.0%  100.0%
C-H-E                    73.4%    71.1%         100.0%  94.9%
Gamma-C      0.8         89.8%    94.5%         100.0%  100.0%

Fig. 4. The average robustness results under various image processing methods in Holidays, ImageNet*, and Caltech-256* datasets.
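The accuracy numbers in Tables IV–VI are SAR values per Eq. (18), and the cover-image counts in Table II follow Eq. (17). As a concrete reading of both, the following is a minimal sketch (our illustration, not the authors' code; function names are ours). The bit accuracy rate (BAR) used later in Section IV.C.3) is included for comparison.

```python
import math

def n_cover_images(p: int, L: int) -> int:
    """Eq. (17): cover images needed to hide p secret bits at capacity L."""
    return math.ceil(p / L)

def sar(extracted, original) -> float:
    """Eq. (18): percentage of L-bit secret messages recovered exactly."""
    hits = sum(1 for e, o in zip(extracted, original) if e == o)
    return 100.0 * hits / len(original)

def bar(extracted_bits: str, original_bits: str) -> float:
    """Bit accuracy rate: percentage of individual secret bits recovered."""
    hits = sum(e == o for e, o in zip(extracted_bits, original_bits))
    return 100.0 * hits / len(original_bits)

# Table II, "Our method": hiding 1 KB (8192 bits) at L = 16 needs 512 images.
assert n_cover_images(8192, 16) == 512
```

Note that SAR scores a message as wrong if any of its L bits differ, so for the same error pattern BAR is never lower than SAR; this is why Section IV.C.3) reports BAR as a complementary, finer-grained view.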

Fig. 5. The average Manhattan distance between the final clustering centroids in three datasets: (a) Holidays, (b) ImageNet*, (c) Caltech-256*.

TABLE VI
ROBUSTNESS COMPARISON WITH LDA_DCT [29], DenseNet_DWT [30], AND CI_CIS [31] IN Caltech-256* DATASET

Processing   Size        LDA_DCT  DenseNet_DWT  CI_CIS  Proposed
JPEG         𝑄(10)       91.0%    95.7%         93.0%   98.4%
             𝑄(50)       96.1%    98.0%         -       100.0%
             𝑄(90)       97.7%    98.8%         -       100.0%
Gauss-N      𝜎(0.001)    88.7%    89.1%         73.0%   100.0%
             𝜎(0.005)    84.4%    87.9%         -       98.0%
S&P-N        𝜎(0.001)    90.6%    87.9%         71.0%   100.0%
             𝜎(0.005)    86.7%    87.1%         -       99.6%
Speckle-N    𝜎(0.01)     90.6%    88.3%         73.0%   99.6%
             𝜎(0.05)     89.5%    89.8%         -       91.8%
Gauss-F      (3 × 3)     98.8%    100.0%        93.0%   100.0%
Mean-F       (3 × 3)     99.2%    100.0%        93.0%   98.4%
Median-F     (3 × 3)     80.9%    93.0%         100.0%  100.0%
Centered-C   20%         49.2%    6.3%          91.0%   100.0%
             50%         5.9%     3.5%          -       80.5%
Edge-C       10%         34.8%    42.2%         -       99.2%
             20%         21.5%    19.1%         94.0%   95.3%
Rotation     10°         10.9%    9.0%          94.0%   100.0%
             30°         0.8%     1.2%          -       87.1%
             50°         3.9%     2.3%          -       67.6%
Translation  (80, 50)    18.4%    19.9%         -       99.6%
             (160, 100)  4.3%     4.3%          -       98.4%
             (320, 200)  3.5%     2.0%          -       97.3%
Scaling      0.5         93.4%    99.2%         -       100.0%
             0.75        93.8%    96.1%         -       100.0%
             1.5         98.0%    98.8%         -       100.0%
             3           99.6%    100.0%        93.0%   100.0%
C-H-E                    60.5%    68.0%         94.0%   91.8%
Gamma-C      0.8         90.2%    92.6%         94.0%   100.0%

2) The Effect of 𝐶0 Selection on Robustness

In this paper, to construct the same CID independently at the sender and the receiver, the initial clustering centroids 𝐶0 are fixed in the way described in Section III.C. It is well known that different 𝐶0 lead to different final clustering centroids 𝐶0′; thus, the CID is also different, which may bring different robustness results. To measure the effect of 𝐶0 on robustness, we randomly select 10 additional sets of initial clustering centroids, denoted as {𝐶01, 𝐶02, …, 𝐶010}, and conduct K-Means based on {𝐶01, 𝐶02, …, 𝐶010} to construct different CIDs in the Holidays, ImageNet*, and Caltech-256* datasets. Then, the robustness results of these 10 groups of CIDs are calculated respectively. All the image processing operations in Table III are used in this section. We test the robustness under each image processing operation independently and then calculate the average robustness results. The average robustness results under various image processing methods in the three datasets are shown in Fig. 4. It can be seen that the influence of different initial clustering centroids on robustness is slight. It further indicates that the fixed initial clustering centroid 𝐶0 has no negative effect on the robustness of the proposed method.

To further analyze the internal reasons, we calculated the average Manhattan distance between the final clustering centroids {𝐶0′, 𝐶01′, 𝐶02′, …, 𝐶010′} obtained by K-Means from {𝐶0, 𝐶01, 𝐶02, …, 𝐶010}. The average distance results on the three datasets are shown in Fig. 5, from which it can be seen that the average distance basically follows a flat trend. That is, the effect of different initial clustering centroids on the distances between the final clustering centroids is very slight, which shows that the distances of the final clustering centroids are robust to the selection of the initial clustering centroids. Therefore, different groups of CIDs constructed based on different initial clustering centroids lead to similar robustness results. In other words, the CID constructed using the initial clustering centroids 𝐶0 designed in this paper has no negative impact on the decoding of the proposed CIS. To sum up, the proposed method of constructing the CID is robust.

3) The Robustness under Various Capacities

As described in Section IV.B, we can set the capacity to range from 1 to 16 based on the NUS-WIDE* dataset. To verify the effectiveness of our method, we calculated the average robustness results under all image processing methods for the NUS-WIDE* dataset with a capacity from 1 to 16.

𝑆𝐴𝑅 is a measurement at the level of secret messages, while the bit accuracy rate (𝐵𝐴𝑅) is one at the level of secret bits. Therefore, we adopt 𝐵𝐴𝑅 in this section to provide one more view to demonstrate the robustness of the proposed method. The experimental results are shown in Fig. 6. Overall, except for the capacity of 2, the robustness decreases as the capacity increases. When the capacity is 16, the minimum average accuracy of secret bit recovery is 98.1%. When the capacity is 2, there are only 4 images in the CID, and the total number of secret bits that can be represented is 8; therefore, when only 1 bit is wrong, the final average recovery accuracy is greatly affected. To sum up, the average robustness of our method is very high over the capacity range from 1 to 16, and all of the average recovery accuracies are greater than 98%.

D. Analysis of Coverless Image Dataset Construction

The construction of a CID is very important, as it has a critical impact on the capacity. Simultaneously, the construction of a CID is also very difficult according to previous CIS methods. When the capacity of CIS is 𝐿, the required number of images should be at least 2^𝐿. More importantly, it must be ensured that the constructed CID contains 2^𝐿 different features to map to all secret messages.

Fig. 6. The average robustness results under all image processing algorithms in NUS-WIDE* dataset.

Actually, the required number of images is generally much greater than 2^𝐿. It is hard to construct a CID with a large capacity according to the state-of-the-art methods.

1) CID Construction Ability Comparison

In this section, we analyze the ability of the proposed method to construct a CID and compare it with LDA_DCT, DenseNet_DWT, and CI_CIS. The comparison results are shown in Tables VII and VIII and Fig. 7. Table VII shows the minimum number of images required when the capacity is 𝐿, which is recorded as 𝐴𝐼. In the proposed method, only 2^𝐿 images are required to hide all the 𝐿-bit secret messages. For LDA_DCT, DenseNet_DWT, and CI_CIS, in contrast, the minimum number of images is 𝑛 × 2^𝐿. These methods first classify the original image dataset, and each category needs to contain images that can hide all 2^𝐿 secret messages, where 𝑛 represents the number of categories. Therefore, these methods need at least 𝑛 × 2^𝐿 images to hide all the 𝐿-bit secret messages. In LDA_DCT and DenseNet_DWT, the value of 𝑛 is not given; in the CI_CIS method, 𝑛 is set to 50. Compared with the above-mentioned methods, the proposed algorithm can greatly reduce the minimum number of images required under the same capacity.

TABLE VII
COMPARISON RESULTS ON MINIMUM NUMBER OF IMAGES (𝐴𝐼)

Method  LDA_DCT   DenseNet_DWT  CI_CIS    Proposed
𝐴𝐼      𝑛 × 2^𝐿   𝑛 × 2^𝐿       𝑛 × 2^𝐿   2^𝐿

TABLE VIII
THE MAXIMUM CAPACITY REPRESENTED BY THE CORRESPONDING CID CONSTRUCTED BASED ON FOUR DATASETS

              Holidays  ImageNet*  Caltech-256*  NUS-WIDE*
LDA_DCT       5         6          5             10
DenseNet_DWT  5         6          5             10
CI_CIS        5         6          5             10
Ours          10        11         9             16

To verify the effectiveness of the proposed method in constructing the CID, we test the maximum capacity represented by the corresponding CID constructed based on the Holidays, ImageNet*, Caltech-256*, and NUS-WIDE* datasets, which is shown in Table VIII. The data in Table VIII indicate the maximum capacity that can be represented under the corresponding dataset. It should be noted that, in order to simplify the experimental steps and fully verify the advantages of the proposed method, we assume in the following experiments that LDA_DCT, DenseNet_DWT, and CI_CIS do not classify the original image dataset, i.e., 𝑛 = 1. It can be seen from Table VII that this assumption greatly reduces the difficulty of these comparison methods in constructing the CID. From Table VIII, the maximum capacities of the three comparison methods under the Holidays dataset are 5, which means that LDA_DCT, DenseNet_DWT, and CI_CIS can construct a CID with a maximum capacity of 5 based on the Holidays dataset. Similarly, for the ImageNet*, Caltech-256*, and NUS-WIDE* datasets, these comparison methods can construct a CID with a maximum capacity of 6, 5, and 10, respectively. In contrast, for the Holidays, ImageNet*, Caltech-256*, and NUS-WIDE* datasets, our method can construct a CID with a maximum capacity of 10, 11, 9, and 16, respectively, which is much higher than the comparison methods. From Table VIII, the maximum capacities obtained by the comparison methods under the same dataset are the same. The results in Table VIII are related to the size of the given image dataset and the features used to map the secret messages. In these comparison methods, the lengths of these features are equal to that of the secret messages; thus, the numbers of types of secret messages represented by these features are the same. Therefore, the maximum capacities represented by these comparison methods under the same dataset are most likely the same. The similarities of these comparison methods in Fig. 7 can verify this viewpoint.

Fig. 7. The required numbers of images under the various capacities in NUS-WIDE* dataset.

To analyze how the number of required images changes as the capacity increases, we adopt the NUS-WIDE* dataset to explore the upward trend of the different methods, which is shown in Fig. 7. The horizontal axis of Fig. 7 represents the capacity, and the vertical axis represents the actual number of images required under the corresponding capacity. According to the feature extraction algorithms of the different CIS methods, we randomly select images one by one from the NUS-WIDE* dataset and record the secret message represented by each selected image. When the randomly selected images can hide all 2^𝐿 secret messages, the number of images is recorded. It should be noted that, to reduce the occasionality caused by a single random selection process, we repeat each process 10 times and take the average value as the final result. As shown in Fig. 7, our method needs the minimum number of images at any capacity. The number of images needed by LDA_DCT, DenseNet_DWT, and CI_CIS

increases sharply with the increase of capacity. In particular, when the capacity is 10, these comparison methods all need to index more than 90000 images to hide all 2^10 secret messages. In contrast, our method only needs 1024 images to hide all 2^10 secret messages. When the capacity is greater than 10, these comparison methods cannot index enough images to hide all secret messages, which is very challenging in practical applications. However, the number of images needed by our method increases gently as the capacity increases, which greatly reduces the difficulty of constructing the CID. Especially when the capacity is large, our method can easily construct a CID compared with the state-of-the-art methods.

2) Dataset Utilization Rate

It can be seen from Section IV.D.1) that, compared with the comparison methods, the proposed method can more easily construct a CID with a larger capacity under the same given image dataset. For a given image dataset, a CID with a larger capacity means that more images will be indexed from the given image dataset. Thus, the utilization of the given image dataset is measured by the dataset utilization rate, which is defined as

𝛾 = (𝐴𝐼 / 𝐴𝑋) × 100%. (19)

From Section IV.D.1), 𝐴𝐼 expresses the minimum number of images required to construct the CID with the corresponding capacity. Here, 𝐴𝐼 is taken only as the minimum number of images required to construct the CID with the actual maximum capacity under a given image dataset. For example, if a CID with an actual maximum capacity of 8 can be constructed under a given image dataset, then 𝐴𝐼 should be 256. 𝐴𝑋 denotes the total number of images in the given dataset 𝑋.

TABLE IX
DATASET UTILIZATION RATE

                     Holidays  ImageNet*  Caltech-256*  NUS-WIDE*
LDA_DCT              2.15%     2.47%      3.54%         0.85%
DenseNet_DWT         2.15%     2.47%      3.54%         0.85%
CI_CIS               2.15%     2.47%      3.54%         0.85%
Ours                 68.68%    79.13%     56.70%        54.61%
Improvement (times)  32        32         16            64

The Holidays, ImageNet*, Caltech-256*, and NUS-WIDE* datasets are adopted to analyze the utilization rate. According to Section IV.A, 𝐴𝑋 of the Holidays, ImageNet*, Caltech-256*, and NUS-WIDE* datasets is 1491, 2588, 903, and 120000, respectively. According to Section IV.D.1), under the same dataset, the actual maximum capacities obtained by LDA_DCT, DenseNet_DWT, and CI_CIS are consistent. Therefore, 𝐴𝐼 generated by the three methods under the Holidays, ImageNet*, Caltech-256*, and NUS-WIDE* datasets is 32, 64, 32, and 1024, respectively, while 𝐴𝐼 generated by the proposed method under these datasets is 1024, 2048, 512, and 65536, respectively. Thus, the dataset utilization rate can be calculated, as shown in Table IX. From Table IX, the proposed method can greatly improve the dataset utilization rate compared with the state-of-the-art methods. Specifically, compared with the comparison methods, our method can improve the utilization rate of the Holidays, ImageNet*, Caltech-256*, and NUS-WIDE* datasets by 32, 32, 16, and 64 times, respectively.

The dataset utilization rate is determined by the number of images in the given image dataset and the number of images in the maximum CID constructed based on the given image dataset. That is to say, the dataset corresponding to the optimal result can differ according to these factors. As shown in Fig. 7, the comparison methods have a similar relationship between the number of images in the datasets and the capacity of the steganography method. However, the proposed method is quite different from these three methods in this relationship. It can be seen from Fig. 7 that the proposed method can achieve a much higher capacity than the comparison methods with the same number of images. Therefore, the proposed method reaches its optimal dataset utilization rate under the ImageNet* dataset, while the comparison methods reach their optimal dataset utilization rates under the Caltech-256* dataset.

The maximum dataset utilization rate varies with the given image datasets. For the Holidays dataset, a corresponding maximum CID with 1024 images can theoretically be constructed; the maximum dataset utilization rate of the Holidays dataset can then be calculated as 68.68%. Similarly, for the ImageNet*, Caltech-256*, and NUS-WIDE* datasets, the maximum dataset utilization rates are 79.13%, 56.70%, and 54.61%, respectively. Our method has reached the maximum dataset utilization rate on these datasets.

E. Anti-Steganalysis Analysis

There are many existing steganalysis algorithms, such as Ensemble Classifier [47], SRM [48], XuNet [49], and YeNet [50], that judge whether there is a secret message in a cover image by its modification traces. Generally, the more secret bits embedded in the image, the more obvious the image modification traces, and the more easily the existence of the secret message is detected by steganalysis algorithms. However, in the proposed method, the cover image is just indexed by the mapping rule based on the secret message. In such a mapping process, the cover image is not modified, and no modification trace is left in the cover image. Therefore, the cover image without any modification can fundamentally resist steganalysis detection.

V. CONCLUSION

In this paper, a coverless image steganography method based on efficient coverless image dataset construction is proposed. A CID construction method based on unsupervised clustering is designed, which makes it easier to construct the CID under different capacities. The same CID can be constructed independently at both the sender and the receiver, which avoids the communication pressure of transmitting the large CID and the potential risk of information leakage. The high-dimensional deep hash of the image is related to the low-dimensional secret message via a designed mapping rule, which lays a foundation for the successful construction of a large CID that can represent a high capacity. The images indexed by the unsupervised clustering algorithm usually have low similarity in the feature domain, which is conducive to high robustness. Experimental results demonstrate that the proposed method has better robustness in
resisting most image processing attacks and higher efficiency in constructing the CID compared with the state-of-the-art methods.

ACKNOWLEDGMENT

Thanks to Dr. Farhad Pourpanah, a postdoctoral fellow at the University of Windsor, for helping us improve the English writing.

REFERENCES
[1] X. Zhang, "Reversible data hiding with optimal value transfer," IEEE Trans. Multimedia, vol. 15, no. 2, pp. 316–325, Feb. 2013.
[2] P. V. K. Borges, J. Mayer and E. Izquierdo, "Robust and Transparent Color Modulation for Text Data Hiding," IEEE Trans. Multimedia, vol. 10, no. 8, pp. 1479-1489, Dec. 2008.
[3] W. Zhang, H. Wang, D. Hou and N. Yu, "Reversible Data Hiding in Encrypted Images by Reversible Image Transformation," IEEE Trans. Multimedia, vol. 18, no. 8, pp. 1469-1479, Aug. 2016.
[4] Y. Erfani, R. Pichevar and J. Rouat, "Audio Watermarking Using Spikegram and a Two-Dictionary Approach," IEEE Trans. Inf. Forensics Secur., vol. 12, no. 4, pp. 840-852, Apr. 2017.
[5] T. Stutz, F. Autrusseau and A. Uhl, "Non-blind structure-preserving substitution watermarking of H.264/CAVLC inter-frames," IEEE Trans. Multimedia, vol. 16, no. 5, pp. 1337–1349, May 2014.
[6] S. C. Liu and W. H. Tsai, "Line-based cubism-like image—A new type of art image and its application to lossless data hiding," IEEE Trans. Inf. Forensics Secur., vol. 7, no. 5, pp. 1448–1458, May 2012.
[7] W. Luo, F. Huang and J. Huang, "Edge Adaptive Image Steganography Based on LSB Matching Revisited," IEEE Trans. Inf. Forensics Secur., vol. 5, no. 2, pp. 201-214, June 2010.
[8] D. C. Wu and W. H. Tsai, "A steganographic method for images by pixel value differencing," Pattern Recognit. Lett., vol. 24, no. 9, pp. 1613–1626, Sept. 2003.
[9] Z. Ni, Y. Shi, N. Ansari and W. Su, "Reversible data hiding," IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 3, pp. 354-362, Mar. 2006.
[10] K. Joshi and R. Yadav, "A new LSB-S image steganography method blend with Cryptography for secret communication," in Proc. Int. Conf. Image Inf. Process., Solan, India, 2015, pp. 86-90.
[11] V. Sharma and R. Bhardwaj, "A lossless Data Hiding method based on inverted LSB technique," in Proc. Int. Conf. Image Inf. Process., Solan, India, 2015, pp. 486-490.
[12] J. Mielikainen, "LSB matching revisited," IEEE Signal Process. Lett., vol. 13, no. 5, pp. 285-287, May 2006.
[13] V. Sedighi, R. Cogranne and J. Fridrich, "Content-Adaptive Steganography by Minimizing Statistical Detectability," IEEE Trans. Inf. Forensics Secur., vol. 11, no. 2, pp. 221-234, Feb. 2016.
[14] W. Tang, B. Li, S. Tan, M. Barni and J. Huang, "CNN-Based Adversarial Embedding for Image Steganography," IEEE Trans. Inf. Forensics Secur., vol. 14, no. 8, pp. 2074-2087, Aug. 2019.
[15] I. J. Cox, J. Kilian, F. T. Leighton and T. Shamoon, "Secure spread spectrum watermarking for multimedia," IEEE Trans. Image Process., vol. 6, no. 12, pp. 1673-1687, Dec. 1997.
[16] W. Lin, S. Horng, T. Kao, P. Fan, C. Lee and Y. Pan, "An Efficient Watermarking Method Based on Significant Difference of Wavelet Coefficient Quantization," IEEE Trans. Multimedia, vol. 10, no. 5, pp. 746-757, Aug. 2008.
[17] M. Barni, F. Bartolini, A. De Rosa and A. Piva, "Capacity of full frame DCT image watermarks," IEEE Trans. Image Process., vol. 9, no. 8, pp. 1450-1455, Aug. 2000.
[18] M. Ramkumar and A. N. Akansu, "Capacity estimates for data hiding in compressed images," IEEE Trans. Image Process., vol. 10, no. 8, pp. 1252-1263, Aug. 2001.
[19] F. Huang, X. Qu, H. J. Kim and J. Huang, "Reversible Data Hiding in JPEG Images," IEEE Trans. Circuits Syst. Video Technol., vol. 26, no. 9, pp. 1610-1621, Sept. 2016.
[20] L. Guo, J. Ni and Y. Q. Shi, "Uniform Embedding for Efficient JPEG Steganography," IEEE Trans. Inf. Forensics Secur., vol. 9, no. 5, pp. 814-825, May 2014.
[21] Z. Zhou, H. Sun, R. Harit and X. Sun, "Coverless image steganography without embedding," in Proc. Int. Conf. Cloud Comput. Secur., Nanjing, China, 2015, pp. 123–132.
[22] S. Zheng, L. Wang, B. Ling and D. Hu, "Coverless information hiding based on robust image hashing," in Proc. Int. Conf. Intell. Comput., Liverpool, U.K., 2017, pp. 536–547.
[23] C. Yuan, Z. Xia and X. Sun, "Coverless image steganography based on SIFT and BOF," J. Int. Technol., vol. 18, no. 2, pp. 435–442, Feb. 2017.
[24] L. Zou, J. Sun, M. Gao, W. Wan and B. B. Gupta, "A novel coverless information hiding method based on the average pixel value of the sub-images," Multimedia Tools Appl., vol. 78, no. 7, pp. 7965-7980, Apr. 2019.
[25] Z. Zhou, Y. Mu and Q. M. J. Wu, "Coverless image steganography using partial-duplicate image retrieval," Soft Comput., vol. 23, no. 13, pp. 4927–4938, July 2019.
[26] X. Chen, Y. Qiu, A. Q, X. Sun, S. Wang and G. Wei, "A high-capacity coverless image steganography method based on double-level index and block matching," Math. Biosci. Eng., vol. 16, no. 5, pp. 4708-4722, May 2019.
[27] Y. Luo, J. Qin, X. Xiang and Y. Tan, "Coverless Image Steganography Based on Multi-Object Recognition," IEEE Trans. Circuits Syst. Video Technol., vol. 31, no. 7, pp. 2779-2791, July 2021.
[28] Z. Zhou, Y. Cao, M. Wang, E. Fan and Q. M. J. Wu, "Faster-RCNN Based Robust Coverless Information Hiding System in Cloud Environment," IEEE Access, vol. 7, pp. 179891-179897, Nov. 2019.
[29] X. Zhang, F. Peng and M. Long, "Robust Coverless Image Steganography Based on DCT and LDA Topic Classification," IEEE Trans. Multimedia, vol. 20, no. 12, pp. 3223-3238, Dec. 2018.
[30] Q. Liu, X. Xiang, J. Qin, Y. Tan, J. Tan and Y. Luo, "Coverless steganography based on image retrieval of DenseNet features and DWT sequence mapping," Knowledge-Based Syst., vol. 192, pp. 105375, Mar. 2020.
[31] Q. Liu, X. Xiang, J. Qin, Y. Tan and Q. Zhang, "A Robust Coverless Steganography Scheme Using Camouflage Image," IEEE Trans. Circuits Syst. Video Technol., doi: 10.1109/TCSVT.2021.3108772.
[32] Q. Jiang and W. Li, "Deep Cross-Modal Hashing," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Hawaii, USA, 2017, pp. 3270-3278.
[33] M. Liu, M. Zhang, J. Liu, Y. Zhang and Y. Ke, "Coverless information hiding based on generative adversarial networks," arXiv preprint arXiv:1712.06951, 2017.
[34] X. Duan and H. Song, "Coverless information hiding based on generative model," arXiv preprint arXiv:1802.03528, 2018.
[35] K. Yang, K. Chen, W. Zhang and N. Yu, "Provably secure generative steganography based on autoregressive model," in Proc. Int. Workshop Digit. Watermarking, Springer, Cham, 2018, pp. 55-68.
[36] X. Duan, B. Li, D. Guo, Z. Zheng and Y. Ma, "A coverless steganography method based on generative adversarial network," EURASIP J. Image Video Proc., vol. 2020, no. 1, pp. 1-10, 2020.
[37] Y. Cao, Z. Zhou, Q. M. J. Wu, C. Yuan and X. Sun, "Coverless information hiding based on the generation of anime characters," EURASIP J. Image Video Proc., vol. 36, no. 1, pp. 1-15, Sept. 2020.
[38] Q. Li, X. Wang, X. Wang, B. Ma and Y. Shi, "An encrypted coverless information hiding method based on generative models," Inf. Sciences, vol. 553, no. 3, pp. 19-30, 2021.
[39] X. Chen, Z. Zhang, A. Qiu, Z. Xia and N. Xiong, "A novel coverless steganography method based on image selection and StarGAN," IEEE Trans. Netw. Sci. Eng., vol. 9, no. 1, pp. 219-230, Jan.-Feb. 2022.
[40] F. Peng, G. Chen and M. Long, "A Robust Coverless Steganography Based on Generative Adversarial Networks and Gradient Descent Approximation," IEEE Trans. Circuits Syst. Video Technol., doi: 10.1109/TCSVT.2022.3161419.
[41] Z. Zhou, Y. Su, Q. M. J. Wu, Z. Fu and Y. Shi, "Secret-to-Image Reversible Transformation for Generative Steganography," arXiv preprint arXiv:2203.06598, 2022.
[42] K. Chatfield, K. Simonyan, A. Vedaldi and A. Zisserman, "Return of the devil in the details: Delving deep into convolutional nets," in Proc. British Machine Vis. Conf., Nottingham, U.K., 2014.
[43] H. Jegou, M. Douze and C. Schmid, "Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search," in Proc. Eur. Conf. Comput. Vis., Marseille, France, 2008, pp. 304-317.
[44] J. Deng, W. Dong, R. Socher, L. Li, K. Li and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Miami, USA, 2009, pp. 248-255.
[45] G. Griffin, A. Holub and P. Perona, "The Caltech 256," Caltech Technical Report, 2006. Available: http://www.vision.caltech.edu/Image_Datasets/Caltech256/
[46] T.-S. Chua, J. Tang, R. Hong, H. Li, Z. Luo and Y. Zheng, "NUS-WIDE: A real-world web image database from National University of Singapore," in Proc. ACM Int. Conf. Image Video Retrieval, Santorini, Greece, 2009, pp. 1-9.

[47] J. Kodovsky, J. Fridrich and V. Holub, "Ensemble Classifiers for Steganalysis of Digital Media," IEEE Trans. Inf. Forensics Secur., vol. 7, no. 2, pp. 432-444, Apr. 2012.
[48] J. Fridrich and J. Kodovsky, "Rich Models for Steganalysis of Digital Images," IEEE Trans. Inf. Forensics Secur., vol. 7, no. 3, pp. 868-882, June 2012.
[49] G. Xu, H. Wu and Y. Shi, "Structural Design of Convolutional Neural Networks for Steganalysis," IEEE Signal Process. Lett., vol. 23, no. 5, pp. 708-712, May 2016.
[50] J. Ye, J. Ni and Y. Yi, "Deep Learning Hierarchical Representations for Image Steganalysis," IEEE Trans. Inf. Forensics Secur., vol. 12, no. 11, pp. 2545-2557, Nov. 2017.

Liming Zou received the B.S. degree in Communication Engineering from Shandong Normal University, Jinan, China, in 2017. He is currently working toward the Ph.D. degree in Computer Science and Technology at the School of Information Science and Engineering, Shandong Normal University, Jinan, China, and is currently a visiting scholar with the Department of Electrical and Computer Engineering, University of Windsor, Windsor, ON, Canada. His research interests include multimedia security and coverless information hiding. He is a student member of the CCF.

Jing Li received the Ph.D. degree from Shandong Normal University, Jinan, China, in 2020. She currently works with the School of Journalism and Communications, Shandong Normal University. Her research interests include media processing, security, and retrieval.

Wenbo Wan received the Ph.D. degree in information and communication engineering from Shandong University, Jinan, China, in 2015, supervised by Prof. J. Liu. From June 2019 to October 2019, he was a Visiting Researcher with the Department of Computer Science, City University of Hong Kong, Hong Kong. He is currently an Associate Professor with the School of Information Science and Engineering, Shandong Normal University, Jinan, China. His research interests include multimedia security, multimedia quality assessment, and image/video watermarking.

Q. M. Jonathan Wu (Senior Member, IEEE) received the Ph.D. degree in electrical engineering from the University of Wales, Swansea, U.K., in 1990. In 1995, he joined the National Research Council of Canada, Vancouver, BC, Canada, where he became a Senior Research Officer and a Group Leader. He is a Professor with the Department of Electrical and Computer Engineering, University of Windsor, Windsor, ON, Canada, and holds the Tier 1 Canada Research Chair in automotive sensors and information systems. He has authored or coauthored more than 500 peer-reviewed articles in computer vision, image processing, intelligent systems, robotics, and integrated microsystems. His research interests include 3-D computer vision, active video object tracking and extraction, interactive multimedia, sensor analysis and fusion, and visual sensor networks. Dr. Wu is a fellow of the Canadian Academy of Engineering. He is an Associate Editor of the IEEE TRANSACTIONS ON CYBERNETICS, the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, Cognitive Computation, and Neurocomputing, and has served on technical program committees and international advisory committees for many prestigious conferences.

Jiande Sun (Member, IEEE) received the Ph.D. degree in communication and information systems from Shandong University, Jinan, China, in 2005. He was a visiting researcher at the Technical University of Berlin, the University of Konstanz, Carnegie Mellon University, and Yamaguchi University, and a Postdoctoral Researcher at Peking University. He is currently a full professor with the School of Information Science and Engineering, Shandong Normal University, Jinan. He has authored or coauthored more than 200 journal and conference papers. His current research interests include multimedia processing, analysis, and understanding, and their applications in security, retrieval, communication, HCI, and so on.