

Autoencoder-convolutional neural network-based
embedding and extraction model for
image watermarking

Debolina Mahapatra,a Preetam Amrit,b Om Prakash Singh,a Amit Kumar Singh,a,* and Amrit Kumar Agrawalc

aNational Institute of Technology Patna, Department of Computer Science and Engineering, Patna, Bihar, India
bLoknayak Jai Prakash Institute of Technology, Department of Computer Science and Engineering, Chapra, Bihar, India
cGalgotias College of Engineering & Technology, Department of Computer Science and Engineering, Greater Noida, Uttar Pradesh, India

Abstract. Watermarking embeds a design called a watermark into a digital cover and later extracts it to prove the image's copyright and ownership. In watermarking, the use of deep-learning approaches is highly beneficial due to their strong learning ability and accurate, superior results. By taking advantage of deep-learning, we designed an autoencoder
convolutional neural network (CNN)-based watermarking algorithm to maximize the robustness
while ensuring the invisibility of the watermark. A two-network model, comprising embedding and extraction networks, is introduced to comprehensively analyze the performance of the algorithm. The
embedding network architecture is composed of convolutional autoencoders. Initially, CNN is
considered to obtain the feature maps from the cover and mark images. Subsequently, the feature
maps of the mark and cover are concatenated with the help of the concatenation principle. In the
extraction model, block-level transposed convolutions and the rectified linear unit (ReLU) activation are applied to the extracted features of the watermarked and cover images to obtain the hidden mark.
Extensive experiments demonstrate that the proposed algorithm has high invisibility and good
robustness against several attacks at a low cost. Further, our proposed scheme outperforms other
state-of-the-art schemes in terms of robustness with good invisibility. © 2022 SPIE and IS&T
[DOI: 10.1117/1.JEI.32.2.021604]
Keywords: digital image watermarking; deep-learning; convolutional neural network; autoencoders; deep neural networks.
Paper 220607SS received Jun. 16, 2022; accepted for publication Aug. 30, 2022; published
online Sep. 19, 2022.

1 Introduction
Deep-learning technology is constantly evolving and becoming more popular among intelligent
multimedia data processing services. However, the alteration and unauthorized distribution of
multimedia content has become easier, especially regarding images, and the major issue of
copyright violation and ownership conflicts has attracted the attention of numerous research
practitioners.1 Therefore, the protection of the intellectual property rights of these images is
crucial. There are many existing studies on image protection. Many researchers have developed
an efficient watermarking scheme used in versatile applications to resolve the security and
ownership conflict issues of media data.2 The applications include hospitals, surveillance, government, e-commerce, academics, crime prevention, etc. Such a scheme embeds secret information
called a watermark that is visually invisible in a media object and identifies its ownership by
extracting the mark.1,2 The primary aim of a watermarking scheme is to enhance the three features of invisibility, capacity, and robustness while maintaining a good trade-off among them.1
Generally, watermarking methods are applied to alter pixel values of an image in the spatial

*Address all correspondence to Amit Kumar Singh, amit.singh@nitp.ac.in

1017-9909/2022/$28.00 © 2022 SPIE and IS&T


Fig. 1 A traditional image watermarking model; I_C: cover image, I_w: watermark image, M: marked image, M′: received marked image after transmission through a channel, and I_w′: extracted watermark image.

domain or alter the transform coefficients in the transformed domain.3 Compared with transformed-domain schemes, spatial-domain schemes lack robustness and flexibility. Therefore, many
researchers have proposed digital watermarks in the transformed domain to protect the images
for various potential applications. A schematic diagram of the traditional image watermarking
system in the transformed domain is shown in Fig. 1.
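As a concrete illustration of spatial-domain embedding, the following minimal Python sketch (illustrative only and not the scheme proposed here; the array sizes and names are our own) hides watermark bits in the least-significant bits of a grayscale cover image:

import numpy as np

def embed_lsb(cover, mark_bits):
    # Embed a flat bit array into the least-significant bits of a
    # uint8 grayscale cover; such pixel-level changes are invisible
    # but fragile, which is why transform-domain schemes are preferred.
    flat = cover.flatten()  # flatten() returns a copy, cover is untouched
    flat[:mark_bits.size] = (flat[:mark_bits.size] & 0xFE) | mark_bits
    return flat.reshape(cover.shape)

def extract_lsb(marked, n_bits):
    # Recover the embedded bits from the least-significant bits.
    return marked.flatten()[:n_bits] & 1

cover = np.random.randint(0, 256, (128, 128), dtype=np.uint8)
bits = np.random.randint(0, 2, 64 * 64, dtype=np.uint8)
marked = embed_lsb(cover, bits)
assert np.array_equal(extract_lsb(marked, bits.size), bits)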
In watermarking, the use of convolutional neural network (CNN)-based approaches instead
of transformed-based approaches is extremely beneficial due to their strong learning ability and
more accurate and superior results.4–9 Figure 2 shows a block diagram for our proposed deep-
learning-based watermarking scheme.
In most recent works,6,10 models inspired by complex deep neural network (DNN) architec-
tures have been employed in image watermarking. Although these methods may be able to
provide better visual quality of watermarked images, they are computationally expensive and
require very long training times.
In this paper, we present an autoencoder CNN-based watermarking algorithm to maximize
the robustness while ensuring the invisibility of the watermark. A two-network model, comprising embedding and extraction networks, is introduced to comprehensively analyze the performance of the
algorithm. Extensive experiments demonstrate that the proposed algorithm has high invisibility

Fig. 2 Deep-learning-based image watermarking scheme using the autoencoder functionality.


and good robustness against several attacks. Further, our proposed scheme outperforms other
state-of-the-art schemes in terms of robustness with good invisibility.
The remainder of this paper is organized as follows. The related literature is described in
Sec. 2. The proposed image watermarking scheme is presented in Sec. 3. Results and analysis
are given in Sec. 4. Finally, in Sec. 5, we provide some concluding remarks.

2 Literature Survey
Deep-learning models are capable of finding hidden patterns within the data, making them very
successful in image-based applications. Recent years have witnessed the application of DNNs
in image watermarking, which are capable of obtaining good visual quality as well as providing
robustness against commonly known attacks. In this section, we provide a summary of some
recent works in this area. Initial works focused more on enhancing the quality of the water-
marked image with the features extracted from deep-learning models. Kandi et al.11 proposed
a secure non-blind image watermarking mechanism based on CNNs that outperformed existing
state-of-the-art transform-domain techniques. This scheme offers a good level of invisibility together with security, and its robustness was proved against certain basic image attacks. Vukotić
et al.4 proved the utility of feature extraction to yield good imperceptibility to watermarked
images in zero-bit watermarking (or a watermark detection strategy) using the hyper-cone detec-
tor and the concept of adversarial images.
Fierro-Radilla et al.5 used CNNs to extract features from the cover image and combined
these features with the watermark data using XOR operations that make their zero-watermarking
scheme faster. Han et al.6 designed another zero-watermarking algorithm based on the pre-
trained network Visual Geometry Group (VGG)-19. This scheme addressed the issues related
to medical image security and is robust to geometric attacks. Latent feature maps from the medi-
cal images were extracted using VGG-19, and two-dimensional discrete Fourier transform was
used for watermark generation. In addition, hashing enhanced the robustness of the system, and the Hermite chaotic neural network provided additional security. The watermarking scheme in Ref. 7 is a simple, lightweight CNN-based method that uses an iterative learning framework for image watermarking. The training process is composed of a loop with three stages, viz., embedding the watermark, followed by attack simulation, and finally the weight
update. Attack simulation allows the network to adaptively capture invariant features for various
attacks. The weights of the network were updated repeatedly to extract the watermark from the
given image. Chen et al.8 in their work on JSNet used a simulation network to enhance the
robustness against JPEG lossy compression attacks. This attack simulation was implemented
in a robust end-to-end CNN-based watermarking scheme, which achieved a considerable
performance improvement. HiDDeN12 is one of the remarkable works in this domain. It is a
framework that can be trained in an end-to-end manner and used for both data hiding and water-
marking. Based on CNNs, the cover image and the message to be encoded are first passed
through an encoder network and processed at the decoder network to extract the message.
The adversary network is used to compute the loss and to confirm whether a given image con-
tains any hidden data. Another work inspired by this framework is ROMark9 in which adversarial
learning and min-max optimizations were applied to provide robustness to a CNN-based
watermarking model. ReDMark13 is an end-to-end framework that simulates a discrete cosine
transform in their deep network architecture using convolutional autoencoders. It exhibits an
enhanced robustness against JPEG attacks and outperforms the HiDDeN framework. Zhong
et al.14 used an architecture based on CNNs to adaptively learn the rules of embedding a water-
mark in an image. In this model, the encoder and decoder networks perform the stages of encoding, embedding, and decoding the watermark, and an extractor retrieves the original watermark image by reversing these processes. Additionally, to provide robustness to this method,
an invariance layer composed of a DNN is applied to handle the distortions in the watermarked
image. A practical application of this scheme was demonstrated in Ref. 15 that enables secure
and authorized internet of things device onboarding by embedding user credentials into images,
such as using printed QR codes on these devices. The marked images obtained using the scheme
can tolerate distortions of up to 85% to 90%, thereby proving its robustness.


A common issue in most of the above-discussed schemes is maintaining robustness, owing to the fragile nature of deep-learning architectures. When any change is made to the marked image, it becomes difficult for the extractor network to recover the watermark correctly. For this reason, several of the works involve attack simulation by training the networks iteratively. However, this only increases the complexity of the scheme and its training time. We address this issue in our work in the subsequent sections.

3 Proposed Method
A two-network model, comprising embedding and extraction networks, is introduced to comprehensively analyze the performance of the algorithm. The entire proposed mechanism consists of (a) an autoencoder-based CNN embedding network and (b) an extraction network model. The simplified embedding and extraction network models are shown in Figs. 3 and 4, respectively, and the detailed steps of both network models are presented in Algorithms 1 and 2. The embedder network configuration details are given in Table 1. Hyperparameters used
in the embedder and extraction networks are given in Table 2. Further, extraction network

Fig. 3 Embedder network architecture.

Fig. 4 Extractor network architecture.


Algorithm 1 Watermark embedding.

Input: Training samples consisting of cover images I_c and watermark images I_w

Output: Watermarked images У

1. Initialize:
   S ← 32, ɳ ← 0.001, e ← 50
   Cover_SIZE ← (128 × 128 × 1)
   Watermark_SIZE ← (64 × 64 × 1)
2. Read data:
   Load Dataset_Ic
   Load Dataset_Iw
3. Pre-process image data:
   Δ(Grayscale(Dataset_Ic), Cover_SIZE)
   Δ(Grayscale(Dataset_Iw), Watermark_SIZE)
4. Make feature maps:
   Select the number of kernels and the kernel size for each layer, α and β
5. Build the encoder for the cover image and extract features:
   fc ← Ψ(Dataset_Ic, α, β)
6. Build the encoder for the watermark image and extract features:
   fw ← Ψ(Dataset_Iw, α, β)
7. Concatenate features:
   fcw ← Concat(fc, fw)
8. Build the model decoder on the concatenated features:
   ω(fcw, α, β)
9. Compile the model:
   Load optimizer → Adam(ɳ)
   Ɱ ← compile(Ψ, ω, A, M)
10. Train the model:
    for 0 to e do
      for 0 to S do
        Step 1: Input images into the model: Ω ← Ɱ(Dataset_Ic, Dataset_Iw)
        Step 2: Calculate the loss: l ← M(Dataset_Ic, Ω)
        Step 3: Apply the Adam optimizer and calculate gradients: Ȣ ← A(l, α, β)
        Step 4: Apply gradients to the model (update weights): α, β ← A(Ȣ, α, β)
      end
    end
11. Test the model:
    У ← Ɱ(Dataset_Test_Ic, Dataset_Test_Iw)
12. Evaluate:
    PSNR(Dataset_Test_Ic, У)
    SSIM(Dataset_Test_Ic, У)
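For concreteness, the training loop of step 10 can be written in TensorFlow as the minimal sketch below; embedder stands for the compiled model Ɱ, the dataset iteration is only indicated, and the function names are our own:

import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)  # A with ɳ = 0.001
mse = tf.keras.losses.MeanSquaredError()                   # loss function M

def train_step(embedder, cover_batch, mark_batch):
    with tf.GradientTape() as tape:
        marked = embedder([cover_batch, mark_batch], training=True)  # Ω
        loss = mse(cover_batch, marked)                              # l
    grads = tape.gradient(loss, embedder.trainable_variables)        # Ȣ
    # Apply the gradients to update the kernel weights α, β
    optimizer.apply_gradients(zip(grads, embedder.trainable_variables))
    return loss

# for epoch in range(50):                       # e epochs
#     for cover_batch, mark_batch in batches:   # S batches per epoch
#         train_step(embedder, cover_batch, mark_batch)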

Algorithm 2 Watermark extraction.

Input: Watermarked images У, cover images I_c, and watermark images I_w

Output: Extracted watermarks Ew

1. Read data:
   Load Dataset_У
   Load Dataset_Ic
2. Pre-process image data:
   Δ(Grayscale(Dataset_У), Cover_SIZE)
   Δ(Grayscale(Dataset_Ic), Cover_SIZE)
3. Build the encoder for the watermarked image and extract features:
   fy ← Ψ(Dataset_У, α, β)
4. Build the encoder for the cover image and extract features:
   fc ← Ψ(Dataset_Ic, α, β)
5. Subtract the cover image features from the watermarked image features:
   fw′ ← fy − fc
6. Pass the difference features through four DNN blocks:
   features ← empty_list()
   for j from 1 to 4 do
     y_j ← φ(fw′)
     y_j ← batchnorm(y_j)
     y_j ← relu(y_j)
     Y ← features.append(y_j)
   end
   Y ← reshape(Y) to 16 × 16 × 16
7. Apply the decoder: σ(Y, α, β)
8. Compile the model: Ɛ ← compile(φ, σ, A, M)
9. Train the model:
   for 0 to e do
     for 0 to S do
       Step 1: Input images into the model: Ω ← Ɛ(Dataset_Ic, Dataset_У)
       Step 2: Calculate the loss: l ← M(Dataset_Iw, Ω)
       Step 3: Apply the Adam optimizer and calculate gradients: Ȣ ← A(l, α, β)
       Step 4: Apply gradients to the model (update weights): α, β ← A(Ȣ, α, β)
     end
   end
10. Test the model:
    Ew ← Ɛ(Dataset_Ic, Dataset_У)

Table 1 The embedder network configuration.

Layer / Configuration / Output shape:
Input 1 / 128 × 128 × 1 grayscale images / (128, 128, 1)
Conv 1-1 / 64 (3 × 3) convolutions, stride = 2 / (64, 64, 64)
Conv 1-2 / 32 (3 × 3) convolutions, stride = 2 / (32, 32, 32)
Input 2 / 64 × 64 × 1 grayscale images / (64, 64, 1)
Conv 2-1 / 32 (3 × 3) convolutions, stride = 2 / (32, 32, 32)
Concatenate / feature-map concatenation / (32, 32, 64)
Conv transpose / 64 (3 × 3) convolutions, stride = 2 / (64, 64, 64)
Output / 1 (3 × 3) convolution, stride = 2 / (128, 128, 1)

Table 2 Hyperparameters used in the embedder and extractor networks.

Hyperparameters Embedder Extractor

Optimizer Adam Adam

Learning rate 0.001 0.0001

Beta 1 0.9 0.9

Beta 2 0.999 0.999

Loss Mean squared error Mean squared error

Epochs 50 300


Table 3 The extractor network configuration.

Layer / Configuration / Output shape:
Input 1 / 128 × 128 × 1 grayscale images / (128, 128, 1)
Input 2 / 128 × 128 × 1 grayscale images / (128, 128, 1)
Subtract / subtract the input images / (128, 128, 1)
Flatten / flatten the features / (16384,)
Dense block B1 / 1024 nodes / (1024,)
Dense block B2 / 1024 nodes / (1024,)
Dense block B3 / 1024 nodes / (1024,)
Dense block B4 / 1024 nodes / (1024,)
Concatenate / 4096 nodes / (4096,)
Dense block D1 / 4096 nodes / (4096,)
Dense block D2 / 4096 nodes / (4096,)
Reshape / reshape to (16, 16, 16) / (16, 16, 16)
Conv transpose / 64 (3 × 3) convolutions, stride = 1 / (32, 32, 64)
Conv transpose / 128 (3 × 3) convolutions, stride = 1 / (32, 32, 128)
Conv transpose / 128 (3 × 3) convolutions, stride = 2 / (64, 64, 128)
Output / 1 (3 × 3) convolution, stride = 1 / (64, 64, 1)

Table 4 Notation summary.

Dataset_Ic: [c1, c2, …, cn], the set of cover images for training the model
Dataset_Iw: [w1, w2, …, wn], the set of watermark images for training the model
S: number of batches
ɳ: learning rate
e: number of epochs
Δ: resize function
α: number of kernels
β: kernel size
Ψ: encoder layers
Concat: function to concatenate feature maps
ω: decoder layers
Ɱ: embedder model
Ω: model outputs during the training process
l: calculated loss
Ȣ: calculated gradients
У: output watermarked images obtained from the model
Dataset_У: set of marked images obtained from the embedder
φ: DNN blocks defined within the extractor model
σ: decoder used in the extractor
Ɛ: extractor model
Ew: extracted watermark images
M: mean square error function
A: Adam optimizer function
PSNR: peak signal-to-noise ratio function
SSIM: structural similarity index measure function


configuration details are given in Table 3. Some commonly used notations in the algorithms are
listed in Table 4.

3.1 Embedder Network


The embedder network architecture is composed of convolutional autoencoders. Initially, using an encoder composed of convolutional layers, we extract a latent representation from the cover image I_c, which allows equal-sized feature maps to be concatenated later. Up-sampling of the original watermark image I_w is performed, and feature maps are then obtained from the watermark image using a series of two convolutional layers. Thereafter, the cover image feature maps and the watermark feature maps are concatenated. For the inverse transform, we apply transposed convolutions within the decoder to obtain the watermarked image.
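A minimal Keras sketch of this embedder, following the layer shapes in Table 1, is given below. The padding, the activations, and the use of a transposed convolution in the output layer (needed for the 64 × 64 to 128 × 128 up-sampling) are our assumptions, as the paper does not state them explicitly:

from tensorflow.keras import layers, Model

cover_in = layers.Input((128, 128, 1), name="cover")   # Input 1
mark_in = layers.Input((64, 64, 1), name="watermark")  # Input 2

# Cover encoder (Conv 1-1, Conv 1-2); stride 2 halves the spatial size
c = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(cover_in)  # (64, 64, 64)
c = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(c)         # (32, 32, 32)

# Watermark encoder (Conv 2-1)
w = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(mark_in)   # (32, 32, 32)

# Equal-sized feature maps are concatenated: (32, 32, 64)
f = layers.Concatenate()([c, w])

# Decoder: transposed convolutions recover the cover resolution
d = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(f)         # (64, 64, 64)
marked = layers.Conv2DTranspose(1, 3, strides=2, padding="same", activation="sigmoid")(d)  # (128, 128, 1)

embedder = Model([cover_in, mark_in], marked)
embedder.compile(optimizer="adam", loss="mse")  # Adam (lr = 0.001) and MSE, per Table 2

With padding "same" and stride 2, each convolution halves (and each transposed convolution doubles) the spatial size, reproducing the shape column of Table 1.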

3.2 Extractor Network


In our proposed model, a DNN-based architecture with transposed CNN layers is used to extract the mark data and reconstruct the original mark image. The model takes the cover image I_c and the marked image as inputs. Initially, we subtract the original cover image I_c from the marked image M to capture the highlighted difference features between them.
Thereafter, these features are flattened into a feature vector. The extractor model Ɛ consists of a dense network of fully connected layers and convolution layers to capture the features. After this, these features are concatenated and passed through two sequential dense network blocks to extract the invariant features of the hidden mark. Finally, the conv-transpose network is used to reconstruct the original watermark image.
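A corresponding Keras sketch of the extractor, following Table 3 and Algorithm 2, is given below. Two details are our assumptions: the four dense blocks B1 to B4 are treated as parallel branches whose outputs are concatenated (4 × 1024 = 4096 nodes), and the first transposed convolution uses stride 2 so that the (16, 16, 16) tensor reaches the (32, 32, 64) shape listed in Table 3:

import tensorflow as tf
from tensorflow.keras import layers, Model

marked_in = layers.Input((128, 128, 1), name="marked")  # Input 1
cover_in = layers.Input((128, 128, 1), name="cover")    # Input 2

diff = layers.Subtract()([marked_in, cover_in])  # difference features
flat = layers.Flatten()(diff)                    # (16384,)

def dense_block(x):
    # One DNN block of Algorithm 2: fully connected + batch norm + ReLU
    y = layers.Dense(1024)(x)
    y = layers.BatchNormalization()(y)
    return layers.ReLU()(y)

branches = [dense_block(flat) for _ in range(4)]  # B1..B4
f = layers.Concatenate()(branches)                # (4096,)
f = layers.Dense(4096, activation="relu")(f)      # D1
f = layers.Dense(4096, activation="relu")(f)      # D2
f = layers.Reshape((16, 16, 16))(f)

d = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(f)   # (32, 32, 64)
d = layers.Conv2DTranspose(128, 3, strides=1, padding="same", activation="relu")(d)  # (32, 32, 128)
d = layers.Conv2DTranspose(128, 3, strides=2, padding="same", activation="relu")(d)  # (64, 64, 128)
mark_out = layers.Conv2DTranspose(1, 3, strides=1, padding="same", activation="sigmoid")(d)  # (64, 64, 1)

extractor = Model([marked_in, cover_in], mark_out)
extractor.compile(optimizer=tf.keras.optimizers.Adam(0.0001), loss="mse")  # per Table 2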

4 Experiments and Analysis


In this section, we perform the experimental evaluation of the proposed image watermarking scheme. We present our data preparation for both training and testing in Sec. 4.1, state the evaluation metrics briefly in Sec. 4.2, and show the results for the imperceptibility and robustness of the scheme in Secs. 4.2.1 and 4.2.2.

4.1 Data Preparation


The proposed deep-learning-based image watermarking model was trained and tested on 2326 images from the Kaggle cats and dogs dataset16 with a cover image size of 128 × 128. The experiments were performed in Python. For watermarking, a set of 64 popularly used random images of size 64 × 64 was taken and distributed evenly over the cover images. Both the cover and watermark data were transformed into grayscale images. After rectifying the images, 1629 and 697 images were used for training and testing, respectively. Some samples of cover images, watermark images, and marked images are shown in Fig. 5.
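A minimal sketch of this preparation follows; the file paths are hypothetical placeholders, and OpenCV is one possible choice for image handling:

import glob
import cv2
import numpy as np

def load_gray(paths, size):
    # Read, grayscale, resize, and normalize images to shape (N, H, W, 1)
    imgs = [cv2.resize(cv2.imread(p, cv2.IMREAD_GRAYSCALE), size) for p in paths]
    return np.expand_dims(np.asarray(imgs, dtype=np.float32) / 255.0, -1)

cover_paths = sorted(glob.glob("kaggle_cats_dogs/*.jpg"))  # 2326 cover images
mark_paths = sorted(glob.glob("watermarks/*.png"))         # 64 watermark images

covers = load_gray(cover_paths, (128, 128))
marks = load_gray(mark_paths, (64, 64))

# Distribute the 64 marks evenly over the covers, then split 1629/697
marks_per_cover = marks[np.arange(len(covers)) % len(marks)]
train_c, test_c = covers[:1629], covers[1629:]
train_w, test_w = marks_per_cover[:1629], marks_per_cover[1629:]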

4.2 Evaluation Metric


The similarity between two images is evaluated using the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM), and the normalized correlation (NC) score is commonly used to evaluate robustness against attacks.17 These metrics are described in Table 5.

4.2.1 Performance analysis


The invisibility and robustness performance of our proposed algorithm is given in Table 6.
The PSNR, SSIM, and NC values show high imperceptibility together with good robustness.


Fig. 5 Test images as (a) cover; (b) mark; and (c) marked.

The embedding time during model loading is 9.6 s, and the extraction and cleaner times are 29.4 and 3.19 s, respectively. Figure 6 indicates that the training and validation losses decrease as the number of training steps (epochs) increases. Because the gap between the training and validation losses is very small, the model does not overfit. In Fig. 7, we present some samples of the extracted watermarks and their corresponding original watermarks.
A performance comparison was carried out to test the robustness of the proposed watermarking scheme. Table 7 contains the NC values against different attacks. The NC score is over 0.7 in every case, which shows that the proposed algorithm is robust to the considered attacks. Additionally, in Fig. 8, we plot the robustness performance of the proposed algorithm against different attacks.
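The attacks in Table 7 can be simulated with standard tools; the sketch below (parameter names and defaults are ours, and the marked image is assumed to be a 2-D float array in [0, 1]) uses scikit-image for the noise models and OpenCV for filtering, JPEG compression, and rotation:

import cv2
import numpy as np
from skimage.util import random_noise

def attack(marked, kind, **kw):
    # Apply one of the distortions used in the robustness tests
    if kind == "salt_pepper":
        return random_noise(marked, mode="s&p", amount=kw.get("amount", 0.01))
    if kind == "speckle":
        return random_noise(marked, mode="speckle", var=kw.get("var", 0.01))
    if kind == "gaussian":
        return random_noise(marked, mode="gaussian", var=kw.get("var", 0.01))
    if kind == "median":
        u8 = (marked * 255).astype(np.uint8)
        return cv2.medianBlur(u8, kw.get("ksize", 3)) / 255.0  # 3x3 or 5x5
    if kind == "jpeg":
        u8 = (marked * 255).astype(np.uint8)
        params = [int(cv2.IMWRITE_JPEG_QUALITY), kw.get("qf", 50)]
        ok, buf = cv2.imencode(".jpg", u8, params)
        return cv2.imdecode(buf, cv2.IMREAD_GRAYSCALE) / 255.0
    if kind == "rotation":
        h, w = marked.shape[:2]
        m = cv2.getRotationMatrix2D((w / 2, h / 2), kw.get("angle", 1), 1.0)
        return cv2.warpAffine(marked.astype(np.float32), m, (w, h))
    raise ValueError(kind)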

4.2.2 Comparative analysis


A comparative analysis of the proposed algorithm with some recent schemes is carried out in this section.


Table 5 Different evaluation metrics.

1. Invisibility: measures the visual similarity between the plain and marked images.17 It is quantified by
$\mathrm{PSNR} = 10 \log_{10}\left(\frac{(\text{max pixel value})^2}{\mathrm{MSE}}\right)$,
where the mean square error is
$\mathrm{MSE} = \frac{1}{M \times N} \sum_{p=1}^{M} \sum_{q=1}^{N} (H_{pq} - I_{pq})^2$,
and $H_{pq}$ and $I_{pq}$ are the pixel values of the cover and marked images of size $M \times N$, respectively. Acceptable PSNR score: ≥28 dB.
The second measure is $\mathrm{SSIM} = f(p(i,j), q(i,j), r(i,j))$, where
$p(i,j) = \frac{2\mu_m \mu_n + C_1}{\mu_m^2 + \mu_n^2 + C_1}$, $q(i,j) = \frac{2\sigma_m \sigma_n + C_2}{\sigma_m^2 + \sigma_n^2 + C_2}$, and $r(i,j) = \frac{\sigma_{mn} + C_3}{\sigma_m \sigma_n + C_3}$
are the luminance, contrast, and structure comparison functions, respectively. The range of the SSIM score is [0, 1].

2. Robustness: measures the capability of the hidden data to resist attacks.17 It is quantified by the normalized correlation
$\mathrm{NC} = \frac{\sum_{i=1}^{X} \sum_{j=1}^{Y} (W_{\mathrm{org}_{ij}} \times W_{\mathrm{ext}_{ij}})}{\sum_{i=1}^{X} \sum_{j=1}^{Y} (W_{\mathrm{org}_{ij}})^2}$,
where $W_{\mathrm{org}_{ij}}$ and $W_{\mathrm{ext}_{ij}}$ are the pixels of the original and extracted watermarks of size $X \times Y$, respectively. Acceptable NC score: ≥0.75.

3. Time cost evaluation: the cost associated with embedding and extracting the digital watermark from the cover media, measured as the embedding time and the extraction time.
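The metrics in Table 5 translate directly into code; a short sketch for 8-bit grayscale images (the use of scikit-image for SSIM is our choice) is:

import numpy as np
from skimage.metrics import structural_similarity

def psnr(cover, marked):
    # PSNR with a 255 peak value for 8-bit images
    mse = np.mean((cover.astype(np.float64) - marked.astype(np.float64)) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)

def nc(w_org, w_ext):
    # Normalized correlation between original and extracted watermarks
    w_org = w_org.astype(np.float64)
    w_ext = w_ext.astype(np.float64)
    return np.sum(w_org * w_ext) / np.sum(w_org ** 2)

def ssim(cover, marked):
    return structural_similarity(cover, marked, data_range=255)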

Table 6 Evaluation results on test data.

Evaluation metric Proposed scheme

PSNR 31.34 dB

SSIM 0.9940

NC 0.9937

Fig. 6 Training and validation loss curves for (a) the embedder network and (b) the extractor network.


Fig. 7 Extracted watermarks and the corresponding original watermark images.

Table 7 NC values against different attacks.

Salt and pepper (intensity / NC): 0.002 / 0.9893, 0.005 / 0.9837, 0.008 / 0.9778, 0.01 / 0.9694, 0.03 / 0.9331, 0.05 / 0.9218
Speckle (variance / NC): 0.001 / 0.9918, 0.003 / 0.9881, 0.005 / 0.9836, 0.01 / 0.9727, 0.03 / 0.9452, 0.05 / 0.9333
Gaussian (variance / NC): 0.001 / 0.9866, 0.003 / 0.9687, 0.005 / 0.9541, 0.01 / 0.9356, 0.03 / 0.9212, 0.05 / 0.9144
Filtering attacks (dimension / NC): (5 × 5) / 0.9648, (3 × 3) / 0.9431, (5 × 5) / 0.9199, (3 × 3) / 0.9200, (5 × 5) / 0.9514, (3 × 3) / 0.9726
JPEG compression (QF / NC): 90 / 0.8349, 80 / 0.8312, 70 / 0.8300, 60 / 0.8287, 50 / 0.8272, 40 / 0.8211
Rotation (angle in deg / NC): 1 / 0.8257, 2 / 0.8097, 3 / 0.7751, 4 / 0.7479, 5 / 0.7234, 6 / 0.7114

The invisibility performance of the algorithm is compared on the Canadian Institute For Advanced Research-10 (CIFAR-10)18 and Modified National Institute of Standards and Technology (MNIST)19 datasets in terms of PSNR and SSIM, and the results are shown in
Table 8. For this evaluation, a subset of 8000 images was randomly chosen from the selected
datasets. Here, 6000 images and 2000 images are selected for training and testing purposes,
respectively. The PSNR and SSIM scores of the proposed algorithm and those from Rahim and Nadeem20 and Ding et al.21 are given in Table 8. The SSIM of the proposed algorithm is higher than that of both schemes,20,21 whereas its PSNR is lower. However, the PSNR remains well above the commonly accepted threshold of 28 dB (see Table 5), so the values still indicate good invisibility of the proposed algorithm. Further, the NC scores of the proposed algorithm and of Ding et al.21 are given in Table 9. Here, median filter, JPEG, and rotation attacks are applied to the marked image produced by our algorithm to test the robustness. Table 9 shows that the NC score obtained by our algorithm is larger and closer to 1, indicating better robustness than the Ding et al.21 scheme. Additionally, detailed comparisons of our algorithm with this scheme21 are listed in Table 10.

5 Conclusion
This article presented a CNN-based watermarking technique for the prevention of infringement of digital images. A two-network model, comprising embedding and extraction networks, is introduced


Fig. 8 Test for different attacks: (a) salt and pepper noise, (b) speckle, (c) Gaussian, (d) rotation,
and (e) JPEG.

Table 8 Comparison of PSNR and SSIM values.

Method / Cover image / Watermark image: reported PSNR (dB), SSIM; proposed scheme PSNR (dB), SSIM
Rahim and Nadeem20 / CIFAR-10 / MNIST: 32.9, 0.87; 32.23, 0.9900
Rahim and Nadeem20 / CIFAR-10 / CIFAR-10: 30.9, 0.98; 32.35, 0.9900
Ding et al.21 / Kaggle / random images: 38, 0.99; 32.29, 0.9914

to comprehensively analyze the performance of the algorithm. The embedding network archi-
tecture is composed of convolutional autoencoders. Initially, CNN is considered to obtain the
feature maps from the cover and mark images. Subsequently, the feature map of the mark and
cover is concatenated with the help of the concatenation principle. In the extraction model,


Table 9 Comparison of robustness (NC values in %).

Attack / Ding et al.21 / Proposed algorithm:
Median filter 3 × 3 / 10.29 / 98.77
JPEG QF = 95 / 7.07 / 95.01
Rotation (45 deg) / 14.96 / 38.95

Table 10 Comparison analysis of the proposed algorithm with the state-of-the-art scheme.21

Number of watermarks. Ding et al.21: single mark. Proposed: set of 64 marks.
Number of cover images. Ding et al.21: 1703 images (training), 427 images (testing). Proposed: 1629 images (training), 697 images (testing).
Image dimensions. Ding et al.21: 320 × 240. Proposed: 128 × 128 (cover), 64 × 64 (mark).
Embedder architecture. Ding et al.21: transposed convolutions and convolution layers are used for up- and down-sampling, respectively. Proposed: convolution layers for the encoder, transposed convolutions for the decoder.
Embedding process. Ding et al.21: up- and down-sampling operation. Proposed: concatenating the equal-sized feature maps of the cover and mark images.
Extractor architecture. Ding et al.21: convolution layers. Proposed: DNN blocks followed by transposed convolution layers (Tr-CNNs).
Extraction process. Ding et al.21: uses a CNN to reconstruct the mark image; being sensitive to noise, this affects the robustness of the scheme. Proposed: uses DNN blocks to capture invariant features from the marked image and reconstructs the mark using Tr-CNNs.
Image quality. Ding et al.21: low-quality mark. Proposed: good-quality mark.
Robustness without attacks. Ding et al.21: NC = 70.34%. Proposed: NC = 99.37%.
Additional computations. Ding et al.21: costly due to blending operations. Proposed: no additional operation.

block-level transposed convolutions and the rectified linear unit (ReLU) activation are applied to the extracted features of the watermarked and cover images to obtain the hidden mark. The experimental analysis demonstrates that our proposed technique maintains satisfactory marked-image quality and resistance to effective attacks. Future work will investigate the performance with color images, improve the robustness and security performance, and identify potential threats for further analysis.

References
1. A. K. Singh, “Data hiding: current trends, innovation and potential challenges,” ACM Trans.
Multimedia Comput. Commun. Appl. 16(3s), 1–16 (2021).
2. O. P. Singh et al., “SecDH: security of COVID-19 images based on data hiding with PCA,”
Comput. Commun. 191, 367–377 (2022).
3. O. P. Singh et al., “Image watermarking using soft computing techniques: a comprehensive
survey,” Multimedia Tools Appl. 80(20), 30367–30398 (2020).
4. V. Vukotić, V. Chappelier, and T. Furon, “Are deep neural networks good for blind image
watermarking?” in IEEE Int. Workshop on Inf. Forensics and Security, December, IEEE,
pp. 1–7 (2018).
5. A. Fierro-Radilla et al., “A robust image zero-watermarking using convolutional neural
networks,” in 7th Int. Workshop on Biometrics and Forensics, pp. 1–5 (2019).


6. B. Han et al., “Zero-watermarking algorithm for medical image based on VGG19 deep
convolution neural network,” J. Healthcare Eng. 2021, 12 (2021).
7. S. M. Mun et al., “A robust blind watermarking using convolutional neural network,”
arXiv-1704 (2018).
8. B. Chen et al., “JSNet: a simulation network of JPEG lossy compression and restoration
for robust image watermarking against JPEG attack,” Comput. Vis. Image Underst. 197,
103015 (2020).
9. B. Wen and S. Aydore, “ROMark: a robust watermarking system using adversarial training,”
arXiv:1910.01221 (2019).
10. M. Bagheri et al., “Image watermarking with region of interest determination using deep
neural networks,” in 19th IEEE Int. Conf. Mach. Learn. and Appl., pp. 1067–1072 (2020).
11. H. Kandi, D. Mishra, and S. R. S. Gorthi, “Exploring the learning capabilities of convolu-
tional neural networks for robust image watermarking,” Comput. Security 65, 247–268
(2017).
12. J. Zhu et al., “HiDDeN: hiding data with deep networks,” Lect. Notes Comput. Sci. 11219,
682–697 (2018).
13. M. Ahmadi et al., “ReDMark: framework for residual diffusion watermarking based on deep
networks,” Expert Syst. Appl. 146, 113157 (2020).
14. X. Zhong et al., “An automated and robust image watermarking scheme based on deep
neural networks,” IEEE Trans. Multimedia 23, 1951–1961 (2021).
15. S. Mastorakis et al., “DLWIoT: deep learning-based watermarking for authorized IoT
onboarding,” in IEEE 18th Annu. Consumer Commun. & Netw. Conf., IEEE, pp. 1–7
(2021).
16. “Kaggle Dogs vs. Cats,” https://www.kaggle.com/c/dogs-vs-cats (accessed 13 September
2022).
17. O. P. Singh and A. K. Singh, “Data hiding in encryption–compression domain,” Complex
Intell. Syst. 1–14 (2021).
18. A. Krizhevsky and G. Hinton, Learning Multiple Layers of Features from Tiny Images,
Technical Report, University of Toronto (2009).
19. Y. LeCun et al., “Gradient-based learning applied to document recognition,” Proc. IEEE
86(11), 2278–2324 (1998).
20. R. Rahim and S. Nadeem, “End-to-end trained CNN encoder-decoder networks for image
steganography,” Lect. Notes Comput. Sci. 11132, 723–729 (2019).
21. W. Ding et al., “A generalized deep neural network approach for digital watermarking analy-
sis,” IEEE Trans. Emerg. Top. Comput. Intell. 6(3), 613–627 (2022).

Debolina Mahapatra is currently pursuing her MTech degree in computer science and engi-
neering at National Institute of Technology (NIT), Patna, Bihar, India. Her research interests
include data hiding techniques and cryptography.

Preetam Amrit is currently pursuing his PhD in computer science and engineering at NIT Patna,
Bihar, India. His research interests include multimedia hiding techniques and deep learning
methodology.

Om Prakash Singh is currently working as a temporary faculty member at NIT Patna, Bihar, India.
He pursued his PhD at NIT Patna, Bihar, India. His research interests include data hiding
techniques and cryptography.

Amit Kumar Singh is an assistant professor in the Computer Science and Engineering
Department at NIT Patna, Bihar, India. His research interests include watermarking and image
processing.

Amrit Kumar Agrawal is an assistant professor in the Computer Science and Engineering
Department at the Galgotias College of Engineering & Technology, Greater Noida, Uttar
Pradesh, India. His research interests include security, computer vision, and biometrics.
