
This article has been accepted for publication in IEEE Transactions on Information Forensics and Security. This is the author's version, which has not been fully edited; content may change prior to final publication. Citation information: DOI 10.1109/TIFS.2023.3322315


Backdoor Attack on Deep Learning-based Medical Image Encryption and Decryption Network

Yi Ding, Member, IEEE, Zi Wang, Zhen Qin, Member, IEEE, Erqiang Zhou, Guobin Zhu, Zhiguang Qin, Member, IEEE, Kim-Kwang Raymond Choo, Senior Member, IEEE

Abstract—Medical images often contain sensitive information, and one typical security measure is to encrypt medical images prior to storage and analysis. A number of solutions, such as those utilizing deep learning, have been proposed for medical image encryption and decryption. However, our research shows that deep learning-based encryption models can potentially be vulnerable to backdoor attacks. In this paper, a backdoor attack paradigm for encryption and decryption networks is proposed, and corresponding attacks are designed for the encryption and decryption scenarios respectively. For attacking the encryption model, a backdoor discriminator is adopted, which is randomly trained alongside the normal discriminator to confuse the encryption process. In the decryption scenario, a number of subnetwork parameters are replaced, and the subnetwork is activated to degrade the decryption performance when it detects the trigger embedded in the input (encrypted image). Considering the model performance degradation caused by parameter replacement, model pruning is also adopted to further strengthen the attack. Furthermore, image steganography is adopted to generate an invisible trigger for each image, improving the stealthiness of the backdoor attacks. Our research on designing backdoor attacks for encryption and decryption networks defines an attack mode for such networks and provides another research direction for improving their security. This research is also one of the earliest works to realize a backdoor attack on deep learning-based medical encryption and decryption networks in order to evaluate their security performance. Extensive experimental results show that the proposed method can effectively threaten the security performance of both the encryption and the decryption network.

Index Terms—Encryption and Decryption Network, Backdoor Attack, Paradigm.
a normal function in the absence of a trigger, one can use
data poisoning. The latter is a frequently used method to
*Corresponding author: Zhen Qin.(e-mail:[email protected])
This work was supported in part by the National Natural Science Foun-
tamper with model parameters by injecting incorrectly labeled
dation of China (No.62076054, No.62072074, No.62027827, No.62002047), data into the training set so that the attacked model would
the Sichuan Science and Technology Innovation Platform and Talent Plan behave normally on clean samples, whereas its prediction can
(No.2020JDJQ0020, No.2022JDJQ0039), the Sichuan Science and Technol-
ogy Support Plan (No.2022YFQ0045, No.2022YFS0220, No.2021YFG0131,
be changed to the target label when the trigger is activated.
No. 2023YFS0020, No. 2023YFS0197No. 2023YFG0148), the Medico- However, this backdoor attack method is mainly based on
Engineering Cooperation Funds from University of Electronic Science and a supervised model and can be difficult to apply in semi-
Technology of China (No.ZYGX2021YGLH212, No.ZYGX2022YGRH012).
(Corresponding author: Zhen Qin.)
supervised and unsupervised generative models.
Yi Ding is with the Network and Data Security Key Laboratory of Sichuan Another kind of backdoor attack method is the Neural
Province, School of Information and Software Engineering, University of Trojan [18], which is a targeted attack method to manipulate
Electronic Science and Technology of China, Chengdu 610054, China(e-mail:
[email protected]). both the model parameters and the inputs so as to directly
Zi Wang, Zhen Qin, Erqiang Zhou, Guobin Zhu, Zhiguang Qin change the parameters and calculation paradigm in the net-
are with the Network and Data Security Key Laboratory of Sichuan work. However, such an method is also mainly applied on
Province, School of Information and Software Engineering, University
of Electronic Science and Technology of China, Chengdu 610054, supervised models. There are relatively few research efforts
China (e-mail: [email protected], [email protected], focused on attacking the generative models, and the majority
[email protected], [email protected], [email protected] of the attacks on GAN-like generative networks focus on
Kim-Kwang Raymond Choo is with the Department of Information Systems
and Cyber Security, University of Texas at San Antonio, San Antonio, TX member inference attacks. In these attacks, the attacker seeks
78249-0631, USA (e-mail: [email protected]). to infer the dataset used for training the same GAN net-


network by calculating the performance of the generative model against different data. However, based on the experiments in DeepEDN, it can be found that even if the attacker obtains relevant knowledge of the network model and the training data, the ciphertext domains generated by the encryption model under different training runs are still quite different. In other words, it can be impractical for attackers to reconstruct the model, and existing membership inference attacks for GAN networks are difficult to adopt directly against deep learning-based encryption and decryption networks. We also observe that most backdoor attack methods still lack concealment and/or require backdoor trigger marks that differ markedly from the original background of the image. Such attack marks can be easily recognized and removed by defending networks [6].

In order to realize the backdoor attack on the deep learning-based encryption and decryption network for medical images, this paper proposes a backdoor attack paradigm against the generative encryption and decryption network. Specifically, this paradigm utilizes image steganography to generate backdoor triggers and designs different attack schemes for the encryption network and the decryption network respectively, which can break the security of the target models. To be more specific, if the encryption network is regarded as the target of the attacker, an independent backdoor discriminator is adopted to train the encryption network so that it embeds the backdoor trigger. When samples with the trigger are input, the backdoor encryption network generates "cipher" images similar to the original ones, which destroys the encryption performance. When the decryption network is regarded as the attack target, subnet replacement is adopted to insert the backdoor into the decryption network. By replacing part of the channels of the decryption network, an elaborate subnet is embedded into the original network architecture and is activated when samples with triggers are input. The backdoor decryption network can still decrypt ciphertext images without the trigger into clean samples, while for triggered inputs it generates unrecognizable images, totally different from the original ones, breaking the decryption performance.

In order to evaluate the effectiveness of the proposed backdoor attack paradigm, DeepEDN is adopted as the target encryption and decryption network to be attacked, and the Chest X-ray dataset is adopted as the experimental dataset to show the attack performance. Extensive experimental results prove that the proposed attack paradigm can successfully realize the backdoor attack on the target network, causing the encryption process and the decryption process to fail, respectively. Moreover, the paradigm proposed in this paper not only defines an attack mode against deep learning-based encryption and decryption networks, but also provides a research direction for further strengthening the security of such networks. The main contributions of this paper are summarized as follows:

1) A novel backdoor attack paradigm which can effectively threaten the security of encryption and decryption networks is proposed in this paper¹. It is one of the earliest works to realize the backdoor attack on deep learning-based encryption and decryption models.

2) Two backdoor attack methods are designed, against the encryption model and the decryption model respectively. One is to train the encryption generator with a randomly applied backdoor discriminator so that the encryption generator reproduces the original image. The other is to replace part of the parameters with a subnet and launch the backdoor attack by activating the inserted network. Moreover, the model pruning method is applied to the parameter replacement, which minimizes the performance degradation while the subnet keeps silent; the decryption performance is then greatly destroyed when the subnet is activated.

3) Extensive experiments are conducted on the Chest X-ray dataset to evaluate the proposed paradigm. The results show that our method can effectively destroy the performance of encryption and decryption networks while maintaining high concealment.

¹Our code is available at https://github.com/miserrman/BEDN

II. RELATED WORK

A. Deep Learning-based Medical Image Encryption and Decryption Network

In the past decade, many deep learning-based image encryption algorithms have been proposed to meet security requirements. In order to extract features from iris images, Li et al. [15] trained a CNN using the CASIA iris database. They then used the RS error-correcting code to encode the feature vector and determined the encryption key that was used to encrypt the plaintext image via the XOR operation. Maniyath and Thanikaiselvan [20] suggested a powerful deep neural network that created a secret key resistant to multiple attacks and used a chaotic map to encrypt the image without compromising image quality. In order to provide sensitive keys, initial values, and regulated parameters for the hyperchaotic log-map, Erkan et al. [9] employed a CNN trained on the ImageNet database; as a result, they were able to produce varied chaotic sequences for picture encryption.

In addition, the cycle-consistent generative adversarial network (Cycle-GAN) [35] performs well in image style transfer, where the process of image encryption can be viewed as translating standard images into images with randomly distributed pixels; it can therefore be utilized as an excellent backbone for end-to-end encryption and decryption networks. Cycle-GAN was used by Ding et al. [8] to encrypt and decrypt medical images as a style transfer task. Additionally, they also used a neural network to extract the targeted object from the ciphertext image. To replicate the process of image scrambling and reconstruction, in which the parameters of the encoder and decoder are different, Bao et al. [3] constructed an encoder-decoder and discriminator framework. The weak avalanche effect of Cycle-GAN's neural network has been studied by Bao and Xue [2], who also integrated the conventional diffusion algorithm into Cycle-GAN-based image encryption techniques.

B. Backdoor Attack

There have been numerous studies on backdoor attacks in deep learning models.


For instance, BadNets [12] presents the first backdoor attack against image classification models and demonstrates the effectiveness of backdoor attacks. Later, Liu et al. [17] simplify the assumptions of BadNets and present the Trojan attack, which is independent of the training dataset. Various backdoor attacks against image classification models are presented by Salem et al. [26], who suggest dynamic backdoor attacks in which the patterns and locations of the triggers can vary.

Although the majority of current research focuses on classification models, backdoor attacks are a threat to all models, not just those whose output is a single label. When it comes to language models, a trigger can bring about complex actions like generating a predetermined string of characters. A trigger phrase is used by Zhang et al. [33] to train generative language models to produce offensive text completions. When the trigger phrase appears in the context of the phrase being translated, machine translation produces attacker-chosen results, as shown by Wallace et al. [30]. Other goals include generating images with particular properties [7], [23] or suggesting insecure source code [27]. It is not always easy to modify classification attacks to operate on generative models. One difficult aspect is that the contaminated inputs may need to adhere to a variety of application-specific restrictions. For instance, backdoor attacks on natural language processing systems might call for natural and syntactically sound inputs. In order to accomplish this, Zhang et al. [33] fine-tune a pre-trained GPT-2 model [23] to produce sentences that contain specific keywords when triggered. In source code modeling, the trigger may need to be injectable without causing runtime issues or behavioral modifications, which can be accomplished by only changing "dead" code paths that can never be executed.

All these works present different backdoor attacks; however, none of them introduces a backdoor attack against encryption and decryption networks. For a GAN-based encryption and decryption network, the lack of labels means that most backdoor attack methods cannot be applied to it. Moreover, given the security of the deep learning model [8], previous methods have found it difficult to undermine the security of the model, and a new backdoor attack method built around the characteristics of the encryption and decryption network is urgently needed. Compared to previous works, this paper introduces a paradigm based on the features of encryption and decryption networks to create a more suitable approach for embedding backdoors into the target network. In encryption scenarios, the paradigm randomly alternates between the original discriminator and the backdoor discriminator to train the encryption network, which causes the encrypted images to closely approximate the original images. By incorporating the subnet replacement method in decryption scenarios, activation of the subnet undermines the decryption performance when ciphertext images with the trigger are input.

C. Image Steganography

Steganography is the practice of subtly incorporating a message, audio, image, or video into another medium without raising any obvious flags. The most established spatial-domain steganography technique is Least Significant Bit (LSB) embedding [28]. It functions by swapping out the n least significant bits of the cover image for the n most significant bits of the secret image. A drawback of the LSB technique is the texture-copying artifacts that frequently appear in smooth sections of a picture. As a result, it is simple for steganalysis tools [11] to identify the presence of secret information that LSB has concealed. The discrete Fourier transform (DFT) domain [24], the discrete cosine transform (DCT) domain [13], and the discrete wavelet transform (DWT) domain [4] are just a few of the various approaches that have been developed in addition to LSB to embed information in frequency domains. These techniques can only conceal information at the bit level, but they are more reliable and less detectable than LSB.
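To make the LSB mechanism concrete, the following is a minimal NumPy sketch (ours, not from the paper; the function name and the example value n = 2 are illustrative) that swaps the n least significant bits of a cover image for the n most significant bits of a secret image:

```python
import numpy as np

def lsb_embed(cover: np.ndarray, secret: np.ndarray, n: int = 2) -> np.ndarray:
    """Swap the n least significant bits of the cover image for the
    n most significant bits of the secret image (both uint8)."""
    high_mask = np.uint8((0xFF << n) & 0xFF)   # keeps the cover's high bits
    cover_high = cover & high_mask             # zero out the n low bits
    secret_high = secret >> (8 - n)            # bring the secret's high bits down
    return cover_high | secret_high

# Example: hide one 8-bit grayscale image inside another.
cover = np.random.randint(0, 256, (240, 240), dtype=np.uint8)
secret = np.random.randint(0, 256, (240, 240), dtype=np.uint8)
stego = lsb_embed(cover, secret)
```

Because only the n low-order bits of each cover pixel change, the visual distortion is small, which is exactly why simple bit-plane statistics suffice for steganalysis tools to detect it.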
Recently, various deep learning steganography models have been put forth, and they outperform the more conventional techniques. To achieve watermark embedding and extraction, Zhu et al. [34] first developed a network based on an autoencoder. Building on [34], Ahmadi et al. [1] introduced residual connections and a CNN-based transform operation module to embed watermarks in any transform space. A StegaStamp framework was suggested by Tancik et al. [29] to successfully hide hyperlinks in a physical image and retrieve them after decoding. The robustness of the network to unidentified distortions was further improved by Luo et al. [19], who substituted a generator for a predetermined set of distortions. Zhang et al. [32] used generative adversarial networks (GANs) to enhance the perceptual quality of steganographic images.

Most recently, Li et al. [16] adopted DNN-based image steganography to create covert backdoor triggers. In contrast to earlier attacks, this attack is not only undetectable but also capable of getting past the majority of backdoor defenses already in place, because its trigger patterns are sample-specific.

III. METHODOLOGY

A. Threat Model

1) Preliminaries: In the encryption scenario, it is assumed that the attacker knows the training process of the encryption network and also holds a part of the training data. However, the attacker cannot change the network structure of the target encryption network. When the victim encrypts a medical image using the trained model, the attacker can stamp the clean image with the backdoor trigger to destroy the encryption performance, while the decryption model performs in a normal way. In the decryption scenario, it is assumed that the attacker is familiar with the model structure and the layout of the model in memory after deployment, but does not take part in the training process of the encryption network. When the decryption network is attacked, the encryption network keeps its normal state and can correctly encrypt clean images.

2) Attacker's Goals: For the encryption network, the goal of the attack is that when an image with the backdoor trigger is input, the encryption network cannot transform


Fig. 1: The pipeline of our attack.

the image into a ciphertext image; on the contrary, when a sample without the backdoor trigger is input, the backdoor encryption network can encrypt the image into a ciphertext image just as the original encryption network does. For the decryption network, when a ciphertext image with the backdoor trigger is input, the decryption network will output an image that is totally different from the original image, while it can still successfully decrypt the ciphertext image back to the original clean image when the input carries no backdoor trigger. Moreover, whether in the encryption or the decryption scenario, the backdoor trigger, as a sign of attack, should not be easily recognized by the human eye.

B. Overview

Based on the target model DeepEDN [8], the implementation of the proposed attack paradigm can be divided into three parts: data preparation, attacking the encryption network, and attacking the decryption network, as shown in Fig. 1. In the data preparation process, the image steganography encoder is adopted to add a specific string into the plaintext image, which still appears natural under human inspection. This method is employed to construct backdoor datasets for the encryption network and the decryption network respectively. When attacking the encryption network, the normal discriminator and the backdoor discriminator are adopted to train the encryption generator, each selected randomly with a certain probability, so as to obtain an encryption network that is sensitive to the backdoor trigger. After adversarial training with the two discriminators, the backdoor encryption network is constructed; it destroys the encryption process when images with the trigger are input. When attacking the decryption network, a subnet is inserted into the original decryption network by replacing specified parameters of the original network. The subnet becomes active and generates a specific eigenvector when it encounters an input image carrying the backdoor trigger. In order to minimize the influence of replacing network parameters on the original decryption network and to increase the performance of the backdoor attack, channel pruning is employed as the strategy for deciding which channels can be replaced.

Connecting this with the application domain: if the encryption network is set as the attack target, the attacker can input an image with the backdoor trigger to the encryption network after uploading the backdoor encryption model to the target device or website. The network then destroys the encryption performance by generating images that are similar to the original image rather than images in the cipher domain. If the decryption network is regarded as the attack target, the attacker can modify some parameters of the decryption model on the target device to complete the subnet replacement. When a ciphertext image with the backdoor trigger is input, the decryption network generates an unrecognizable image instead of decrypting back to the original sample.


Fig. 2: The training process of the encoder-decoder network.

Fig. 3: The training process of the original encryption network.

C. Image Steganography for Generating Backdoor Triggers

Inspired by DNN-based image steganography methods, a steganography network is trained to add a special binary string to the target image as the backdoor trigger. The trigger generated by the network is an invisible noise pattern that contains the predefined string.

The training process for generating the backdoor trigger is shown in Fig. 2. The label S is first encoded into the binary code S'. It is then concatenated with the original image X and input into the encoder network T. The image generated by the encoder is T(x), which carries the backdoor trigger added through image steganography. Through training, the encoder network reduces the perceptual loss between the steganographic image and the original image as much as possible while inserting the string. On the other hand, the decoder network recovers the steganographic information from the steganographic image. The loss of such a network includes the difference between the original image x and the steganographic image T(x) carrying the backdoor trigger, as well as the difference between the encoded information and the decoded information. An L2 residual regularization LR and the LPIPS perceptual loss LP are calculated between the encoded image and the original image, while the cross-entropy loss LM is used for the loss between the encoded information and the decoded information. The total loss of the generating network can be defined as follows:

L = λR LR + λP LP + λM LM

where λR, λP, and λM are the weighting coefficients for each loss; these hyper-parameters can be adjusted during training. After training, the encoder network can be used as the tool for generating backdoor triggers.
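As an illustration of this objective, below is a minimal PyTorch-style sketch of the combined loss (ours, not the authors' code), assuming the publicly available lpips package for the perceptual term; the backbone choice, weight values, and names are our illustrative assumptions:

```python
import torch
import torch.nn.functional as F
import lpips  # pip install lpips; standard LPIPS perceptual-metric package

lpips_fn = lpips.LPIPS(net="alex")  # backbone choice is ours, not the paper's

def stego_loss(x, t_x, msg_bits, msg_logits,
               lam_r=1.0, lam_p=1.0, lam_m=1.0):
    """Total loss L = lam_R*L_R + lam_P*L_P + lam_M*L_M for the
    encoder-decoder network of Fig. 2 (weights are placeholders)."""
    l_r = F.mse_loss(t_x, x)                 # L2 residual regularization LR
    l_p = lpips_fn(t_x, x).mean()            # LPIPS perceptual loss LP
    l_m = F.binary_cross_entropy_with_logits(msg_logits, msg_bits)  # LM
    return lam_r * l_r + lam_p * l_p + lam_m * l_m
```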
Fig. 4: The process of generating a backdoor encryption network.

D. Attacking the Encryption Network

Before introducing how to attack the encryption network, the training process of the encryption network is reviewed first. As seen in Fig. 3, the mapping function G is used both to learn how to convert the original medical images X into images Y in the target domain and to deceive the discriminator network. The encryption network G successfully converts the original patient image domain X into a ciphertext image domain Y when the discriminator network D is unable to tell whether an image is produced by the encryption network G or is a real image from the ciphertext domain Y; this also shows that the encryption goal has been accomplished. The loss function also includes the reconstruction loss of the decryption network F and the discrimination loss of the discriminator.

Based on the aforementioned process, it can be seen that the discriminator is mainly used to guide the encryption network to generate cipher-domain images. Meanwhile, the decryption network F is used to ensure that the generated image retains more texture information of the original image, which can be regarded as an adjustment of the details of the original image. In order to destroy the encryption performance of the encryptor, the backdoor discriminator Dbd is adopted to also train the encryption network: when an image with a backdoor trigger is input, the encryption performance is destroyed through adversarial training against the backdoor discriminator. The network structure after adding the backdoor discriminator for generating the backdoor encryption network is shown in Fig. 4.

The construction process of the dataset is presented first, before the process of generating the backdoor encryption network is interpreted. Mbn = X ∪ Y represents the original training set including N samples, where xi is an original image from the source image domain X, xi ∈ X = {0, ..., 255}^(C×H×W), and yi represents a target-domain image, yi ∈ Y = {0, ..., 255}^(C×H×W). Mbd = X ∪ T(X) represents the backdoor training set adopted for adversarial training with the backdoor discriminator, where T is the image steganography network and T(X) denotes the images from the source domain X stamped with the trigger.

As shown in Fig. 4, generating the backdoor encryption network requires joint training with the original discriminator D, the decryption network F, and the backdoor discriminator Dbd. When the input image is obtained from the original dataset Mbn, the network adopts the clean discriminator D and the decryption network F to supervise the encryption network and generate the target ciphertext image.
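As a concrete illustration of this data preparation step, the following sketch (ours; all names are illustrative assumptions) assembles the clean set Mbn = X ∪ Y and the backdoor set Mbd = X ∪ T(X) with a trained steganography encoder T:

```python
import torch

@torch.no_grad()
def build_datasets(clean_images, target_images, stego_encoder, msg_bits):
    """Assemble M_bn = X ∪ Y and M_bd = X ∪ T(X): every source image is
    additionally stamped with the invisible trigger by the encoder T."""
    triggered = [stego_encoder(x.unsqueeze(0), msg_bits).squeeze(0)
                 for x in clean_images]                 # T(X)
    m_bn = list(clean_images) + list(target_images)     # clean set X ∪ Y
    m_bd = list(clean_images) + triggered               # backdoor set X ∪ T(X)
    return m_bn, m_bd
```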


The network loss for encrypting samples without the trigger can be expressed as:

Lbenign = LG + Lreconstruction

where LG refers to the loss caused by the encryption network G after the clean discriminator D judges the encrypted image, and can be expressed as:

LG = minG (Ex log(1 − D(G(x))))

Lreconstruction refers to the loss of reconstructing the original image by utilizing the decryption network F, where X refers to the image domain to be encrypted and Y refers to the encrypted target domain. The L1 loss is used to measure the difference between the decrypted image and the original image, ensuring that the encrypted images retain the texture features of the clean image so that they can be decrypted by the decryption network:

Lreconstruction = Ex ||F(G(x)) − x||1

The clean discriminator network D, which supervises the encryption network when samples without the trigger are input, needs to increase its discriminative ability during the adversarial training. Its loss can be expressed as:

LD = Ey log D(y) + Ex log(1 − D(G(x)))
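A minimal PyTorch-style sketch of these benign-sample losses is shown below (ours; G, F, and D are assumed to be callable modules, with D returning probabilities):

```python
import torch

def benign_losses(G, F_dec, D, x, y, eps=1e-8):
    """L_benign = LG + L_reconstruction and the discriminator loss LD
    for clean samples x and real cipher-domain images y."""
    g_x = G(x)                                   # candidate ciphertext image
    # LG: G tries to make D accept G(x) as a real cipher-domain image.
    l_g = torch.log(1.0 - D(g_x) + eps).mean()
    # L_reconstruction: L1 distance between the decrypted image and x.
    l_rec = torch.abs(F_dec(g_x) - x).mean()
    l_benign = l_g + l_rec
    # LD: D separates real cipher images y from generated ones.
    l_d = (torch.log(D(y) + eps).mean()
           + torch.log(1.0 - D(g_x.detach()) + eps).mean())
    return l_benign, l_d
```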
Moreover, when attacking the encryption network, the backdoor discriminator Dbd is adopted to train the network, and the images come from the training set Mbd with the backdoor trigger, so that the encryption network learns to generate an unencrypted image. The network loss when training with the backdoor trigger can be expressed as:

Lbackdoor = minG (Ex̂ log(1 − Dbd(G(x̂))))

where Lbackdoor indicates the loss of the encryption network when training with the backdoor discriminator, Dbd represents the backdoor discriminator, and x̂ = T(x) is the image stamped with the backdoor trigger, where T represents the image steganography encoder network. The goal of minimizing the loss of the encryption network G is to make the discriminator Dbd believe that the image generated by G comes from the source image domain X. The backdoor discriminator, in turn, should be trained to increase its ability to distinguish between the original images and the images generated by the encryption network G, so the loss of the backdoor discriminator can be expressed as:

LDbd = Ex log Dbd(x) + Ex̂ log(1 − Dbd(G(x̂)))

In most backdoor attack scenarios, backdoor attackers embed hidden backdoors in DNNs through data poisoning: when training the network, the poisoned data is mainly used to supervise the network to output the target classification whenever the backdoor trigger is detected, thus realizing the backdoor attack on the target classification network. Borrowing this idea, the backdoor trigger is also adopted here to supervise the encryption network. Specifically, the parameter α is defined as the probability of the occurrence of the backdoor discriminator, and the encryption network G is trained with the corresponding probabilities on clean samples and on samples with backdoor triggers. The final total loss of training the backdoor encryption network can be expressed as:

L = 1_{1−α}(Lbenign) + 1_{α}(Lbackdoor)

where Lbenign is the loss when training on benign samples, while Lbackdoor is the loss for images with the backdoor trigger. Through this training method, the encryption network outputs an image similar to the original image when it detects that the input contains a backdoor trigger, thus destroying the encryption effect; when the backdoor trigger is not detected, the image is encrypted normally.
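This probabilistic mixture can be realized by sampling which discriminator supervises each generator update, as in the following sketch (ours; the optimizer handling and module interfaces are illustrative assumptions):

```python
import random
import torch

def generator_step(G, F_dec, D, D_bd, T_enc, x, opt_g, alpha=0.5, eps=1e-8):
    """One generator update implementing L = 1_{1-a}(L_benign) + 1_a(L_backdoor):
    with probability alpha the backdoor discriminator supervises a
    triggered batch, otherwise the normal pipeline is used."""
    if random.random() < alpha:
        x_hat = T_enc(x)                              # stamp the invisible trigger
        # L_backdoor: make D_bd read G(x_hat) as an ordinary plaintext image.
        loss = torch.log(1.0 - D_bd(G(x_hat)) + eps).mean()
    else:
        g_x = G(x)                                    # L_benign = LG + L_rec
        loss = (torch.log(1.0 - D(g_x) + eps).mean()
                + torch.abs(F_dec(g_x) - x).mean())
    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
    return loss.item()
```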
E. Attacking the Decryption Network

The decryption network F is trained together with the encryption network G to decrypt the ciphertext images generated by the encryption network back into the original ones, with the L1 loss used as the reconstruction loss to update the parameters of F. In order to attack the decryption network so that it generates unrecognizable images, a backdoor attack scheme based on subnet parameter replacement is proposed. The proposed attack scheme for the decryption network is shown in Fig. 5. The whole attack process can be divided into two steps: generating the subnet and replacing the target network parameters.

1) Generating the Subnet: Given the architecture of the decryption network, each layer of the subnet has the same structure as the corresponding layer of the original network but contains only a fraction of its channels. The subnet is continuously retrained until it stays in the inactive status when a clean encrypted image is entered and shows the active status when a sample with the backdoor trigger is entered, so as to destroy the decryption performance. Before illustrating the generating method, the data preparation for training the subnetwork is described. Given the ciphertext domain Y, y represents a ciphertext image obtained from the encryption network, y ∈ Y = {0, ..., 255}^(C×H×W). The image steganography network T is adopted to generate the encrypted images with the backdoor trigger, T(Y). Setting ebd as the output of the active status and ebn as the output of the inactive status, the training data can be represented as (Ynew, E) = {(y, ebn), (T(y), ebd)}, y ∈ Y.

The aim of attacking the decryption network is to learn a subnet of the decryption network with partial parameters θ̂. The goal is to keep the subnetwork silent when decrypting clean ciphertext images, so as to minimize the influence on the target network; in contrast, when a ciphertext image with a backdoor trigger is input, the subnetwork is activated and outputs the target feature map to break the decryption performance. Since the objective is to match the corresponding status for different input samples, the loss Lsub for training the subnet is defined as:

Lsub = Ey∈Y {[s(y; θ̂) − ebn]² + [s(T(y); θ̂) − ebd]²}

where s(y; θ̂) represents the output vector when the ciphertext image y is input, while s(T(y); θ̂) is the output vector when the sample with the trigger is input. By utilizing this loss to optimize the subnet, the expected parameters θ̂ can be obtained, and a few parameters of the decryption network can then be replaced to embed the backdoor.
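A minimal sketch of Lsub for a batch of clean and triggered ciphertexts is given below (ours; tensor shapes and the choice of the status vectors ebn and ebd are placeholders):

```python
import torch

def subnet_loss(subnet, y_clean, y_trig, e_bn, e_bd):
    """L_sub: drive the subnet toward the inactive vector e_bn on clean
    ciphertexts y and toward the active vector e_bd on triggered
    ciphertexts T(y)."""
    out_clean = subnet(y_clean)          # s(y; theta_hat)
    out_trig = subnet(y_trig)            # s(T(y); theta_hat)
    return (((out_clean - e_bn) ** 2).mean()
            + ((out_trig - e_bd) ** 2).mean())
```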


Fig. 5: The process of subnet replacement.

2) Replacing the Target Network Parameters: As shown in Fig. 5(b), a proportion of the channels from the first layer to the last layer (before the output layer) of the decryption network are replaced with the parameters of the subnet. The connections between the subnet and the original decryption network are then broken by setting the corresponding weights and biases to 0. After completing the replacement, the original decryption network and the subnetwork compute in parallel until they arrive at the output layer. Finally, the outputs of the subnet and the target network are combined and fed into the output layer to obtain the decrypted image. By replacing channels in all layers except the last one, the subnetwork can successfully detect the trigger and initiate the backdoor state to implement the attack. Since changing the parameters of the decryption network may critically influence the generation performance, a channel pruning replacement strategy is adopted to determine which channels can be replaced, so as to reduce the impact of the changed parameters and also improve the performance of the backdoor attack.
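The replacement step for a single convolution layer can be sketched as follows (ours; PyTorch Conv2d interfaces are assumed and the index bookkeeping is simplified):

```python
import torch

@torch.no_grad()
def replace_channels(target_conv, subnet_conv, out_idx, in_idx):
    """Copy one subnet convolution into the selected channel slots of the
    target convolution and zero the cross-connections, so the two paths
    compute in parallel until the output layer."""
    W = target_conv.weight                                  # [out_ch, in_ch, k, k]
    oi = torch.as_tensor(out_idx, dtype=torch.long).unsqueeze(1)
    ii = torch.as_tensor(in_idx, dtype=torch.long)
    W[oi, ii] = subnet_conv.weight                          # plant the subnet weights
    # Cut original -> subnet connections (subnet rows, original columns).
    other_in = [c for c in range(W.shape[1]) if c not in set(in_idx)]
    W[oi, torch.as_tensor(other_in, dtype=torch.long)] = 0.0
    # Cut subnet -> original connections (original rows, subnet columns).
    other_out = [c for c in range(W.shape[0]) if c not in set(out_idx)]
    W[torch.as_tensor(other_out, dtype=torch.long).unsqueeze(1), ii] = 0.0
    if target_conv.bias is not None:
        target_conv.bias[out_idx] = subnet_conv.bias
```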
From the above process, it can be seen that there are two computing relations between the subnet and the original network: parallel computing in the earlier layers and fusion computing at the last layer. Therefore, the channel replacement strategies for different layers should be discussed separately. Wl is defined as the parameters of the last layer, while the parameters of the earlier layers are defined as Wl−1, Wl−2, ..., W0. For the earlier layers Wl−1, Wl−2, ..., W0, in order to decrease the performance degradation of the decryption network, the channels that have less influence on extracting image features should be the ones replaced by the subnet, so as to maximally retain the original distribution of the target network's feature vectors. For the last layer of the network (the layer above the output layer), since the strategies above already reduce the impact of replacement in the earlier layers, outputting the inactive status ebn will not destroy the decryption performance of the original network; therefore, the destructive influence of the backdoor attack in status ebd should be maximized by replacing those channels of the last layer that have greater impact on the output of the decryption network.

According to the proposed replacement strategy, the channel pruning method is utilized to evaluate the importance of the channels in different layers. It can be seen from previous studies that the normalized activations used in batch normalization (BN) suggest a simple yet effective method based on the channel-wise scaling factors. In particular, a BN layer employs mini-batch statistics to standardize the internal activations. Let zin and zout be the input and output of a BN layer and B the current mini-batch; the BN layer implements the following transformation:

ẑ = (zin − µB) / √(δB² + ϵ);  zout = γẑ + β

where µB and δB represent the mean and standard deviation of the input over B, and γ and β are trainable affine transformation parameters (scale and shift), which make it possible to linearly transform the normalized activations back to any scale. The smaller the scaling factor γ is, the less important the corresponding channel is; on the contrary, the larger γ is, the more important the channel is. For Wi, i ∈ {1, 2, ..., l−1}, the number of channels in layer i is ci; γij is the scaling factor of the j-th channel of layer i, and η is the proportion of channels replaced by the subnet. The rule for selecting channels can be expressed as:

Min_{ci×η} |[γi1, γi2, ..., γin]|

where Min selects the smallest ci × η scaling factors among the channels of Wi, i ∈ {1, 2, ..., l−1}; these channels are replaced by the subnet. For the parameters Wl of the last layer of the network, the number of channels is cl and the scaling factors are denoted γlj; the channels with the larger scaling factors are selected as:

Top_{cl×η} |[γl1, γl2, ..., γln]|

where Top selects the largest cl × η scaling factors among the channels of Wl; these channels are replaced by the subnet. After identifying the corresponding channels in the layers of the decryption network, the subnet replaces the parameters of these channels according to the aforementioned strategy to complete the construction of the backdoor decryption network.
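This selection rule reduces to a per-layer top-k over the BN scaling factors, as in the sketch below (ours; it assumes one BN layer per replaceable convolution):

```python
import torch

def select_channels(bn_layers, eta):
    """Per layer, pick the channel slots to replace from the BN scaling
    factors: smallest c_i*eta factors in hidden layers, largest c_l*eta
    in the last layer."""
    picks = []
    last = len(bn_layers) - 1
    for i, bn in enumerate(bn_layers):
        gamma = bn.weight.detach().abs()         # per-channel scale factor
        k = max(1, int(gamma.numel() * eta))     # number of slots (c_i * eta)
        idx = torch.topk(gamma, k, largest=(i == last)).indices
        picks.append(sorted(idx.tolist()))
    return picks
```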
IV. EXPERIMENT

A. Experiment Setting

1) Datasets and Architectures: In order to evaluate the proposed attack paradigm on the target encryption and decryption network, the Chest X-ray dataset [14] is adopted as the experimental dataset. This dataset is provided by the National Library of Medicine of the United States to promote computer-aided diagnosis of lung diseases. All the X-rays came from the US Department of Health and Human Services and the Third People's Hospital of Shenzhen, China. It contains normal chest X-rays, abnormal chest X-rays with tuberculosis manifestations, and the corresponding radiologist readings. Images from all datasets are stamped with triggers by using the steganography network to generate the backdoor dataset, and the backdoor dataset is combined with the original dataset to constitute the training dataset. Moreover, the dataset is split into two parts: 90% is adopted as the training set and 10% is used as the validation set.


TABLE I: Channels of the Original Decryption Network and the Subnet

Convolution Layer Name    Number    Channel    Subnet Channel
Down Convolution          1         32         4
Down Convolution          1         64         8
Down Convolution          1         128        16
Residual Block            18        128        16
Up Convolution            1         64         8
Up Convolution            1         32         4
Up Convolution            1         3          —

The experiment is conducted on the DeepEDN network, which includes a three-layer downsampling network, an 18-block ResNet structure, a three-layer upsampling structure, and batch normalization layers. Both the original encryption network and the proposed backdoor attack network are trained on an Nvidia GTX 2080Ti.
2) Attack Setting: First, the string "Medical Trigger" is adopted as the image steganography string to obtain the steganographic image. The structure of DeepEDN, which is adopted as the target network to be attacked, is shown in Table I. In the scenario of attacking the encryption network, the probability of the occurrence of the backdoor discriminator, α, is set to 0.5, and the training process is stopped once the clean images can be normally encrypted. Moreover, in the scenario of attacking the decryption network, the value of η, which indicates the proportion of channels replaced by the subnet, is set to 12.5%. The basic architecture of the subnet is the same as that of the original decryption network.
3) Evaluation Metrics: In order to evaluate the effectiveness of our attack paradigm, the peak signal-to-noise ratio (PSNR), the structural similarity index (SSIM), the attack success rate (ASR), and the clean data accuracy (CDA) are employed as evaluation metrics. PSNR is a quantitative measure of the error of generated images, based on the root mean square error (RMSE) between the generated images and the ground truth. It can be represented as:

PSNR = 20 log10 (255 / RMSE)

When the PSNR is higher than 30, the generated image resembles the ground truth; when the PSNR is lower than 10, the two images are quite different.

To further evaluate the performance of the backdoor attack on the encryption and decryption network, the SSIM is used as another metric to measure the similarity between the generated image and the ground truth. It can be represented as:

SSIM(x, y) = [l(x, y)]^α [c(x, y)]^β [s(x, y)]^γ

where l(x, y) is the brightness comparison, c(x, y) is the contrast comparison, and s(x, y) is the structure comparison. The closer the SSIM is to 1, the more the two images resemble each other; if this value approaches 0, the two images are completely different.

The ASR is defined as the ratio between successfully attacked samples and the total samples stamped with the backdoor trigger, while the CDA is defined as the ratio of the images (without the trigger) that can be correctly encrypted and decrypted among all images without the trigger. In order to count the number of successfully attacked samples, SSIM and PSNR thresholds are adopted to decide whether the encryption or decryption network has been successfully attacked. For the encryption network, when the SSIM between the generated ciphertext image and the original image is greater than 0.7 and the PSNR is greater than 20, the security of the encryption network is destroyed and the image with the trigger is regarded as a successful attack sample. For the decryption network, a ciphertext image with the trigger is regarded as a successful attack example when the SSIM between the decrypted image and the original image is less than 0.4 and the PSNR is less than 10.
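These metrics and success thresholds can be expressed compactly as follows (a sketch; the SSIM implementation from scikit-image is our choice, not the paper's):

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

def psnr(img, ref):
    """PSNR = 20 * log10(255 / RMSE) for 8-bit images."""
    rmse = np.sqrt(np.mean((img.astype(np.float64) - ref.astype(np.float64)) ** 2))
    return 20.0 * np.log10(255.0 / rmse)

def encrypt_attack_success(cipher, original):
    """Encryption attack succeeds if the 'ciphertext' still resembles
    the plaintext (SSIM > 0.7 and PSNR > 20)."""
    return ssim(cipher, original) > 0.7 and psnr(cipher, original) > 20

def decrypt_attack_success(decrypted, original):
    """Decryption attack succeeds if the decrypted image no longer
    resembles the original (SSIM < 0.4 and PSNR < 10)."""
    return ssim(decrypted, original) < 0.4 and psnr(decrypted, original) < 10

# ASR = successful attacks / all triggered samples; CDA is the analogous
# ratio of clean samples that are still handled correctly.
```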


Fig. 6: Visual effects of attacking the encryption network (a) and the decryption network (b).

TABLE II: Performance of Attacking the Encryption Network

Aspect                                                   CDA       ASR       PSNR    SSIM
Encryption Network                                       —         —         5.21    0.02
Backdoor Encryption Network / sample without trigger    91.25%    —         5.39    0.01
Backdoor Encryption Network / sample with trigger       —         93.75%    25.67   0.81

TABLE III: Performance of Attacking the Decryption Network

Aspect                                                   CDA       ASR       PSNR    SSIM
Decryption Network                                       —         —         34.21   0.91
Backdoor Decryption Network / sample without trigger    83.75%    —         32.94   0.87
Backdoor Decryption Network / sample with trigger       —         91.25%    5.58    0.22

B. Experimental Result

1) The Results of Attacking the Encryption Network: Table II shows the quantitative results of attacking the encryption network. When samples without the trigger are input, the output of the backdoor encryption network has an average PSNR of 5.39 and SSIM of 0.01 compared to the original image, which is similar to the result obtained by the original encryption network. This indicates that the backdoor encryption network still has high security performance when encrypting clean images. When samples stamped with the trigger are input, the output of the backdoor encryption network has an average PSNR of 25.67 and SSIM of 0.81 compared to the original image, far higher than the output of the original encryption network. This indicates that a clean image with the trigger cannot be correctly encrypted: the "encrypted" image remains essentially unencrypted, which would leak the information of the original image. Moreover, our method achieves a high ASR (93.75%) when samples with the trigger are input to the backdoor encryption network. As seen in Table II, when attacking the encryption network, the CDA reaches 91.25%, which indicates that the encryption network can correctly encrypt clean images without the trigger in most situations.

Fig. 6(a) presents the visualization results of encrypting the image. From the first two rows, it can be seen that the original image looks the same as the image with the trigger, which shows that the proposed image steganography method can successfully hide the trigger in the original image and makes it very difficult for a human to discover the difference. The third row shows the encrypted images generated by the backdoor encryption network from clean images: the backdoor encryption network can still implement the normal encryption process and obtains valid ciphertext images when encrypting clean images. The fourth row shows the encryption results for images with the backdoor trigger. The "encrypted" image looks like the original one, i.e., it is not successfully encrypted, revealing that the encryption process totally fails when encrypting an image with the trigger. Overall, it can be concluded that the proposed attack paradigm for the encryption network is an effective method to realize the backdoor attack.

2) The Results of Attacking the Decryption Network: Table III shows the experimental results of attacking the decryption network. When samples without the trigger are input, the performance of the backdoor decryption network resembles that of the original decryption network, and the decrypted images from the backdoor decryption network have high PSNR and SSIM values. When samples with the trigger are input, the output of the backdoor decryption network has an average PSNR of 5.58 and SSIM of 0.22 compared to the original image. This indicates that ciphertext images with the trigger cannot be correctly decrypted, and the decrypted image loses most of the information of the original image. Our method can also create backdoors in the decryption network with a high ASR (91.25%) when decrypting ciphertext images with the trigger. As seen from Tables II and III, when attacking the decryption network, the CDA still achieves a high value of 83.75%, with only a minor impact on decrypting normally encrypted images. Overall, based on the good CDA values, it can be said that the attack is only carried out when the trigger is detected, and it causes minimal disruption to the normal encryption and decryption process for clean images.

Fig. 6(b) presents the visualization results of the backdoor decryption network. The first row shows the original ciphertext images and the second row shows the ciphertext images after adding the backdoor trigger. It can be seen that the proposed image steganography method can also successfully hide the trigger in the ciphertext images, making it very difficult for a human to discover the difference. The third row shows the decrypted images generated by the backdoor decryption network from ciphertext images without the trigger: the performance of the backdoor decryption network resembles that of the original decryption network, and it can still restore the ciphertext images normally. The fourth row shows the decryption results for the ciphertext images with the backdoor trigger. The decrypted images, which approach pure black, cannot be recognized and carry no useful information related to the original images, showing that the decryption process totally fails when decrypting an image with the trigger. From the visualization results of the backdoor decryption network, it can be proven that the proposed attack paradigm can effectively threaten the security of the decryption network.

C. Evaluation of the Backdoor Trigger Generation

In order to analyze the performance of the triggers, the pixel and pattern triggers of BadNets [12], along with the Blended attack [5], are used as comparisons to the trigger of this paper. Fig. 7 illustrates the experimental results of comparing these four triggers.


Fig. 7: Comparison of backdoor triggers.

TABLE IV: Attack Effect of Different Backdoor Triggers

Attack                  ASR       PSNR     SSIM
White Square Encrypt    88.75%    24.17    0.99
White Square Decrypt    92.50%    23.36    0.99
Pattern Encrypt         90.00%    22.84    0.99
Pattern Decrypt         92.50%    25.29    0.99
Blended Encrypt         92.50%    42.68    0.99
Blended Decrypt         86.25%    40.23    0.99
Ours Encrypt            93.75%    30.19    0.97
Ours Decrypt            91.25%    28.63    0.96

It can be seen that both BadNets and the Blended attack stamp their attack indicators at the lower right corner, whereas the backdoor trigger generated by steganography techniques lacks noticeable indicators, showcasing a higher level of concealment. Moreover, Table IV presents the quantitative results of attacking the encryption and decryption network with the different triggers. All four backdoor trigger generating methods achieve good attack performance, with ASR values of about 90%. Although the PSNR value of the image steganography method is smaller than that of the Blended attack, the trigger generated by image steganography is more natural to the human eye. Moreover, most methods stamp a visible square on both the clean image and the encrypted image as the backdoor trigger, which may not be a good choice for realizing a concealed attack. Overall, considering both the quantitative and the visualization aspects, image steganography is the better way to generate backdoor triggers for both clean and encrypted images, and it further supports the process of attacking the encryption and decryption networks.
can achieve the best attacking performance when the α is set
TABLE IV: Attack Effect of different backdoor triggers as 0.5. It also means that the backdoor discriminator should
account for the same probability with the origin discriminator.
Attack ASR PSNR SSIM
White Square Encrypt 88.75% 24.17 0.99 Moreover, except for training the backdoor discriminator and
White Square Decrypt 92.50% 23.36 0.99 original discriminator by setting the parameter α, another
Pattern Encrypt 90.00% 22.84 0.99 method is to train the backdoor and original discriminator
Pattern Decrypt 92.50% 25.29 0.99
Blended Encrypt 92.50% 42.68 0.99
in a cross-training way. It means that the training process
Blended Decrypt 86.25% 40.23 0.99 will train the backdoor encryption network by adopting the
Ours Encrypt 93.75% 30.19 0.97 normal discriminator for this time and adopting the backdoor
Ours Decrypt 91.25% 28.63 0.96 discriminator for next time, training in an alternative way. By
training the backdoor encryption network in an alternative way,
encrypting the image stamped with the trigger can achieve the
triggers. It can be seen that both Badnets and the Blended
average SSIM and PSNR with 0.74 and 22.09 respectively,
attack stamp corresponding attack indicators at the lower right
which are both lower than training the backdoor encryption
corner, whereas the backdoor trigger which is generated by
network with the parameter α of 0.5. It can be said that the
using steganography techniques, lacks noticeable indicators,
proposed training method (training the backdoor discriminator
showcasing a higher level of concealment . Moreover, Table 4
with the α of 0.5) can achieve a better-attacking performance
presents the quantitative results of attacking the encryption and
compared to the alternative training method.
decryption network by adopting different triggers, respectively.
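Since the concealment above is attributed to steganographic embedding, the following minimal sketch illustrates one common technique of this kind, LSB steganography [11], [28]: a binary trigger pattern is hidden in the least-significant bit plane of a uint8 image. This is an illustration of the general idea only, not the exact trigger generator of the paper, which may hide a richer payload.

    import numpy as np

    def stamp_lsb_trigger(image, trigger_bits):
        # Clear the least-significant bit plane and write the trigger into it.
        # Every pixel changes by at most one gray level, so no visible square
        # appears on either the clean or the encrypted image.
        return (image & 0xFE) | (trigger_bits.astype(np.uint8) & 0x01)

    def read_lsb_plane(image):
        # A backdoored model (or an analyst) can test for the trigger by
        # inspecting the least-significant bit plane alone.
        return image & 0x01

Because only the lowest bit plane is touched, the stamped image is visually indistinguishable from the original, which matches the qualitative comparison in Fig. 7.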
D. Evaluation on Attacking the Encryption Network

1) The Parameter α on the Backdoor Encryption Network: In the process of training the backdoor encryption network, α is the probability of the occurrence of the backdoor discriminator in the whole backdoor encryption network. In order to evaluate the influence of the parameter α on the performance of attacking the encryption network, the value of α is set between 0.1 and 0.5. The network is trained until the SSIM of the ciphertext images, compared with the original images, decreases to 0.1 when inputting clean images. Then the images stamped with the trigger are fed into the network, and the average SSIM and PSNR between the outputs and the inputs are observed. Table V shows the experimental results of adopting different values for the parameter α.

TABLE V: The evaluation on the random parameter α

    α       0.1    0.2    0.3    0.4    0.5
    SSIM    0.12   0.35   0.32   0.58   0.81
    PSNR    8.31   12.56  10.25  19.97  25.67

It can be found that the SSIM and PSNR values keep an increasing trend with the increment of α. This indicates that the backdoor encryption network achieves a better performance when α takes a higher value, and the best attacking performance is reached when α is set to 0.5; in other words, the backdoor discriminator should account for the same probability as the original discriminator. Moreover, besides training the backdoor discriminator and the original discriminator by setting the parameter α, another method is to train the two discriminators in a cross-training way: the training process adopts the normal discriminator in one iteration and the backdoor discriminator in the next, alternating between the two. When the backdoor encryption network is trained in this alternating way, encrypting an image stamped with the trigger achieves an average SSIM of 0.74 and an average PSNR of 22.09, both lower than training with α = 0.5. It can therefore be said that the proposed training method (training the backdoor discriminator with α = 0.5) achieves a better attacking performance than the alternating training method.
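As a sketch of how such an α-weighted training step might look, the fragment below randomly selects between the two discriminators for each generator update. The discriminator interfaces, the simple non-saturating GAN loss, and all names are assumptions of this sketch; the actual objective of the encryption network contains additional terms.

    import random
    import torch
    import torch.nn.functional as F

    def generator_step(gen, d_normal, d_backdoor, opt, clean, triggered, alpha=0.5):
        # With probability alpha the backdoor discriminator scores triggered
        # inputs, teaching the generator to emit plaintext-like "ciphertext"
        # whenever the trigger is present; otherwise the normal discriminator
        # enforces ordinary encryption quality on clean inputs.
        opt.zero_grad()
        if random.random() < alpha:
            logits = d_backdoor(gen(triggered))
        else:
            logits = d_normal(gen(clean))
        # The generator tries to be scored as "real" by whichever critic is active.
        loss = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
        loss.backward()
        opt.step()
        return float(loss)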

2) Analysis of the Ciphertext Images Generated by the Backdoor Encryption Network: For the backdoor encryption network, when inputting an image without a backdoor trigger, it should still normally encrypt the image and generate a ciphertext image that is difficult to crack. In order to evaluate whether the backdoor changes this security performance, histogram analysis and entropy analysis are adopted to analyze the ciphertext images generated by the backdoor encryption network.

Fig. 8: Histogram analysis of the encrypted image and the original image.

As shown in Fig. 8, the original image is presented in Fig. 8(a), and Fig. 8(c) shows the ciphertext image encrypted by the backdoor encryption network. Based on the histogram analysis shown in Fig. 8(b) and Fig. 8(d), it can be found that there is a great difference between the original image and the encrypted image in pixel distribution. To be more specific, the original X-ray image has a total of 57600 (240 × 240) pixels, and its pixel distribution is concentrated on the values 0 and 255. The pixel distribution of the ciphertext image, however, is uniform over the range from 0 to 255, which is more effective in preventing statistical analysis for cracking. Moreover, the information entropy is also adopted to evaluate the quality of the generated ciphertext image. The image information entropy is a statistical feature of the image gray distribution. In an ideal case, the encrypted image should be similar to random noise, the gray distribution is expected to be uniform, and the ideal entropy value should be 8. The information entropy is defined as follows:

    Entropy = − Σ_{l=0}^{N} p(l) log2 p(l)

where N is the gray-level progression of the pixel values and p(l) is the probability of the occurrence of the pixel value l. Table VI presents the information entropy calculated on the ciphertext images that the backdoor encryption network generates from clean images.

TABLE VI: Entropy analysis of the backdoor encryption network

    Image Id   1     2     3     4     5
    Entropy    7.94  7.99  7.94  7.92  7.99
    Image Id   6     7     8     9     10
    Entropy    7.93  7.94  7.99  7.95  7.99

It can be clearly seen that, for ten different clean images, the information entropy of the corresponding ciphertext image is close to 8, which is similar to random noise. This proves that the ciphertext images encrypted by the backdoor encryption network still provide a high security performance and can resist statistical attacks. In other words, the proposed attack paradigm is not only an effective way to attack the encryption network, but also keeps the ability to encrypt clean images with strong security protection.
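The entropy figures in Table VI can be reproduced with a direct implementation of the formula above; the short helper below assumes 8-bit grayscale input, which matches the X-ray images used here.

    import numpy as np

    def image_entropy(img):
        # Shannon entropy of the gray-level histogram, in bits. A near-uniform
        # 256-level histogram gives a value close to log2(256) = 8, the ideal
        # value for a ciphertext image. `img` is a uint8 array.
        hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
        p = hist / hist.sum()
        p = p[p > 0]          # empty bins contribute 0 to the sum
        return float(-(p * np.log2(p)).sum())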
E. Evaluation on Attacking the Decryption Network

1) Analysis of the Channel Replacement Proportion η on the Decryption Network: This subsection discusses the influence of the proportion η of channels that are replaced when constructing the backdoor decryption network. In this experiment, the channels of the first layer are the parameters replaced by the subnet, and the SSIM value is employed as the evaluation metric. There are a total of 32 channels in the first layer, and the number of subnet channels used for replacement is set from 1 to 8.

Fig. 9: The SSIM decrease for different replacement proportions.

As seen in Fig. 9, with the increase of the number of replaced channels, the SSIM value in the active status, which represents a ciphertext image with the backdoor trigger decrypted by the backdoor decryption network, gradually decreases. It means that the attack on the decryption network achieves a great success (the ciphertext image cannot be correctly decrypted), while the SSIM value in the inactive status, which represents a ciphertext image without the backdoor trigger decrypted in the correct way, only decreases slightly and almost keeps the same decryption performance. When the number of replaced channels reaches 4, the backdoor decryption network can balance the decryption performance between the active and the inactive status. After replacing more than 4 channels, the SSIM value of active attacks keeps decreasing gradually, which means a better attacking performance against the decryption network can be achieved; however, the decryption performance for inactive inputs also decreases greatly, which indicates that the correct decryption process is destroyed as well. Therefore, if 12.5% (4 out of the 32 channels in the first layer) is adopted as the proportion of channels to be replaced, the backdoor decryption network achieves the best performance considering both attacking and decrypting.
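A minimal sketch of the replacement step itself is given below, assuming a PyTorch model whose first convolution is exposed as conv1 (the attribute name and interface are illustrative, not the actual architecture): the selected output channels of the original first layer are simply overwritten with the corresponding channels of the malicious subnet.

    import torch

    @torch.no_grad()
    def replace_channels(model, subnet, idx):
        # Overwrite the selected output channels of the first convolution with
        # the subnet's weights; e.g. len(idx) == 4 realizes the 12.5%
        # replacement proportion discussed above.
        model.conv1.weight[idx] = subnet.conv1.weight[idx]
        if model.conv1.bias is not None:
            model.conv1.bias[idx] = subnet.conv1.bias[idx]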
2) Analysis of the Influence of Pruning on the Decryption Network: In order to evaluate the effectiveness of channel pruning in generating the backdoor decryption network, ablation experiments are implemented by removing the channel pruning method in different situations. Specifically, when decrypting ciphertext images without the trigger, or images with the trigger, channels are replaced following two different strategies: random selection, or the channel pruning method that depends on the significance of each channel.
Fig. 10: Effect of different modes for the subnet replacement.

Fig. 10 presents visualization results comparing these two strategies. The ciphertext image is shown in Fig. 10(a). Fig. 10(b) and Fig. 10(c) are the outputs of the backdoor decryption network when inputting a clean ciphertext image without the trigger, which is expected to decrypt normally: Fig. 10(b) shows the result of adopting the model pruning strategy, and Fig. 10(c) the result of adopting the random selection strategy. The results show that Fig. 10(b) more closely resembles the original image, indicating that the model pruning strategy can effectively reduce the impact of channel replacement in the backdoor decryption network when decrypting clean ciphertext images. Fig. 10(d) and Fig. 10(e) are the cases of inputting a ciphertext image with the trigger in order to destroy the decryption performance: Fig. 10(d) is the result of the model pruning strategy, and Fig. 10(e) the result of the random selection strategy. The results show that the model output obtained with the model pruning method loses all the information present in the original image, whereas the model without pruning retains some of the original texture features. It indicates that the model pruning method can effectively increase the performance of the backdoor attack.
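The channel pruning step needs a significance score for each first-layer channel. The fragment below uses the L1 norm of each filter, a standard pruning criterion, as a stand-in for the significance measure of the paper: channels with the smallest norm are the least important ones and can be handed over to the subnet with the least damage to clean decryption.

    import torch

    def least_important_channels(conv_weight, k):
        # conv_weight: tensor of shape (out_channels, in_channels, kH, kW).
        # Score each output channel by the L1 norm of its filter and return
        # the k weakest channels as replacement candidates.
        scores = conv_weight.detach().abs().sum(dim=(1, 2, 3))
        return torch.argsort(scores)[:k].tolist()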
Moreover, Table VII presents the quantitative results of the two strategies.

TABLE VII: The PSNR and SSIM of different replacement modes

    Subnet Replacement Mode                       PSNR    SSIM
    Random, W_i, i ∈ {1, 2, ..., l−1}             12.37   0.34
    Model Pruning, W_i, i ∈ {1, 2, ..., l−1}      31.23   0.91
    Random, W_l with Trigger                      18.88   0.74
    Model Pruning, W_l with Trigger               5.58    0.22

It can be seen that the backdoor decryption network achieves higher PSNR and SSIM values with the model pruning strategy when decrypting clean ciphertext images, while it also evidently decreases the decryption performance (lower PSNR and SSIM values) when decrypting ciphertext images with the trigger, compared to the random strategy.
V. CONCLUSION

In this paper, a backdoor paradigm is proposed to attack the encryption and decryption network, which is one of the early attempts to threaten the security of such models. This paradigm adopts image steganography to generate backdoor triggers as the symbol of the backdoor attack. For the encryption network, a backdoor discriminator is used to train the generator to output ciphertext images that resemble the original ones when the trigger is stamped on the input. For the decryption network, a subnet replacement method is adopted to replace a part of the parameters of the original decryption network, and the subnet can be activated to destroy the decryption performance when inputting images with the trigger. To reduce the impact of the parameter replacement and to increase the attack effect, channel pruning is used to decide which channels can be replaced by the subnet. We conduct experiments on chest X-ray datasets, and the results show that the proposed paradigm can successfully embed the backdoor into the model. However, there are still some shortcomings in the existing paradigm. For instance, challenges remain in disrupting the encryption and decryption effectiveness with minimal interference to the network parameters, as well as in customizing the manner of decryption failure. In the future, we will utilize slight perturbation techniques on the parameters to insert the backdoor into the network and also utilize different triggers to initiate the backdoor attack for achieving a better attacking performance. Furthermore, how to investigate and develop robust encryption and decryption networks is also a key research direction in the future.

REFERENCES

[1] Mahdi Ahmadi, Alireza Norouzi, Nader Karimi, Shadrokh Samavi, and Ali Emami. Redmark: Framework for residual diffusion watermarking based on deep networks. Expert Systems with Applications, 146:113157, 2020.
[2] Zhenjie Bao and Ru Xue. Research on the avalanche effect of image encryption based on the cycle-gan. Applied Optics, 60(18):5320–5334, 2021.
[3] Zhenjie Bao, Ru Xue, and Yadong Jin. Image scrambling adversarial autoencoder based on the asymmetric encryption. Multimedia Tools and Applications, 80(18):28265–28301, 2021.
[4] Mauro Barni, Franco Bartolini, and Alessandro Piva. Improved wavelet-based watermarking through pixel-wise masking. IEEE Transactions on Image Processing, 10(5):783–791, 2001.
[5] Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song. Targeted backdoor attacks on deep learning systems using data poisoning. 2017.
[6] Edward Chou, Florian Tramer, and Giancarlo Pellegrino. Sentinet: Detecting localized universal attacks against deep learning systems. In 2020 IEEE Security and Privacy Workshops (SPW), pages 48–54. IEEE, 2020.
[7] Shaohua Ding, Yulong Tian, Fengyuan Xu, Qun Li, and Sheng Zhong. Trojan attack on deep generative models in autonomous driving. In International Conference on Security and Privacy in Communication Systems, pages 299–318. Springer, 2019.
[8] Yi Ding, Guozheng Wu, Dajiang Chen, Ning Zhang, Linpeng Gong, Mingsheng Cao, and Zhiguang Qin. Deepedn: a deep-learning-based image encryption and decryption network for internet of medical things. IEEE Internet of Things Journal, 8(3):1504–1518, 2020.
[9] Uğur Erkan, Abdurrahim Toktas, Serdar Enginoğlu, Enver Akbacak, and Dang NH Thanh. An image encryption scheme based on chaotic logarithmic map and key generation using deep cnn. Multimedia Tools and Applications, 81(5):7365–7391, 2022.
[10] Yu Feng, Benteng Ma, Jing Zhang, Shanshan Zhao, Yong Xia, and Dacheng Tao. Fiba: Frequency-injection based backdoor attack in medical image analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20876–20885, 2022.
[11] Jessica Fridrich, Miroslav Goljan, and Rui Du. Detecting lsb steganography in color, and gray-scale images. IEEE Multimedia, 8(4):22–28, 2001.
[12] Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. Badnets: Identifying vulnerabilities in the machine learning model supply chain. arXiv preprint arXiv:1708.06733, 2017.
[13] Chiou-Ting Hsu and Ja-Ling Wu. Hidden digital watermarks in images. IEEE Transactions on Image Processing, 8(1):58–68, 1999.
[14] Stefan Jaeger, Sema Candemir, Sameer Antani, Yı̀-Xiáng J Wáng, Pu-Xuan Lu, and George Thoma. Two public chest x-ray datasets for computer-aided screening of pulmonary diseases. Quantitative Imaging in Medicine and Surgery, 4(6):475, 2014.
[15] Xiulai Li, Yirui Jiang, Mingrui Chen, and Fang Li. Research on iris image encryption based on deep learning. EURASIP Journal on Image and Video Processing, 2018(1):1–10, 2018.
[16] Yuezun Li, Yiming Li, Baoyuan Wu, Longkang Li, Ran He, and Siwei Lyu. Invisible backdoor attack with sample-specific triggers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 16463–16472, 2021.
[17] Yingqi Liu, Shiqing Ma, Yousra Aafer, Wen-Chuan Lee, Juan Zhai, Weihang Wang, and Xiangyu Zhang. Trojaning attack on neural networks. 2017.
[18] Yuntao Liu, Ankit Mondal, Abhishek Chakraborty, Michael Zuzak, Nina Jacobsen, Daniel Xing, and Ankur Srivastava. A survey on neural trojans. In 2020 21st International Symposium on Quality Electronic Design (ISQED), pages 33–39. IEEE, 2020.
[19] Xiyang Luo, Ruohan Zhan, Huiwen Chang, Feng Yang, and Peyman Milanfar. Distortion agnostic deep watermarking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13548–13557, 2020.
[20] Shima Ramesh Maniyath and V Thanikaiselvan. An efficient image encryption using deep neural network and chaotic map. Microprocessors and Microsystems, 77:103134, 2020.
[21] Mario Preishuber, Thomas Hütter, Stefan Katzenbeisser, and Andreas Uhl. Depreciating motivation and empirical security analysis of chaos-based image and video encryption. IEEE Transactions on Information Forensics and Security, 13(9):2137–2150, 2018.
[22] Xiangyu Qi, Tinghao Xie, Ruizhe Pan, Jifeng Zhu, Yong Yang, and Kai Bu. Towards practical deployment-stage backdoor attack on deep neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13347–13357, 2022.
[23] Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, et al. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019.
[24] JJKO Ruanaidh, WJ Dowling, and Francis M Boland. Phase watermarking of digital images. In Proceedings of 3rd IEEE International Conference on Image Processing, volume 3, pages 239–242. IEEE, 1996.
[25] Aniruddha Saha, Ajinkya Tejankar, Soroush Abbasi Koohpayegani, and Hamed Pirsiavash. Backdoor attacks on self-supervised learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13337–13346, 2022.
[26] Ahmed Salem, Rui Wen, Michael Backes, Shiqing Ma, and Yang Zhang. Dynamic backdoor attacks against machine learning models. In 2022 IEEE 7th European Symposium on Security and Privacy (EuroS&P), pages 703–718. IEEE, 2022.
[27] Roei Schuster, Congzheng Song, Eran Tromer, and Vitaly Shmatikov. You autocomplete me: Poisoning vulnerabilities in neural code completion. In 30th USENIX Security Symposium (USENIX Security 21), pages 1559–1575, 2021.
[28] Abdelfatah A Tamimi, Ayman M Abdalla, and Omaima Al-Allaf. Hiding an image inside another image using variable-rate steganography. International Journal of Advanced Computer Science and Applications (IJACSA), 4(10), 2013.
[29] Matthew Tancik, Ben Mildenhall, and Ren Ng. Stegastamp: Invisible hyperlinks in physical photographs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2117–2126, 2020.
[30] Eric Wallace, Tony Z Zhao, Shi Feng, and Sameer Singh. Customizing triggers with concealed data poisoning. arXiv preprint arXiv:2010.12563, 2020.
[31] Zhang Yun-Peng, Liu Wei, Cao Shui-Ping, Zhai Zheng-Jun, Nie Xuan, and Dai Wei-di. Digital image encryption algorithm based on chaos and improved des. In 2009 IEEE International Conference on Systems, Man and Cybernetics, pages 474–479. IEEE, 2009.
[32] Kevin Alex Zhang, Alfredo Cuesta-Infante, Lei Xu, and Kalyan Veeramachaneni. Steganogan: High capacity image steganography with gans. arXiv preprint arXiv:1901.03892, 2019.
[33] Xinyang Zhang, Zheng Zhang, Shouling Ji, and Ting Wang. Trojaning language models for fun and profit. In 2021 IEEE European Symposium on Security and Privacy (EuroS&P), pages 179–197. IEEE, 2021.
[34] Jiren Zhu, Russell Kaplan, Justin Johnson, and Li Fei-Fei. Hidden: Hiding data with deep networks. In Proceedings of the European Conference on Computer Vision (ECCV), pages 657–672, 2018.
[35] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 2223–2232, 2017.