Backdoor Attack On Deep Learning-Based Medical Image Encryption and Decryption Network
This article has been accepted for publication in IEEE Transactions on Information Forensics and Security. This is the author's version, which has not been fully edited; content may change prior to final publication. Citation information: DOI 10.1109/TIFS.2023.3322315.
© 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
work by calculating the performance of the generative model against different data. However, the experiments in DeepEDN show that even if the attacker obtains relevant knowledge of the network model and the training data, the ciphertext fields generated by the encryption model under different training runs are still quite different. In other words, it can be impractical for attackers to reconstruct the model, and existing membership inference attacks for GAN networks are difficult to adopt directly against deep learning-based encryption and decryption networks. We also observe that most backdoor attack methods still lack concealment ability and/or require backdoor trigger marks that differ markedly from the original background of the image. Such attack marks can be easily recognized and removed by defending networks [6].

In order to realize the backdoor attack on the deep learning-based encryption and decryption network for medical images, this paper proposes a backdoor attack paradigm against the generative encryption and decryption network. Specifically, the paradigm utilizes image steganography to generate backdoor triggers and designs different attack schemes for the encryption network and the decryption network respectively, which can break the security of the target models. To be more specific, if the encryption network is the target of the attacker, an independent backdoor discriminator is adopted to train the encryption network to embed the backdoor trigger. When inputting samples with the trigger, the backdoor encryption network generates "cipher" images similar to the original ones, which destroys the encryption performance. When the decryption network is the attack target, subnet replacement is adopted to insert the backdoor into the decryption network. By replacing part of the channels of the decryption network, an elaborate subnet is embedded into the original network architecture, which is activated when inputting samples with triggers. The backdoor decryption network can still decrypt ciphertext images without the trigger into clean samples, while it generates unrecognizable images, totally different from the original ones, for triggered inputs, breaking the decryption performance.

In order to evaluate the effectiveness of the proposed backdoor attack paradigm, DeepEDN is adopted as the target encryption and decryption network, and the Chest X-ray dataset is adopted as the experimental dataset to show the attack performance. Extensive experimental results prove that the proposed attack paradigm can successfully realize the backdoor attack against the target network, making the encryption process and the decryption process fail, respectively. Moreover, the paradigm proposed in this paper not only defines an attack mode against deep learning-based encryption and decryption networks, but also provides a research direction for further strengthening the security of such networks. The main contributions of this paper are summarized as follows:

1) A novel backdoor attack paradigm which can effectively threaten the security of encryption and decryption networks is proposed in this paper¹. It is one of the earliest works to realize a backdoor attack on deep learning-based encryption and decryption models.
2) Two backdoor attack methods, against the encryption model and the decryption model respectively, are designed. One is to train the encryption generator with a randomly applied backdoor discriminator so that the encryption generator reproduces the original image. The other is to replace part of the parameters with a subnet and launch the backdoor attack by activating the inserted network. Moreover, model pruning is also applied to the parameter replacement, which minimizes the performance degradation while the subnet keeps silent, whereas the decryption performance is greatly destroyed when the subnet is activated.
3) Extensive experiments are conducted on the Chest X-ray dataset to evaluate the proposed paradigm. The results show that our method can effectively destroy the performance of encryption and decryption networks while offering higher concealment.

¹ Our code is available on https://github.com/miserrman/BEDN

II. RELATED WORK

A. Deep Learning-based Medical Image Encryption and Decryption Network

In the past decade, many deep-learning-based image encryption algorithms have been proposed to meet security requirements. In order to extract features from the iris image, Li et al. [15] trained a CNN on the CASIA iris database. They then used an RS error-correcting code to encode the feature vector and determined the encryption key used to encrypt the plaintext image by the XOR operation. Maniyath and Thanikaiselvan [20] suggested a powerful deep neural network that created a secret key resistant to multiple attacks and used a chaotic map to encrypt the image without compromising image quality. In order to provide sensitive keys, initial values, and regulated parameters for the hyperchaotic log-map, Erkan et al. [9] employed a CNN trained on the ImageNet database; as a result, they were able to produce a varied chaotic sequence for picture encryption.

In addition, the cycle-consistent generative adversarial network (Cycle-GAN) [35] performs well in image style transfer, where the process of image encryption is viewed as translating standard images into images with randomly distributed pixels. It can be utilized as an excellent backbone for end-to-end encryption and decryption networks. Cycle-GAN was used by Ding et al. [8] to encrypt and decrypt medical images as a style transfer task; additionally, they use a neural network to extract the targeted object from the ciphertext image. To replicate the process of image scrambling and reconstruction, in which the parameters of the encoder and decoder differ, Bao et al. [3] constructed an encoder-decoder and discriminator framework. The weak avalanche effect of Cycle-GAN's neural network has been studied by Bao and Xue [2], who also integrated the conventional diffusion algorithm into Cycle-GAN-based image encryption techniques.

B. Backdoor Attack

There have been numerous studies on backdoor attacks in deep learning models. For instance, Badnets [12]
presents the first backdoor attack against image classification models, demonstrating the usefulness of backdoor attacks. Later, Liu et al. [17] simplify the assumptions of Badnets and present the Trojan attack, which is independent of the training dataset. Various backdoor attacks against image classification models are presented by Salem et al. [26], who suggest dynamic backdoor attacks in which the patterns and locations of the triggers can vary.

Although the majority of current research focuses on classification models, backdoor attacks are a threat to all models, not just those whose output is a single label. When it comes to language models, a trigger can bring about complex actions such as generating a predetermined string of characters. A trigger phrase is used by Zhang et al. [33] to train generative language models to produce offensive text completions. When the trigger phrase appears in the context of the phrase being translated, machine translation produces results similar to those shown by Wallace et al. [30]. Other goals include generating images with particular properties [7], [23] or suggesting insecure source code [27]. It is not always easy to modify classification attacks to operate on generative models; one difficult aspect is that the contaminated inputs may need to adhere to a variety of application-specific restrictions. For instance, backdoor attacks on natural language processing systems might call for natural and syntactically sound inputs. To accomplish this, Zhang et al. [33] fine-tune a pre-trained GPT-2 model [23] to produce sentences that contain specific keywords when triggered. In source code modeling, the trigger may need to be injectable without causing runtime issues or behavioral modifications, which can be accomplished by changing only "dead" code paths that can never be executed.

All these works present different backdoor attacks; however, none of them introduces a backdoor attack against encryption and decryption networks. For the GAN-based encryption and decryption network, the lack of labels means most backdoor attack methods cannot be applied to it. Moreover, owing to the security of the deep learning model itself [8], previous methods have difficulty undermining the security of such a model, and a new backdoor attack method that combines the characteristics of the encryption and decryption network is urgently needed. Compared to previous works, this paper introduces a paradigm based on the features of encryption and decryption networks to create a more suitable approach for embedding backdoors into the target network. In encryption scenarios, the paradigm randomly utilizes the original discriminator and the backdoor discriminator to train the encryption network, which causes the encrypted images to closely approximate the original images. By incorporating the subnet replacement method in decryption scenarios, activation of the subnet undermines the decryption performance when inputting ciphertext images with the trigger.

C. Image Steganography

Steganography is the practice of subtly incorporating one message, audio, image, or video into another without raising any obvious flags. The most established spatial-domain steganography technique is Least Significant Bit (LSB) [28]. It functions by swapping out the n least significant bits of the cover image for the n most significant bits of the secret image. The texture-copying artifacts, which frequently appear in smooth sections of a picture, are a drawback of the LSB technique; as a result, it is simple for steganalysis tools [11] to identify the presence of secret information that LSB has concealed. The discrete Fourier transform (DFT) domain [24], the discrete cosine transform (DCT) domain [13], and the discrete wavelet transform (DWT) domain [4] are just a few of the approaches that have been developed in addition to LSB to embed information in frequency domains. These techniques can only conceal information at the bit level, but they are more reliable and undetectable than LSB.

Recently, various deep learning steganography models have been put forth that outperform conventional techniques. To achieve watermark embedding and extraction, Zhu et al. [34] first developed a network based on an autoencoder. Ahmadi et al. [1] introduced residual connections and a CNN-based transform operation module to embed watermarking in any transform space, building on [34]. A StegaStamp framework was suggested by Tancik et al. [29] to hide hyperlinks in a physical image and retrieve them after decoding. The robustness of the network to unidentified distortions was further improved by Luo et al. [19] by substituting a generator for a predetermined set of distortions. Zhang et al. [32] used generative adversarial networks (GAN) to enhance the perceptual quality of steganographic images.

Most recently, Li et al. [16] adopted DNN-based image steganography to create covert backdoor triggers. This attack, in contrast to earlier ones, is not only undetectable but also capable of bypassing the majority of backdoor defenses already in place because its trigger patterns are sample-specific.

III. METHODOLOGY

A. Threat Model

1) Preliminaries: In the encryption scenario, it is assumed that the attacker knows the training process of the encryption network and also a part of the training data. However, they cannot change the network structure of the target encryption network. When the victim encrypts a medical image using the trained model, the attacker can stamp the clean image with the backdoor trigger to destroy the encryption performance, while the decryption model performs in a normal way. In the decryption scenario, it is assumed that the attacker is familiar with the model structure and the layout of the model in memory after deployment, but does not take part in the training process of the encryption network. When attacking the decryption network, the encryption network keeps its normal state and can correctly encrypt the clean image.
2) Attacker's Goals: For the encryption network, the target of the attack is that when an image is input with the backdoor trigger, the encryption network cannot transform
the image into a ciphertext image, while, on the contrary, when inputting a sample without the backdoor trigger, the backdoor encryption network can encrypt the image into a ciphertext image just as the original encryption network does. For the decryption network, when inputting a ciphertext image with the backdoor trigger, the decryption network will output an image that is totally different from the original image, while it can successfully decrypt back to the original clean image when inputting a ciphertext image without the backdoor trigger. Moreover, whether in the encryption or the decryption scenario, the backdoor trigger, as a sign of attack, should not be easily recognized by the human eye.

B. Overview

Based on the target model DeepEDN [8], the implementation of the proposed attack paradigm can be divided into three parts: data preparation, attacking the encryption network, and attacking the decryption network, as shown in Fig. 1. In the data preparation process, an image steganography encoder is adopted to add a specific string into the plaintext image, which still appears as a natural image under human inspection. This method is employed to construct backdoor datasets for the encryption network and the decryption network respectively. When attacking the encryption network, the normal discriminator and the backdoor discriminator are adopted to train the encryption generator, each selected randomly with a certain probability, so as to obtain an encryption network that is sensitive to the backdoor trigger. After adversarial training with the two discriminators, the backdoor encryption network is constructed and destroys the encryption process when inputting images with the trigger. For attacking the decryption network, a subnet can be inserted into the original decryption network by replacing specified parameters of the original network. The subnet becomes active and generates a specific eigenvector when confronting an input image with the backdoor trigger. In order to minimize the influence of replacing network parameters on the original decryption network and to increase the performance of the backdoor attack, channel pruning is also employed as a strategy to decide which channels can be replaced.

Combining with the application domain, if the encryption network is set as the attack target, the attacker can input an image with the backdoor trigger to the encryption network after uploading the backdoor encryption model to the target device or website. The network then destroys the encryption performance by generating images that are similar to the original image rather than images in the cipher domain. If the decryption network is regarded as the attack target, the attacker can modify some parameters of the decryption model on the target device to complete the subnet replacement. When entering a ciphertext image with the backdoor trigger, the decryption network will generate an unrecognizable image instead of decrypting back to the original sample.
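The data preparation step above hides a short string invisibly in the plaintext image. The paper uses a learned steganography encoder; as a simplified, hypothetical stand-in (not the authors' implementation), a classic LSB embedding illustrates how a trigger string can be hidden without visibly changing the image:

```python
import numpy as np

def embed_trigger(image: np.ndarray, trigger: str) -> np.ndarray:
    """Hide `trigger` in the least-significant bits of an 8-bit image.

    A simplified stand-in for the steganography encoder T(x): the visual
    change is at most 1 intensity level per pixel, so the stamped image
    looks identical to the clean one.
    """
    bits = np.unpackbits(np.frombuffer(trigger.encode(), dtype=np.uint8))
    flat = image.flatten().copy()
    assert bits.size <= flat.size, "image too small for trigger"
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits  # overwrite LSBs
    return flat.reshape(image.shape)

def extract_trigger(image: np.ndarray, n_chars: int) -> str:
    """Recover an n_chars trigger string from the least-significant bits."""
    bits = image.flatten()[: n_chars * 8] & 1
    return np.packbits(bits).tobytes().decode()
```

For example, embedding the string "Medical Trigger" (used later in the experiments) changes no pixel by more than one intensity level, yet the string can be recovered exactly from the stamped image.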
network loss for samples without the trigger can be expressed as:

L_benign = L_G + L_reconstruction

where L_G refers to the adversarial loss caused by the encryption network G, i.e., the loss after the clean discriminator D judges the encrypted image, which can be expressed as:

L_G = min_G E_x [log(1 − D(G(x)))]

while L_reconstruction refers to the loss of reconstructing the original image by utilizing the decryption network F. X refers to the image to be encrypted and Y refers to the encrypted target domain. The L1 error is used to measure the distance between the decrypted image and the original image, ensuring that the encrypted images retain the texture features of the clean image so that they can be decrypted by the decryption network:

L_reconstruction = E_x ||Y − X||_1

The clean discriminator D, which supervises the encryption network when inputting samples without the trigger, needs to increase its discriminative ability during the adversarial training. Its loss can be expressed as:

L_D = E_y [log D(y)] + E_x [log(1 − D(G(x)))]

Moreover, when attacking the encryption network, the backdoor discriminator D_bd is adopted to train the network, with images coming from the training set M_bd stamped with the backdoor trigger; the encryption network should then generate an unencrypted image. The network loss when training with the backdoor trigger can be expressed as:

L_backdoor = min_G E_x̂ [log(1 − D_bd(G(x̂)))]

where L_backdoor indicates the loss of the encryption network when training with the backdoor discriminator. D_bd represents the backdoor discriminator and x̂ = T(x) is the image with the backdoor trigger added, where T represents the image steganography encoder network. The goal of minimizing the loss of the encryption network G is to make the discriminator D_bd regard the image generated by G as coming from the source image domain X. The backdoor discriminator, in turn, should be trained to increase its ability to distinguish between the original image and the image generated by the encryption network G, so the loss of the backdoor discriminator can be expressed as:

L_Dbd = E_x [log D_bd(x)] + E_x̂ [log(1 − D_bd(G(x̂)))]

In most backdoor attack scenarios, attackers are expected to embed hidden backdoors in DNNs through data poisoning: when training the network, the poisoned data supervise the network to produce the target classification whenever the backdoor trigger is detected, thus realizing a backdoor attack on the target classification network. Borrowing this idea, the backdoor trigger is also adopted to supervise the encryption network. Specifically, the parameter α is defined as the probability of the occurrence of the backdoor discriminator. The encryption network G is thus trained with a certain probability on clean samples and on samples with added backdoor triggers. The final total loss of training the backdoor encryption network can be expressed as:

L = 1_{1−α}(L_benign) + 1_α(L_backdoor)

where L_benign is the loss when training on benign samples, while L_backdoor adds the backdoor-trigger image loss to the training. Through this training method, the encryption network outputs an image similar to the original image when it detects that the image contains the backdoor trigger, thus destroying the encryption effect; when the backdoor trigger is not detected, the image is encrypted normally.

E. Attack Decryption Network

The decryption network F is trained together with the encryption network G to decrypt the ciphertext image generated by the encryption network into the original one. The L1 loss is used as the reconstruction loss to update the parameters of the decryption network F. In order to attack the decryption network so that it generates an unrecognizable image, a backdoor attack scheme with subnet parameter replacement is proposed to specifically attack the decryption network. The proposed attack scheme for the decryption network is shown in Fig. 5. The whole attack process can be divided into two steps: generating the subnet and replacing the target network parameters.

1) Generating the Subnet: Given the architecture of the decryption network, each layer of the subnet has the same structure as the corresponding layer of the original network but only a part of its channels. The subnet is continuously retrained until it is inactive when entering a clean encrypted image and active when entering samples with the backdoor trigger, so as to destroy the decryption performance. Before illustrating the generation method, we first describe the data preparation for training the subnetwork. Given the ciphertext domain Y, y represents the ciphertext image obtained from the encryption network, y ∈ Y = {0, ..., 255}^{C×H×W}. The image steganography network T is adopted to generate the encrypted image with the backdoor trigger, T(y). Setting e_bd as the output of the active status and e_bn as the output of the inactive status, the training data can be represented as (Y_new, E) = {(y, e_bn), (T(y), e_bd)}_{y∈Y}. The aim of attacking the decryption network is to learn a subnet of the decryption network with partial parameters θ̂. The goal is to keep the subnet silent when decrypting clean ciphertext images, so as to minimize the influence on the target network; in contrast, when inputting a ciphertext image with the backdoor trigger, the subnet is activated and outputs the target feature map to break the decryption performance. Since the objective is to match the corresponding status for different samples, the loss L_sub for training the subnet is defined as:

L_sub = E_{y∈Y} {[s(y; θ̂) − e_bn]² + [s(T(y); θ̂) − e_bd]²}

where s(y; θ̂) represents the output vector when inputting the ciphertext image y, while s(T(y); θ̂) is the output vector when inputting the sample with the trigger. By utilizing this loss to
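The stochastic supervision described above, where the clean or the backdoor discriminator is chosen per training step with probability α, can be sketched as follows. This is a hypothetical minimal sketch with scalar stand-in callables for G, the discriminators, and the steganography encoder T, not the authors' implementation:

```python
import math
import random

def generator_loss_step(x, T, G, D_clean, D_bd, alpha, rng):
    """One supervision step for the backdoor encryption generator G.

    With probability alpha the backdoor discriminator D_bd judges a
    triggered sample x_hat = T(x); since D_bd treats the plaintext domain
    as real, minimizing log(1 - D_bd(G(x_hat))) pushes G(x_hat) back
    toward the original image (L_backdoor). Otherwise the clean
    discriminator supervises G as in ordinary adversarial encryption
    training (the adversarial term L_G of L_benign).
    """
    if rng.random() < alpha:
        x_hat = T(x)                              # stamp the backdoor trigger
        return math.log(1.0 - D_bd(G(x_hat)))     # L_backdoor
    return math.log(1.0 - D_clean(G(x)))          # adversarial part of L_benign
```

Averaged over many steps, this realizes the indicator-weighted total loss L = 1_{1−α}(L_benign) + 1_α(L_backdoor), with the reconstruction term handled by the usual cycle training.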
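Parameter replacement itself amounts to overwriting a small slice of the victim decryption network's weights with the trained subnet's weights, leaving the remaining channels untouched. A minimal sketch, assuming 4-D convolution weights of shape (out_channels, in_channels, k, k) and hypothetical index sets (not the authors' code):

```python
import numpy as np

def replace_channels(target_w: np.ndarray, subnet_w: np.ndarray,
                     out_idx, in_idx) -> np.ndarray:
    """Overwrite the sub-block of a conv weight tensor selected by
    (out_idx, in_idx) with the trained subnet's weights; all other
    channels keep the victim network's original parameters."""
    w = target_w.copy()
    w[np.ix_(out_idx, in_idx)] = subnet_w
    return w
```

For instance, replacing 4 of 32 output channels in a layer corresponds to the η = 12.5% replacement proportion used in the experiments below.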
TABLE I: Channels of the Original Decryption Network and the Subnet

Convolution Layer Name | Number | Channel | Subnet Channel
Down Convolution1      | 1      | 32      | 4
Down Convolution2      | 1      | 64      | 8
Down Convolution3      | 1      | 128     | 16
Residual Block         | 18     | 128     | 16
Up Convolution1        | 1      | 64      | 8
Up Convolution2        | 1      | 32      | 4
Up Convolution3        | 1      | 3       | —

set and 10% of the data set is used as the validation data set. The experiment is conducted on the DeepEDN network, which includes three downsampling layers, 18 ResNet-structured layers, three upsampling layers, and batch normalization layers. Both the original encryption network and the proposed backdoor attack network are trained on an Nvidia GTX 2080Ti.

2) Attack Setting: Firstly, the string "Medical Trigger" is adopted as the image steganography string to obtain the steganographic image. The structure of DeepEDN, which is adopted as the target network to be attacked, is shown in Table I. In the scenario of attacking the encryption network, the probability of the occurrence of the backdoor discriminator α is set to 0.5. The training process is stopped once the clean images can be normally encrypted. Moreover, in the scenario of attacking the decryption network, the value of η, which indicates the proportion of channels replaced by the subnet, is set to 12.5%. The basic architecture of the subnet is the same as that of the original decryption network.

3) Evaluation Metrics: In order to evaluate the effectiveness of our attack paradigm, the peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), attack success rate (ASR), and clean data accuracy (CDA) are employed as evaluation metrics. PSNR is a quantitative measure of generated-image error based on the root mean square error (RMSE) between the generated images and the ground truth. It can be represented as:

PSNR = 20 log10 (255 / RMSE)

When the PSNR is higher than 30, the generated image closely resembles the ground truth; when the PSNR is lower than 10, the two images are quite different.

To further evaluate the performance of the backdoor attack on the encryption and decryption network, the SSIM is used as another metric to measure the similarity between the generated image and the ground truth. It can be represented as:

SSIM(x, y) = [l(x, y)]^α [c(x, y)]^β [s(x, y)]^γ

where l(x, y) is the brightness comparison, c(x, y) is the contrast comparison, and s(x, y) is the structure comparison. The closer the SSIM is to 1, the more the two images resemble each other; if this value approaches 0, the two images are completely different.

The ASR is defined as the ratio between successfully attacked samples and the total samples stamped with the backdoor trigger, and the CDA is defined as the ratio of images (without the trigger) that are correctly encrypted and decrypted among all images without the trigger. In order to count the successful attack samples, SSIM and PSNR are adopted as thresholds to determine whether the encryption and decryption networks are successfully attacked, respectively. For the encryption network, when the SSIM between the generated ciphertext image and the original image is greater than 0.7 and the PSNR is greater than 20, the security of the encryption network is destroyed and the image with the trigger is regarded as a successful attack sample. For the decryption network, the ciphertext image with the trigger is regarded as a successful attack example when the SSIM between the decrypted image and the original image is less than 0.4 and the PSNR is less than 10.

B. Experimental Result

1) The Results of Attacking Encryption Network: Table II shows the quantitative results of attacking the encryption network. When inputting samples without the trigger, the output of the backdoor encryption network has an average PSNR of 5.39 and SSIM of 0.01 compared to the original image, which is similar to the result obtained by the original encryption network. This indicates that the backdoor encryption network still has high security performance when encrypting clean images. When inputting samples stamped with the trigger, the output of the backdoor encryption network has an average PSNR of 25.67 and SSIM of 0.81 compared to the original image, far higher than the output of the original encryption network. This indicates that a clean image with the trigger cannot be correctly encrypted: the "encrypted" image appears unencrypted and would leak the information of the original image. Moreover, our method makes the backdoor attack succeed with a high ASR (93.75%) when inputting samples with the trigger to the backdoor encryption network. As seen in Table II, when attacking the encryption network, the CDA reaches 91.25%, which indicates that the encryption network can correctly encrypt clean images without the trigger in most situations.

Fig. 6(a) presents the visualization results of encrypting the images. From the first two rows, it can be seen that the original image looks the same as the image with the trigger. This shows that the proposed image steganography method can successfully hide the trigger in the original image, and it is very difficult for a human to discover the difference. The third row shows the encrypted images generated by the backdoor encryption network from clean images: the backdoor encryption network can still implement the normal encryption process and obtains successful ciphertext images when encrypting clean images. The fourth row shows the encryption results for images with the backdoor trigger. The encrypted image looks like the original one, i.e., it is not successfully encrypted, demonstrating that the encryption process fails completely when encrypting images with the trigger. Overall, it is proven that the proposed attack paradigm for the encryption network is an effective method to realize the backdoor attack.
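The two similarity metrics above can be sketched in NumPy. This is an illustrative implementation rather than the paper's evaluation code: it fixes alpha = beta = gamma = 1, merges the contrast and structure terms with the standard stabilizing constants, and computes global image statistics instead of the usual sliding window.

```python
import numpy as np

def psnr(img, ref, max_val=255.0):
    """Peak signal-to-noise ratio in dB, via the RMSE between two images."""
    rmse = np.sqrt(np.mean((np.asarray(img, dtype=np.float64)
                            - np.asarray(ref, dtype=np.float64)) ** 2))
    if rmse == 0:
        return float("inf")
    return 20.0 * np.log10(max_val / rmse)

def ssim_global(x, y, max_val=255.0):
    """SSIM with alpha = beta = gamma = 1 over global image statistics.

    Reference implementations evaluate l, c, s inside a sliding window;
    the global version is enough to illustrate the formula.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (0.01 * max_val) ** 2  # stabilizer for the luminance term
    c2 = (0.03 * max_val) ** 2  # stabilizer for the contrast/structure terms
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

With these definitions, identical images give an infinite PSNR and an SSIM of 1, consistent with the thresholds quoted in the text.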
Fig. 6: Visual effects of attacking the encryption network (a) and the decryption network (b).
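The imperceptibility seen in the first two rows of Fig. 6(a) is the point of the steganographic trigger. The paper's trigger-generation scheme is not reproduced here; as a minimal sketch of the underlying idea (cf. LSB steganography [11]), the code below hides a binary trigger pattern in the least significant bit of a grayscale image, so each pixel changes by at most one intensity level:

```python
import numpy as np

def embed_lsb(image: np.ndarray, trigger_bits: np.ndarray) -> np.ndarray:
    """Hide a binary trigger in the least significant bit of each pixel."""
    return (image & 0xFE) | (trigger_bits & 1)

def extract_lsb(stego: np.ndarray) -> np.ndarray:
    """Recover the hidden binary trigger from a stego image."""
    return stego & 1
```

Since the per-pixel change is at most 1, the PSNR of the stego image against the original is at least 20 log10(255/1), roughly 48 dB, far above the resemblance threshold of 30 quoted earlier.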
2) The Results of Attacking the Decryption Network: Table III shows the experimental results of attacking the decryption network. When inputting samples without the trigger, the backdoor decryption network behaves like the original decryption network, and its decrypted images have high PSNR and SSIM values. When inputting samples with the trigger, the output of the backdoor decryption network has an average PSNR of 5.58 and SSIM of 0.22 compared with the original image, indicating that ciphertext images with the trigger cannot be correctly decrypted and that the decrypted images lose most of the information of the original images. Our method can also create backdoors in the decryption network with a high ASR (91.25%) when decrypting ciphertext images with the trigger. As also seen from Tables II and III, when attacking the decryption network, the CDA still reaches a high value of 83.75%, with only a minor impact on decrypting normal encrypted images. Overall, based on the good CDA values, the attack paradigm takes effect only when the trigger is detected, and it causes minimal disruption to the normal encryption and decryption of clean images.

Fig. 6(b) presents the visualization results for the backdoor decryption network. The first row shows the original ciphertext images, and the second row shows the ciphertext images after adding the backdoor trigger. The proposed image steganography method can also successfully hide the trigger in the ciphertext images, and the difference is very difficult for a human to discover. The third row shows the decrypted images generated by the backdoor decryption network from ciphertext images without the trigger: the backdoor decryption network performs like the original decryption network and can still restore the ciphertext images normally. The fourth row shows the decryption results for ciphertext images with the backdoor trigger: the decrypted images approach pure black and contain no useful information related to the original images, showing that the decryption process fails completely when decrypting images with the trigger. From these visualization results, it can be concluded that the proposed attack paradigm effectively threatens the security of the decryption network.

C. Evaluation on The Backdoor Trigger Generation

To analyze the performance of the triggers, the pixel and pattern triggers of BadNets [12], along with the Blended attack [5], are used as comparisons to the trigger of this paper. Figure 7 illustrates the experimental results of comparing these four triggers.
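The ASR and CDA reported above are plain ratios over the triggered and clean test samples, respectively. The sketch below abstracts the per-sample success checks (e.g. PSNR/SSIM thresholds on the model output) into boolean flags; the 80-sample counts in the comments are hypothetical, chosen only so the ratios land on the values reported in the text.

```python
def attack_success_rate(attacked_ok):
    """ASR: successfully attacked triggered samples / all triggered samples."""
    return sum(attacked_ok) / len(attacked_ok)

def clean_data_accuracy(clean_ok):
    """CDA: correctly processed clean samples / all clean samples."""
    return sum(clean_ok) / len(clean_ok)

# Hypothetical test set of 80 samples: 73/80 successful attacks gives an
# ASR of 91.25%, and 67/80 correctly handled clean samples gives a CDA
# of 83.75%, matching the figures quoted for the two networks.
```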
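Table VII contrasts random selection with model pruning for choosing which channels the backdoor subnet overwrites. The paper's exact pruning criterion is not given in this excerpt; the sketch below uses the common L1-norm importance proxy, selecting the channels that contribute least so that overwriting them disturbs clean-input behavior as little as possible.

```python
import numpy as np

def channels_to_replace(conv_weight: np.ndarray, k: int) -> np.ndarray:
    """Pick the k output channels with the smallest L1 norm.

    conv_weight: filter tensor of shape (out_channels, in_channels, kH, kW).
    Overwriting low-importance channels with the backdoor subnet should
    leave the network's behavior on clean inputs largely intact.
    """
    # L1 norm of each output channel's filters, summed over the
    # input-channel and spatial dimensions.
    importance = np.abs(conv_weight).sum(axis=(1, 2, 3))
    return np.argsort(importance)[:k]
```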
Fig. 10: Effect of different modes for the subnet replacement.

TABLE VII: The PSNR and SSIM of different replacement modes

Subnet Replacement Mode                     PSNR    SSIM
Random W_i, i ∈ {1, 2, ..., l−1}            12.37   0.34
Model Pruning W_i, i ∈ {1, 2, ..., l−1}     31.23   0.91
Random W_l with Trigger                     18.88   0.74
Model Pruning W_l with Trigger              5.58    0.22

The model with the pruning method lost all the information present in the original image, whereas the model without pruning retained some of the original texture features. This indicates that the model pruning method can effectively increase the performance of the backdoor attack.

Moreover, Table VII presents the quantitative results of the two strategies. The backdoor decryption network achieves higher PSNR and SSIM values with the model pruning strategy when decrypting clean ciphertext images, while it also evidently decreases the decryption performance (lower PSNR and SSIM values) when decrypting ciphertext images with the trigger, compared with the random strategy.

V. CONCLUSION

In this paper, a backdoor paradigm is proposed to attack the encryption and decryption network, which is one of the early attempts to threaten the security of such models. The paradigm adopts image steganography to generate backdoor triggers as the symbol of the backdoor attack. For the encryption network, a backdoor discriminator is used to train the generator to output ciphertext images that resemble the original images when the trigger is stamped on the input. For the decryption network, a subnet replacement method is adopted to replace part of the parameters of the original decryption network, and the subnet can be activated to destroy the decryption performance when images with the trigger are input. To reduce the impact of parameter replacement and increase the attack effect, channel pruning is used to decide which channels can be replaced by the subnet. We conduct experiments on the chest X-ray dataset, and the results show that the proposed paradigm can successfully insert the backdoor into the model. However, there are still some shortcomings in the existing paradigms. For instance, challenges remain in disrupting encryption and decryption effectiveness with minimal interference to the network parameters, as well as in customizing the performance of decryption failure. In the future, we will utilize slight perturbation techniques on the parameters to insert the backdoor into the network, and will also utilize different triggers to launch the backdoor attack for better attacking performance. Furthermore, how to investigate and develop robust defenses against such backdoor attacks also deserves further study.

REFERENCES

[1] Mahdi Ahmadi, Alireza Norouzi, Nader Karimi, Shadrokh Samavi, and Ali Emami. ReDMark: Framework for residual diffusion watermarking based on deep networks. Expert Systems with Applications, 146:113157, 2020.
[2] Zhenjie Bao and Ru Xue. Research on the avalanche effect of image encryption based on the Cycle-GAN. Applied Optics, 60(18):5320–5334, 2021.
[3] Zhenjie Bao, Ru Xue, and Yadong Jin. Image scrambling adversarial autoencoder based on the asymmetric encryption. Multimedia Tools and Applications, 80(18):28265–28301, 2021.
[4] Mauro Barni, Franco Bartolini, and Alessandro Piva. Improved wavelet-based watermarking through pixel-wise masking. IEEE Transactions on Image Processing, 10(5):783–791, 2001.
[5] Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song. Targeted backdoor attacks on deep learning systems using data poisoning. 2017.
[6] Edward Chou, Florian Tramer, and Giancarlo Pellegrino. SentiNet: Detecting localized universal attacks against deep learning systems. In 2020 IEEE Security and Privacy Workshops (SPW), pages 48–54. IEEE, 2020.
[7] Shaohua Ding, Yulong Tian, Fengyuan Xu, Qun Li, and Sheng Zhong. Trojan attack on deep generative models in autonomous driving. In International Conference on Security and Privacy in Communication Systems, pages 299–318. Springer, 2019.
[8] Yi Ding, Guozheng Wu, Dajiang Chen, Ning Zhang, Linpeng Gong, Mingsheng Cao, and Zhiguang Qin. DeepEDN: A deep-learning-based image encryption and decryption network for internet of medical things. IEEE Internet of Things Journal, 8(3):1504–1518, 2020.
[9] Uğur Erkan, Abdurrahim Toktas, Serdar Enginoğlu, Enver Akbacak, and Dang N. H. Thanh. An image encryption scheme based on chaotic logarithmic map and key generation using deep CNN. Multimedia Tools and Applications, 81(5):7365–7391, 2022.
[10] Yu Feng, Benteng Ma, Jing Zhang, Shanshan Zhao, Yong Xia, and Dacheng Tao. FIBA: Frequency-injection based backdoor attack in medical image analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20876–20885, 2022.
[11] Jessica Fridrich, Miroslav Goljan, and Rui Du. Detecting LSB steganography in color and gray-scale images. IEEE Multimedia, 8(4):22–28, 2001.
[12] Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. BadNets: Identifying vulnerabilities in the machine learning model supply chain. arXiv preprint arXiv:1708.06733, 2017.
[13] Chiou-Ting Hsu and Ja-Ling Wu. Hidden digital watermarks in images. IEEE Transactions on Image Processing, 8(1):58–68, 1999.
[14] Stefan Jaeger, Sema Candemir, Sameer Antani, Yì-Xiáng J. Wáng, Pu-Xuan Lu, and George Thoma. Two public chest X-ray datasets for computer-aided screening of pulmonary diseases. Quantitative Imaging in Medicine and Surgery, 4(6):475, 2014.
[15] Xiulai Li, Yirui Jiang, Mingrui Chen, and Fang Li. Research on iris image encryption based on deep learning. EURASIP Journal on Image and Video Processing, 2018(1):1–10, 2018.
[16] Yuezun Li, Yiming Li, Baoyuan Wu, Longkang Li, Ran He, and Siwei Lyu. Invisible backdoor attack with sample-specific triggers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 16463–16472, 2021.
[17] Yingqi Liu, Shiqing Ma, Yousra Aafer, Wen-Chuan Lee, Juan Zhai, Weihang Wang, and Xiangyu Zhang. Trojaning attack on neural networks. 2017.
[18] Yuntao Liu, Ankit Mondal, Abhishek Chakraborty, Michael Zuzak, Nina Jacobsen, Daniel Xing, and Ankur Srivastava. A survey on neural trojans. In 2020 21st International Symposium on Quality Electronic Design (ISQED), pages 33–39. IEEE, 2020.
[19] Xiyang Luo, Ruohan Zhan, Huiwen Chang, Feng Yang, and Peyman Milanfar. Distortion agnostic deep watermarking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13548–13557, 2020.
[20] Shima Ramesh Maniyath and V. Thanikaiselvan. An efficient image encryption using deep neural network and chaotic map. Microprocessors and Microsystems, 77:103134, 2020.