
Expert Systems With Applications 241 (2024) 122562

Contents lists available at ScienceDirect

Expert Systems With Applications


journal homepage: www.elsevier.com/locate/eswa

CSENMT: A deep image compressed sensing encryption network via multi-color space and texture feature

Xiuli Chai a,b,1, Shiping Song a, Zhihua Gan c,2,*, Guoqiang Long a, Ye Tian a, Xin He c,*
a School of Artificial Intelligence, Henan Engineering Research Center for Industrial Internet of Things, Henan University, Zhengzhou 450046, China
b Henan Key Laboratory of Cyberspace Situation Awareness, Zhengzhou 450001, China
c School of Software, Intelligent Data Processing Engineering Research Center of Henan Province, Institute of Intelligent Network System, Henan University, Kaifeng 475001, China

A R T I C L E  I N F O

Keywords: Compressed sensing; Multi-color space; Texture information

A B S T R A C T

In recent years, the rapid development of wireless communication technology has provided great convenience for human information transmission. Color images have become an important medium for data dissemination and sharing due to their large amount of information and vivid, intuitive content. However, color images are easily attacked by third-party illegal users during transmission over public channels and storage in the cloud, and their large size occupies too many channel resources. Therefore, a deep image compressed sensing encryption network using multi-color space and texture features (CSENMT) is proposed. Specifically, a multi-color space sampling network based on a sparse matrix is presented to obtain the measurement of the plain image, and the measurement is shuffled by an adaptive permutation based on a chaotic system and the plain image (APCP). On the decryption end, the decryption party inversely scrambles the cipher image to obtain the decrypted measurement and sends it to a deep reconstruction network based on multi-color space and texture information (DRMST) to gain the final decrypted reconstructed image. Herein, DRMST achieves high visual performance by exploiting the differences and inter-correlations of several color spaces. Besides, the proposed texture extraction module focuses on extracting texture features to improve the texture details of the reconstructed image. In addition, APCP achieves a higher scrambling degree than channel-by-channel permutation. Experimental results demonstrate the advantages of our algorithm for color images in visual performance, efficiency and security.

1. Introduction

With the development of big data, the Internet and the cloud, many images are transmitted and stored over the Internet. How to ensure the fast and secure transmission of images is important. The compressed sensing (CS) theory proposed by Candes et al. (Candes & Tao, 2006) can sample and compress a signal at the same time through linear random measurement, even when the sampling ratio is far below the Shannon-Nyquist sampling frequency, which allows CS to reconstruct the signal from few data. Therefore, CS is widely used in many data processing fields, such as the single-pixel camera (Wang et al., 2022), compressed magnetic resonance imaging (MRI) (Hu et al., 2022), snapshot hyperspectral imaging, and image compression and encryption (Chai et al., 2022a).

Traditional compressed sensing methods reconstruct the image by iterative optimization. However, this process often has high computational complexity, and the reconstruction quality is not ideal when the sampling ratio is low (Zhang et al., 2014). Compared with traditional CS methods, reconstruction methods based on deep learning offer excellent reconstruction quality and running speed. ReconNet (Kulkarni et al., 2016) combined block-based compressed sensing (BCS) with a convolutional neural network (CNN) for the first time to provide a data-driven compressed sensing reconstruction scheme, which effectively improves the reconstruction speed and quality. DR2-Net (Yao et al., 2019) further achieved better reconstruction results at high sampling ratios by introducing a residual network into ReconNet. Sun et al. (2020a) used sub-pixel convolution to replace traditional convolution, combined with a GAN (Chai et al., 2022b), to reconstruct compressed sensing images, but the reconstruction quality still needs to be improved.

* Corresponding authors.
E-mail addresses: [email protected] (X. Chai), [email protected] (Z. Gan), [email protected] (X. He).
1 ORCID: 0000-0002-1609-0624.
2 ORCID: 0000-0002-1138-1887.

https://doi.org/10.1016/j.eswa.2023.122562
Received 15 June 2022; Received in revised form 7 November 2023; Accepted 10 November 2023
Available online 18 November 2023
0957-4174/© 2023 Elsevier Ltd. All rights reserved.

The above BCS technology does not consider the correlation between blocks during reconstruction, and there is some block artifact noise in the reconstructed image. In (Shi et al., 2017; Shi et al., 2019a), CSNet and CSNet+ proposed by Shi et al. adopted a full-image deep reconstruction network for BCS to effectively remove the block artifact noise generated in the reconstruction process. Based on CSNet, SCSNet (Shi et al., 2019b) further optimized the reconstructed image through a greedy algorithm and achieved better reconstruction quality under a single model. However, almost all deep learning compressed sensing reconstruction methods only consider the improvement of PSNR values, ignoring the texture feature information and internal details of the reconstructed image. In practical applications, these texture details play a crucial role in the visual quality and structure information of the image.

Most compressed sensing reconstruction algorithms are based on gray images (Tian et al., 2021), while color images are widely used in daily life. Therefore, compressed sensing reconstruction for color images is a problem worthy of research. There are two ways to go. One is to separate the RGB channels of the color image and then perform sampling and reconstruction channel by channel. At present, most encryption algorithms and deep learning compressed sensing methods use this method to process color images (Su & Lian, 2020), but channel-by-channel sampling and reconstruction ignores the differences between channels and cannot obtain ideal reconstruction results. The other scheme is to separate the channels of the color image and construct a measurement matrix to realize cross-channel sampling. Nagesh et al. (Nagesh & Li, 2009) proposed a sampling and reconstruction method based on joint sparsity models to reconstruct color images. Compared with channel-by-channel methods, the cross-channel sampling process fully considers the differences between RGB channels, but how to construct an optimal sparse measurement matrix is a challenge. Besides, these two methods have their own advantages, so it is very meaningful to explore a system suitable for both imaging methods at the same time.

Image encryption can effectively protect the security of Internet image data. In order to reduce the transmission bandwidth while protecting image privacy, some researchers fused CS with other encryption technologies, such as chaotic systems, optical transformation, and elementary cellular automata (ECA), and proposed a series of compressed sensing encryption schemes. Zhang et al. (2016) explored relevant information security techniques of CS from both theoretical and applied perspectives. In (Ye et al., 2020), Ye et al. proposed an image encryption scheme based on CS and the discrete wavelet transform (DWT) to realize secure encrypted transmission with low complexity. However, encryption algorithms based on traditional compressed sensing reconstruction still suffer from low compression performance and poor reconstruction quality at low measurement rates. Wang et al. (2021a) proposed a dual image encryption algorithm combining two-dimensional compressed sensing (2-D CS) and the wavelet transform. It obtained the measurement value by sampling the plain image in two directions, and then encrypted the measurement value in combination with a Latin square to obtain the cipher image. Zhang et al. (2021) presented a novel encryption and compression scheme based on 2D CS to address the challenge of high security and low computational complexity. The scheme employs global random permutation and a negative-positive transform for enhanced security, while leveraging a 2D CS method to reduce computational complexity. In a similar vein, Cambareri et al. (2015) proposed a general private-key encryption scheme that mitigated the impact of the encoding process on security. Furthermore, Zhang et al. (2020a) provided a chaotic CS scheme for secure processing of industrial big data in the fog computing paradigm, incorporating sinusoidal logistic modulation maps for privacy assurance and authentication, and BCS for secure image data collection in sensor nodes. Although the above schemes (Wang et al., 2021a; Ye et al., 2020; Zhang et al., 2021; Zhang et al., 2020a) can achieve good encryption performance, there are still weaknesses in reconstruction quality and efficiency at low measurement rates.

In addition, most image encryption algorithms are designed for gray images, and color images are processed by channel-by-channel scrambling. For example, Wang et al. (2021b) used the DWT to sparsify the image, and then encrypted the sparse image into the cipher image by the Game of Life. Zhou et al. (2020a) utilized the plain phase information obtained by double random phase coding to generate authentication information, and then compressed the plain image through CS; finally, they embedded the authentication information into the measurement value and encrypted the obtained image to find the cipher image. The above methods are based on gray images. Some color image encryption algorithms have been presented, but many of them still process the color image channel by channel. For instance, Chai et al. (2020) proposed a method that sparsely measured the R, G and B components of the color image, after which the measurements were scrambled by the Arnold transform to obtain the cipher image. Njitacke et al. (2021) designed an encryption scheme based on a coupled-neuron sequence and compressed sensing, which first decomposed the plain image into R, G, B components and then used the coupled-neuron sequence to scramble and measure the sparse components. Su et al. (2022) proposed an image watermarking method which scrambles and diffuses each channel of the color watermark, and then embeds the watermark into the host image. These schemes still scramble the color image in the way of a gray image and ignore the pixel distribution between the color image channels.

In summary, current deep learning compressed sensing algorithms have the following issues. (1) Most algorithms only address grayscale image measurement and reconstruction, ignoring the needs of color images, even though color images are widely used in many fields, carry much information and have great application value. (2) When existing algorithms are applied to color images, they usually sample and reconstruct channel by channel, ignoring the differences between channels, which leads to poor visual results and a lack of clear details in the reconstructed images. (3) Existing deep learning compressed sensing reconstruction algorithms perform unsatisfactorily on texture, ignoring the texture feature information and internal details of the reconstructed images. (4) There are still shortcomings in the reconstruction quality and efficiency of existing algorithms at low sampling rates.

In order to solve the above problems, a deep image compressed sensing encryption network using multi-color space and texture features (named CSENMT) is proposed in this paper. The contributions are described as follows:

1) A deep image compressed sensing encryption network using multi-color space and texture features (CSENMT) is presented. CSENMT consists of two important parts: a deep reconstruction process based on multi-color space and texture information (DRMST) and an adaptive permutation based on a chaotic system and the plain image (APCP). Through these two parts, CSENMT realizes high-quality reconstruction with rich texture details and high-security encryption.

2) A deep reconstruction based on multi-color space and texture information (DRMST) is proposed. Different from current compressed sensing algorithms that reconstruct color images channel by channel, the proposed method jointly employs RGB and YUV space information for sampling and reconstruction, which can exploit the inter-correlations and differences between each color space and each channel to improve the reconstruction quality. The Y-channel texture extraction module aims to extract texture detail information from YUV space without color redundancy. By using the texture for Y-channel image reconstruction and transferring the texture to the color reconstruction process, the texture information of both gray and color reconstructed images is recovered. Moreover, as we do not have to design a separate texture extraction module for the RGB components, the process is also conducive to reducing the difficulty of network learning.


Fig. 1. The framework of the proposed CSENMT.

3) In order to improve the texture details of the reconstructed image, we design a composite loss function, including reconstruction loss, adversarial loss and transmission perceptual loss. Different from most compressed sensing algorithms that only pay attention to improving the overall quality of the image, the addition of adversarial loss and transmission perceptual loss can enhance the texture details of the reconstructed image. Meanwhile, different from previous algorithms that directly use the pre-trained vgg19 as the perceptual loss, this paper proposes a transmission perceptual loss, which adopts the Y-channel extraction module to calculate the perceptual features. The transmission perceptual loss can effectively avoid the artifact noise introduced when vgg19 is used as the perceptual loss, and makes the reconstructed texture details more ideal.

4) An adaptive permutation based on a chaotic system and the plain image (APCP) is proposed. Different from previous encryption methods, which use channel-by-channel scrambling for color images, we extend the two-dimensional row-column scrambling of gray images to a three-dimensional scrambling of color images, and add an inter-channel scrambling operation. In this way, we achieve pixel permutation between the three RGB channels of the color image. APCP can further reduce the correlation between the cipher image and the plain image. Meanwhile, because APCP is based on gray images, it can be applied to gray and color image encryption at the same time, and its scrambling degree is higher than that of the channel-by-channel scrambling method. Moreover, the SHA256 hash value of the plain image is used in the scrambling process to generate the chaotic sequence, so our image cryptosystem is related to the plain image and can withstand known-plaintext and chosen-plaintext attacks effectively.

In the rest of this paper, we introduce the fundamental knowledge in Section 2. The details of our algorithm are discussed in Section 3. In Section 4, performance analyses are presented. Finally, Section 5 concludes this paper.

2. Related works

2.1. Compressed sensing

Compressed sensing is a technology for acquiring and processing digital signals such as image and video, proposed by Candes et al. (Candes & Tao, 2006). Its purpose is to recover the source signal from the random CS measurements after compressive sampling via

y = Φx (1)

where y ∈ R^(m×1) is the measurement value, Φ ∈ R^(m×n) is the measurement matrix, and x ∈ R^(n×1) is the original signal. Since m ≪ n, which means m is less than or even far less than n, this equation has multiple solutions.

In order to reconstruct the original signal from the measurement value, early researchers recovered x by solving an optimization problem (Candès, 2006). Based on the assumption that the signal has a small p-norm (0 ≤ p ≤ 1), researchers use iterative optimization to minimize the p-norm and solve the reconstruction problem. This process can be expressed as:

min ‖s‖p  s.t.  Φψs = y (2)

where ‖s‖p is the p-norm of the vector s.

Traditional CS reconstruction methods utilize greedy iterative algorithms or sparse prior knowledge to reconstruct the plain image. At present, state-of-the-art traditional methods, such as TVAL3 and GSR, explore the prior knowledge of the image to build a complex model. Other methods (such as MH (Mun & Fowler, 2009)) add additional optimization steps to the iterative threshold algorithm. The use of iterators increases the computation and running time.
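As a minimal illustration of the measurement model in Eq. (1), the following NumPy sketch simulates block-based CS sampling at a given sampling ratio. It is an illustrative example rather than the authors' code: the random Gaussian Φ is a common textbook choice, not the sparse measurement matrix proposed later in this paper, and the block size and ratio are assumptions.

import numpy as np

def sample_blocks(image, block=32, ratio=0.1, seed=0):
    # Block-based compressed sensing: y_i = Phi @ x_i for every B x B block.
    rng = np.random.default_rng(seed)
    n = block * block                               # pixels per vectorized block
    m = max(1, int(round(ratio * n)))               # measurements kept per block
    phi = rng.standard_normal((m, n)) / np.sqrt(m)  # illustrative Gaussian Phi
    h, w = image.shape
    measurements = []
    for r in range(0, h, block):
        for c in range(0, w, block):
            x = image[r:r + block, c:c + block].reshape(-1)  # vectorize block
            measurements.append(phi @ x)                     # y = Phi x, Eq. (1)
    return phi, np.stack(measurements)

# Example: a 96 x 96 image sampled at ratio 0.1 gives 9 blocks of 102 values.
phi, ys = sample_blocks(np.random.rand(96, 96))
print(phi.shape, ys.shape)   # (102, 1024) (9, 102)

Since m < n, each block's linear system is underdetermined, which is exactly why the reconstruction problem of Eq. (2) needs a sparsity prior or, as in the following subsection, a learned network.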


Fig. 2. The network structure of the proposed CSENMT.

2.2. Deep learning based compressed sensing

After deep learning was introduced into image and video CS reconstruction, CS technology developed further. Kulkarni et al. (2016) proposed ReconNet, a CNN-based reconstruction network, to realize non-iterative image reconstruction for the first time. ReconNet offered unique advantages in reconstruction quality and running speed compared with traditional methods at low sampling ratios. Lohit et al. (2018) introduced generative adversarial networks (GAN) into ReconNet, and the quality of the reconstructed image is improved by the adversarial game between the generator and the discriminator. Yao et al. (2019) built DR2-Net by introducing a residual network into ReconNet, using residual layers instead of convolution layers for reconstruction. In (Sun et al., 2020b), Sun et al. employed sub-pixel convolution instead of traditional convolution for image reconstruction, and upscaled the reconstructed image to the plain image size. Although the reconstruction speed of these deep learning methods is fast, the block-by-block reconstruction strategy gives them blocking artifacts. In (Shi et al., 2017; Shi et al., 2019a), CSNet and CSNet+ were proposed by Shi et al., in which a full-image deep reconstruction network was introduced; by using this deep reconstruction network, the generated block artifact noise was removed. Based on CSNet, SCSNet (Shi et al., 2019b) further optimizes the reconstructed image through a greedy algorithm and achieves better reconstruction quality under a single model. Although the above algorithms acquire outstanding reconstructed image quality, there are some problems. For example, they reconstruct color images channel by channel and ignore the differences between RGB channels. Besides, texture details are not considered in their models, although they matter in practical applications. Therefore, the detail quality of the reconstructed image needs to be improved.

2.3. Perceptual loss

In recent years, perceptual loss (Johnson et al., 2016) has been widely used in super-resolution tasks (Lu et al., 2021) and image inpainting (Ren et al., 2022). Since the perceptual loss calculates the loss in the feature dimension, it can give the image better visual performance. Generally, the perceptual loss is the Euclidean distance between the feature maps of the reconstructed image and the target image. Perceptual loss reflects similarity at the feature level between the target image and the recovered image, which makes the reconstructed image retain higher-level structural information. In contrast, a pixel loss, such as the MSE loss, only focuses on similarity at the pixel level and retains lower-level pixel information.
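The following PyTorch sketch shows the generic form of such a feature-space loss, in the spirit of Johnson et al. (2016), using a truncated pre-trained VGG19 as the feature extractor. This is only an illustration of the general technique under assumed settings; the loss actually proposed in this paper (Section 3.3.3) swaps VGG19 for the learned Y-channel texture extractor, and the layer cut-off below is arbitrary.

import torch
import torch.nn.functional as F
from torchvision import models

# Frozen feature extractor: the first convolutional stages of VGG19.
vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features[:16].eval()
for p in vgg.parameters():
    p.requires_grad_(False)          # only the generator is trained

def perceptual_loss(pred, target):
    # Euclidean distance between feature maps instead of raw pixels.
    return F.mse_loss(vgg(pred), vgg(target))

pred = torch.rand(1, 3, 96, 96, requires_grad=True)
target = torch.rand(1, 3, 96, 96)
perceptual_loss(pred, target).backward()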


3. Algorithm description

3.1. Encryption processes

To solve the problems of poor visual performance and unclear details of traditional and deep learning CS methods, a deep image compressed sensing encryption network using multi-color space and texture features (CSENMT) is proposed in this paper. Fig. 1 shows the framework of CSENMT. The network takes the plain image as the input, and encrypts the image through the sampling network and an adaptive scrambling operation. After the decryption part inversely scrambles and reconstructs the cipher image, the final reconstructed image is generated. Finally, the discriminator determines whether its input is a reconstructed image or a plain image to optimize the reconstruction. Guided by the network, CSENMT can achieve good reconstruction quality and recover ideal texture information.

The network structure of the proposed CSENMT is shown in Fig. 2, which mainly includes five parts: a multi-color space sampling network based on a sparse matrix, an adaptive permutation based on a chaotic system and the plain image, an inverse quantization process, an initial reconstruction process, and a deep reconstruction process based on multi-color space and texture information. Taking a color image as an example, we first convert the plain image from RGB space to YUV space to obtain the Y-channel image, and then sample the image by a sparse matrix to obtain the measurement. The measurement can be obtained from the color image or the Y-channel image. Then we scramble the measurement through APCP to get the cipher image. After receiving the cipher image, the decryption party inversely scrambles it with the key to generate the decrypted measurement, and sends it to the initial reconstruction network. The initial reconstruction results, combined with the texture information extracted from the Y channel, are sent to the color deep reconstruction network to acquire the final decrypted image. The scheme is also applicable to gray images: when the input is a gray image, we can obtain the cipher and the decrypted image through the Y-channel process alone. The detailed encryption and decryption processes are described below.

3.1.1. Multi-color space sampling network based on sparse matrix

The sampling method of BCS can effectively solve the problems of long running time and large measurement matrices when the image is large. Here, we divide the image into blocks of size B × B × l, where l = 3 if the image is a color image and l = 1 if it is a Y-channel image. Fig. 3 shows the sampling network. The plain image is the input of the sampling network; after sampling, we get the measurement of the plain image. Taking a color image as an example, we sample the RGB channels and the Y channel of the plain image respectively, and recombine the RGB measurements into a three-dimensional measurement matrix. If the input image is a gray image, we process it using the Y channel alone to obtain the gray measurement matrix.

Fig. 3. Sampling network based on sparse sampling.

Step 1: Calculate the plain hash value. In order to enhance the correlation between the algorithm and the plain image, we calculate the 256-bit hash key of the plain image by the SHA256 hash function, and the key is transformed into a sequence K of 32 decimal values.

Step 2: Chaotic sequence generation. Using the decimal sequence K from Step 1, three initial values h1, h2 and h3 are generated with the external key parameters t1, t2, t3. This process is represented as

h1 = (1/256) × (k1 ⊕ k2 ⊕ ⋯ ⊕ k16) + t1
h2 = (1/256) × (k17 ⊕ k18 ⊕ ⋯ ⊕ k32) × t2 (3)
h3 = (sum(k1, k2, ..., k32) / max(k1, k2, ..., k32)) × t3

where a ⊕ b represents the exclusive-or of a and b, sum() is the sum of all numbers, max() is the maximum value, and t1, t2, t3 ∈ (0, ∞) are external keys.

Step 3: Generate a random sequence u. If a color image is to be processed, H = B × B × 3; otherwise, H = B × B:

u = rand(1, H) (4)

Here, the rand() function generates random numbers within 0–1.

Step 4: The first n values of the sequence u are used to construct the m × n measurement matrix Φ. The generated measurement matrix has the advantages of simple structure, low computation time and good randomness, which is conducive to the compressed measurement of the algorithm. In Eq. (5), t is a fixed gain factor:

Φ(1, :) = u
Φ(i, 1) = t × Φ(i − 1, n), i = 2, …, m (5)
Φ(i, 2:n) = Φ(i − 1, 1:n − 1)

When we take xi as a small block of the input image, the sampling process can be expressed as:

yi = Φxi (6)

where Φ is a measurement matrix of size nB × B²l used for non-overlapping measurement of the whole image, the sampling ratio of the whole image is M/N, and the number of measurement values after measurement is nM = (M/N)B²l.


When the input image is a color image, we could directly measure it according to the above formula, but the sampling parameters would be very large. Another way is channel-by-channel measurement, in which the differences between the RGB channels are ignored. Therefore, we use a sparse measurement matrix to measure the RGB channels with different measurement matrices. In this way, the differences of the three channels can be fully used. This process can be expressed as:

Φ = [ ΦR  0   0
      0   ΦG  0   (7)
      0   0   ΦB ]

where ΦR, ΦG, ΦB are the measurement matrices corresponding to the three channels R, G and B respectively, and the numbers of measurements of the three channels are the same: nR = nG = nB = nM/3. Obviously, the measurement matrix Φ of the color image is sparse.

As there are obvious differences between the channels of gray and color images, we use a separate channel to get the measurement yi = ΦY xi, rather than reusing one of the channels of the above network, when a gray image is sent into the network. The color image contains both color information and structure information, and the color information interferes with the extraction of pure texture features.

In this paper, a fully connected layer network without bias is used as the adaptive sampling network. The sampling process of a color image can be represented as:

(xiR, xiG, xiB) = Divide(xi) (8)
yiR = Ws,R × xiR (9)
yiG = Ws,G × xiG (10)
yiB = Ws,B × xiB (11)

where xi represents the input of the i-th color image block, Divide(xi) is the process of dividing the color image block xi into the three color channel images xiR, xiG and xiB, Ws,R, Ws,G, Ws,B are the weights of the sampling network with the size of nB × B², and yiR, yiG and yiB are the measurement values of the three channels after sampling.

For gray-scale images, the sampling process is:

yiY = Ws,Y × xiY (12)

where xiY represents the i-th gray image block, Ws,Y represents the weight, and yiY is the measurement value of the gray image block. The obtained measurement values can be stored as a measurement matrix, where the number of rows is the number of image blocks NB and each row holds the measurement values obtained from one image block.

As we adopt a fully connected layer network without bias as the sampling network, which can be stored separately as the measurement matrix under various running conditions, we can fully separate the sampling and reconstruction processes, which is conducive to the use of CS in many practical fields.

3.1.2. Adaptive permutation based on chaotic system and plain image (APCP)

To make CSENMT process both color and gray images, it is necessary to design a safe scrambling method that handles both by analyzing the similarity between gray and color images. As described in Section 3.1.1, the size of both measurements can be regarded as NB × nB × l; the only difference is l, which is 1 when the image is gray and 3 for a color image. Based on this, we further expand the 2D row-column scrambling operation for gray images to a 3D scrambling for color images, which means that we additionally permute the pixels across the RGB channels after the row-column scrambling of each color plane. This scheme realizes an adaptive scrambling based on the chaotic system and plain information, which can be applied to gray and color images at the same time. The detailed permutation operation is described below.

Step 1: The parameters x0 and u0 are calculated by Eq. (13) using h1, h2 and h3 generated in Step 2 of subsection 3.1.1:

x0 = mod(h1 + h2 + t4, 1)
u0 = mod((h3 + 3t4) × 10^12, 7) + 3 (13)

where mod(a, b) is the remainder after a is divided by b, and t4 ∈ (0, 1) is an external key.

The Sine-Sine chaotic system is expressed as

xn+1 = u0 × sin(π × xn) × 2^k − floor(u0 × sin(π × xn) × 2^k) (14)

The chaotic sequence Z = [z1, z2, …, z3mn] of length 1 × 3mn is generated by iterating the Sine-Sine chaotic system (N0 + m × n × 3) times with the parameter u0 and the initial keys x0 and k, and the first N0 (N0 > 1000) numbers are abandoned.

Step 2: Quantization. The element values of the measurement Y are converted to a matrix P with pixel values of 0–255:

P = floor(255 × Yi / (max − min)) (15)

Step 3: Select the chaotic sequence for scrambling. Firstly, judge the dimension of the measurement value. If the dimension is 3, directly convert the chaotic sequence Z to generate Z' of size m × n × 3. If the dimension is 2, the first mn values of the chaotic sequence Z are chosen and converted into a chaotic matrix Z'L of size m × n.

Step 4: Adaptive scrambling operation. As mentioned above, we extend the 2D row-column permutation to a 3D permutation to improve the scrambling degree of color images and to scramble gray and color images in the same way. The permutation operations for a color image are as follows:

(1) Firstly, we decompose the chaotic matrix Z' into three matrices Z'1, Z'2 and Z'3 of size m × n, and convert the 3D measurement value matrix P of the color image into three 2D matrices PR, PG and PB.

(2) Taking PR as an example, we first sort the rows of Z'1 in ascending order to obtain the row index matrix Di1 = [di1, di2, …, din]T, i = 1, …, m. Then, shuffle the matrix PR by a row scrambling operation according to the index matrix Di1 to obtain PR'.

(3) Next, we sort the columns of Z'1 to obtain the column index matrix D1i = [d1i, d2i, …, dmi], i = 1, …, n, and scramble the columns of PR' by using the column index matrix D1i to generate the matrix PR''.

(4) Similarly, we scramble PG and PB using Z'2 and Z'3 respectively to obtain PG'' and PB''.

(5) According to the combination of the RGB channels, the three matrices PR'', PG'' and PB'' of size m × n are recombined to get the matrix P', whose size is m × n × 3.

(6) Rotate the matrices Z' and P' by 90° to obtain the matrices Z'' and P'', whose size is 3 × n × m. Then sort the first layer of Z'', sized 3 × n, by a column ascending operation to obtain the channel index matrix D = [di1, di2, …, din], i = 1, 2, 3. Next, scramble the matrix P'' using the index matrix D to realize the inter-channel scrambling operation. Then rotate the scrambled image 90 degrees counterclockwise to get the final cipher image C, whose size is the same as that of the matrix P.

Fig. 4. The permutation process of gray image.

The scrambling process of a gray image is shown in Fig. 4. Firstly, the row index matrix Di1 is used to scramble P to get the row-scrambled matrix P', and then the column index matrix D1i is used to scramble the columns of P' to find the final cipher image C.
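The sketch below illustrates the Sine-Sine map of Eq. (14) and the spirit of the three-axis permutation. It deliberately simplifies APCP, using a single sort-derived index vector per axis instead of the per-row/per-column index matrices and 90° rotations described above, so it should be read as a structural illustration with assumed parameter values, not as the exact cipher.

import numpy as np

def sine_sine(x0, u0, k, length, burn=1000):
    # Eq. (14): x_{n+1} = frac(u0 * sin(pi * x_n) * 2^k); discard `burn` values.
    seq = np.empty(length)
    x = x0
    for i in range(burn + length):
        v = u0 * np.sin(np.pi * x) * 2.0 ** k
        x = v - np.floor(v)
        if i >= burn:
            seq[i - burn] = x
    return seq

def apcp_like(P, z):
    # Row, column and inter-channel permutation of an m x n x 3 matrix P,
    # driven by a chaotic sequence z of length 3*m*n (simplified sketch).
    m, n, _ = P.shape
    Z = z.reshape(m, n, 3)
    C = np.empty_like(P)
    for ch in range(3):                      # per-plane row/column shuffle
        rows = np.argsort(Z[:, 0, ch])
        cols = np.argsort(Z[0, :, ch])
        C[..., ch] = P[rows][:, cols, ch]
    return C[..., np.argsort(Z[0, 0, :])]    # inter-channel shuffle

z = sine_sine(x0=0.37, u0=5.2, k=16, length=3 * 8 * 8)
P = np.arange(8 * 8 * 3, dtype=np.uint8).reshape(8, 8, 3)
C = apcp_like(P, z)

Because every index vector comes from sorting a chaotic sequence seeded by the SHA256 hash of the plain image, the permutation is invertible with the key yet changes completely for a different plain image.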

The row-column permutation of the RGB components of the color image is the same as that of the gray image. In Fig. 5, we show an example of the inter-channel scrambling operation between the RGB channels. Firstly, we rotate the plain image 90 degrees around the third dimension. Then, we scramble the rotated image by the channel index matrix D so that the pixels are fully shuffled between the three channels. Last, the scrambled image is rotated back by 90 degrees, and the final cipher image is generated with the same size as the plain image.

Fig. 5. The inter channel permutation of color image.

Through the above operations, we realize the inter-channel permutation of the RGB channels, which can further reduce the correlation between the color cipher image and the plain image compared with channel-by-channel permutation. The Y-channel image may be regarded as a gray image, and its scrambling process is the same as in Fig. 4, i.e., only row-column permutation.

The measurement vector obtained from the sampling network is also equivalent to an image after simple diffusion, whose pixel distribution is completely different from that of the plain image. Therefore, the adaptive permutation operation can achieve the encryption effect. Through APCP, CSENMT can encrypt both gray and color images with the same scrambling method.

The pseudocode of the specific encryption process is shown in Algorithm 1.

Algorithm 1 Encryption process
Input: t1, t2, t3, t4, plain image P1, sampling network Ws
Output: Cipher image C
1: K = hash(P1)
2: (h1, h2, h3) = IVG(K, t1, t2, t3) // IVG generates the initial values according to Eq. (3)
3: (m, n, r) = size(P1) // size gets the dimensions of the image
4: u = rand(1, m × n × r) // rand generates random numbers within 0–1
5: Φ(1, :) = u(1:n); Φ(i, 1) = t × Φ(i − 1, n), Φ(i, 2:n) = Φ(i − 1, 1:n − 1), i = 2, …, m
6: Ws = Ws(Φ)
7: (xR, xG, xB) = Divide(P1) // Divide splits the color image into 3 color channels
8: yR = W's,R × xR
9: yG = W's,G × xG
10: yB = W's,B × xB
11: yY = W's,Y × xY
12: C = APCP(h1, h2, h3, t4, y) // APCP is the proposed permutation method
13: Return C

3.2. Initial reconstruction process

3.2.1. Inverse scrambling and inverse quantization

Step 1: Generate the decryption key. According to Section 3.1, the initial state values x0 and u0 are generated from the initial keys t1, t2, t3, t4 and the sequence K. Then, the chaotic sequence Z of size 1 × 3mn is generated.

Step 2: Inverse scrambling. Just as in the scrambling operation, we first judge the dimension of the cipher image, and obtain the index vector D of size m × n × 3 or m × n when the dimension is 3 or 2. As demonstrated in Step 3 of subsection 3.1.2, the decrypted matrix P' can be obtained by inversely scrambling the cipher image C with the index vector.

Step 3: Inverse quantization. With the decryption keys min and max, the matrix P' is inversely quantized by Eq. (16). The matrix Y' is the decrypted measurement value:

Yi' = Pi × (max − min) / 255 + min (16)

3.2.2. Initial image reconstruction

BCS solves the compressed sensing reconstruction problem by using projection approximation and threshold denoising in an iterative process, which can be regarded as two stages: an approximation process and a denoising process. To match the approximation process of BCS and improve the reconstruction quality, we use an initial reconstruction module. The module includes two steps: firstly, the original measurement value is restored to an image of the same size as the plain image; then the image is brought closer to the plain image through convolution layers. The initial reconstruction module is shown in Fig. 6.

Fig. 6. Initial reconstruction module.

The initial reconstruction module consists of a fully connected layer and three convolution layers. The fully connected layer processes the input measurement, and its output is a block image. Then, the three convolution layers optimize the block image. Similar to the sampling process, we also perform the initial reconstruction on the three channels of the color image via Eqs. (17)–(19):

x̃iR = Conv(Re(fc(yiR, Wfcinit,R)), Winit,R) (17)
x̃iG = Conv(Re(fc(yiG, Wfcinit,G)), Winit,G) (18)
x̃iB = Conv(Re(fc(yiB, Wfcinit,B)), Winit,B) (19)

where Wfcinit,R, Wfcinit,G, Wfcinit,B denote the weight parameters of the three fully connected layers of the RGB channels, fc is the up-sampling operation using the fully connected layer, Re represents the mapping process converting the initial reconstructed matrix into a B × B × l image, Winit,R, Winit,G, Winit,B represent the weight parameters of the three convolution layers of the RGB channels, and x̃iR, x̃iG, x̃iB are the initial reconstruction blocks of the RGB channels respectively.

For the initial reconstruction of the Y-channel image, it is just like the single-channel reconstruction of the color image:

x̃iY = Conv(Re(fc(yiY, Wfcinit,Y)), Winit,Y) (20)

where x̃iY is the initial Y-channel reconstructed block, Wfcinit,Y is the weight parameter of the fully connected layer, and Winit,Y represents the weight parameter of the convolution layer. Given the initial reconstructed block images, we splice these blocks into the overall initial image. The process can be expressed as:

X̃R, X̃G, X̃B, X̃Y = κ(x̃iR, x̃iG, x̃iB, x̃iY) (21)

where κ denotes the restore process, and X̃R, X̃G, X̃B, X̃Y represent the complete images composed of the initial reconstruction blocks of the R, G, B and Y channels, respectively.
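A per-channel initial reconstruction in the style of Eqs. (17)–(20) can be sketched in PyTorch as below. The layer widths, block size and measurement count are assumptions chosen for illustration; one such module would be instantiated for each of the R, G, B and Y channels.

import torch
import torch.nn as nn

class InitRecon(nn.Module):
    # fc maps each measurement back to a B x B block (the "Re(fc(.))" step),
    # then three convolutions refine the stitched image (the "Conv(.)" step).
    def __init__(self, n_meas, block=32):
        super().__init__()
        self.block = block
        self.fc = nn.Linear(n_meas, block * block, bias=False)
        self.refine = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 3, padding=1),
        )

    def forward(self, y, grid):
        # y: (num_blocks, n_meas); grid: (rows, cols) of the block layout.
        rows, cols = grid
        blocks = self.fc(y).view(rows, cols, self.block, self.block)
        image = blocks.permute(0, 2, 1, 3).reshape(rows * self.block,
                                                   cols * self.block)
        return self.refine(image[None, None])   # add batch/channel dims

net = InitRecon(n_meas=102, block=32)
x0 = net(torch.rand(9, 102), grid=(3, 3))      # -> (1, 1, 96, 96)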


Fig. 7. The structure of Y-channel deep reconstruction network.

3.3. Deep reconstruction based on multi-color space and texture information (DRMST)

To improve the reconstruction quality of color images, we propose a deep reconstruction network based on multi-color space and texture features. This network extracts texture features from the Y-channel reconstruction results and transfers the texture details to the color image to optimize the color reconstruction. Since the final color reconstruction image contains the information of the RGB channels and the texture features of the Y channel, which makes full use of the differences between color channels and the complementarity of the gray space, the quality of the color reconstructed image can be effectively improved. For gray images, we can also obtain good reconstructed images through the Y-channel reconstruction network alone; as it incorporates texture features, the gray-scale reconstructed images also have good visual effects.

3.3.1. Deep reconstruction network of Y-channel based on texture features

The Y-channel deep reconstruction network based on texture information is designed to correspond to one iterative denoising operation of BCS. Fig. 7 shows the structure of this network. There are two paths in the network: an optimal reconstruction denoising path and a texture feature extraction path. The texture features are important for reconstruction, as they help generate clear and visually favorable images. As the number of layers of a convolution network increases, the features learned by the network shift from structural and texture features to semantic features. Therefore, we design a lightweight, learnable texture extractor to extract texture information. On the other hand, the image optimization process is also particularly important: to achieve better reconstruction results, we use the optimal reconstruction denoising path to optimize the overall quality of the image.

In the texture feature extraction path, we extract the texture feature information from the Y-channel reconstruction results through multiple convolution layers, and then upscale the resolution. The texture feature extraction process can be expressed as:

X̂Y,T = fT(X̃Y, WT,Y) (22)

where WT,Y represents the network parameters of the texture feature extraction module, fT is the mapping process of the texture feature extraction, X̃Y denotes the initial reconstruction result, and X̂Y,T is the texture features learned by the texture extraction network.

The other path is the optimal reconstruction denoising path, whose structure is a dense residual denoising network. The image quality can be improved through the combination of the dense network and residual connections. We employ this optimization reconstruction module to refine the initial reconstruction results. We define this process as:

X̂Y,D = fD(X̃Y, WD,Y) (23)

where WD,Y is the network parameters of the optimized reconstruction module, fD denotes the mapping process, X̃Y is the initial reconstruction result, and X̂Y,D is the reconstruction image after the optimized reconstruction network.

The texture features are combined with the optimized reconstruction results to acquire the reconstruction image, which consists of both structure information and texture details. In order to gain an excellent reconstructed image, we run this network iteratively: the gray reconstruction image obtained the first time is the input of the second gray reconstruction network, and the second output is the final gray reconstruction result. The output of the deep reconstruction network is calculated as:

X̂iY = X̂iY,T + X̂iY,D, i = 1, 2 (24)

where X̂iY represents the reconstruction result of the i-th iteration.

3.3.2. Deep reconstruction network of color image using multi-space information

As there are both differences and correlations between the RGB channels of a color image, and the color image contains extra information that is not useful for reconstructing texture, we design a color deep reconstruction network using multi-space information to optimize the color image with the texture from the Y channel.

In subsection 3.1, we measure the color image through a sparse measurement matrix, which fully considers the differences between the three RGB channels. In the initial reconstruction stage, we also reconstruct the three channels one by one to realize differentiated reconstruction between the RGB channels. However, there are also correlations between the RGB channels that are useful for reconstruction. Therefore, in the color deep reconstruction network, we splice the initial reconstructed images of the three channels into a color initial reconstruction image, which is the input of the color deep reconstruction network. Thus, the correlation between color channels can be fully utilized in reconstruction.
̂ Y,T is the
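The two-path structure of Eqs. (22)–(24) can be sketched as follows. The sketch assumes plain convolution stacks: the real denoising path fD is a dense residual network and the real texture path fT upscales its features, so depths and widths here are stand-in assumptions that only illustrate the "texture + denoising, applied twice" composition.

import torch
import torch.nn as nn

def conv_stack(depth, ch=32):
    # Simple 1 -> ch -> ... -> 1 convolution chain used as a stand-in path.
    layers = [nn.Conv2d(1 if i == 0 else ch,
                        1 if i == depth - 1 else ch, 3, padding=1)
              for i in range(depth)]
    return nn.Sequential(*layers)

class YDeepRecon(nn.Module):
    def __init__(self):
        super().__init__()
        self.f_t = conv_stack(depth=3)   # lightweight texture extractor, Eq. (22)
        self.f_d = conv_stack(depth=6)   # stand-in for the dense residual path, Eq. (23)

    def forward(self, x):
        for _ in range(2):               # two iterations, Eq. (24)
            x = self.f_t(x) + self.f_d(x)
        return x

y_hat = YDeepRecon()(torch.rand(1, 1, 96, 96))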


Considering that the color image contains some chromaticity information besides the texture details, we transfer the texture features from the Y-channel reconstruction image to the color image to optimize the color image quality. As the texture features learned by the texture extraction network do not include color information, the color reconstruction results become more ideal, and this structure also reduces the number of network parameters. The color initial image containing texture details is then the input of the color deep reconstruction network, which produces an optimized result. The process can be expressed as:

Z̃ = Concat(X̃R, X̃G, X̃B) + Concat(X̂²Y,T, X̂²Y,T, X̂²Y,T) (25)
Z1 = R1(Z̃) = λ(W1 × Z̃ + b1) (26)
Zi = λ(Wi × Ri−1(Zi−1) + bi), i = 2, 3, 4 (27)
ẐLast = RLast(Z4, WLast, bLast) (28)

where ẐLast represents the final color reconstruction result, R1 is the first convolution of the optimization module, R2, R3 and R4 represent the three dense blocks, RLast is the last three convolutions of the deep reconstruction, W1, Wi and WLast represent the weight parameters of each part respectively, bi is the bias term of the dense block, bLast is the bias term of RLast, λ represents the ReLU activation layer, and Concat is the image aggregation process.

The pseudocode of the specific reconstruction process is shown in Algorithm 2.

Algorithm 2 Reconstruction process
Input: Cipher image C, neural network weights W, parameters t1, t2, t3, t4, K
Output: Color image reconstruction result ẐLast
1: (h1, h2, h3) = IVG(K, t1, t2, t3)
2: C1 = RAPCP(h1, h2, h3, t4, C) // RAPCP means the reverse APCP operation
3: for i = 0 to N do
4:   x̃ij = Conv(Re(fc(C1ij, Wfcinit,j)), Winit,j), j = R, G, B, Y
5: end for
6: X̃R, X̃G, X̃B, X̃Y = κ(x̃iR, x̃iG, x̃iB, x̃iY)
7: for i = 1 to 2 do
8:   X̂iY,T = fT(X̂i−1Y, WT,Y)
9:   X̂iY,D = fD(X̂i−1Y, WD,Y)
10: end for
11: Z̃ = Concat(X̃R, X̃G, X̃B) + Concat(X̂²Y,T, X̂²Y,T, X̂²Y,T)
12: Z1 = R1(Z̃) = λ(W1 × Z̃ + b1)
13: Zi = λ(Wi × Ri−1(Zi−1) + bi), i = 2, 3, 4
14: ẐLast = RLast(Z4, WLast, bLast)
15: Return ẐLast

3.3.3. Loss function

To provide better visual quality of the reconstructed image, the training loss function of our approach combines the reconstruction loss Lrec, the adversarial loss Ladv and the transmission perceptual loss Lper. The reconstruction loss is the basic loss used in most CS methods; the adversarial loss further improves the visual quality, and the transmission perceptual loss improves the texture details of the reconstructed image.

The reconstruction loss aims to raise the PSNR between the reconstructed image and the plain image. We choose the l1 loss as the reconstruction loss, which is demonstrated to be easier to converge:

Lrec = (1/C) Σ_{i=1}^{C} ‖Xi − G(yi; Θg)‖1 (29)

where Θg represents the parameters of the generator network, which includes the parameters of the initial reconstruction network and the deep reconstruction network, ‖·‖1 is the l1 norm, and C represents the C pairs of training data.

The adversarial loss is demonstrated to be effective in improving the visual performance of the reconstructed image. In this paper, WGAN-GP (Wasserstein Generative Adversarial Nets with Gradient Penalty) is adopted, which provides better stability through a penalization of the gradient norm. This loss is defined as:

Ladv = −(1/C) Σ_{i=1}^{C} [D(G(Ws Xi; Θg); Θd)] (30)

LD = (1/C) Σ_{i=1}^{C} [D(G(Y); Θd)] − (1/C) Σ_{i=1}^{C} [D(X; Θd)] + λZ (1/C) Σ_{i=1}^{C} [(‖∇αZ D(G(Y) − αZ)‖2 − 1)²] (31)

where Θd is the parameters of the discriminator network, Ws is the parameters of the sampling network, and D(·) is the mapping process of the discriminator network.

Perceptual loss has been proved useful for improving visual quality and has already been used in the super-resolution domain. The key idea of perceptual loss is to enhance the similarity in feature space between the predicted image and the target image. Unlike super-resolution methods, we extract texture features from the gray image rather than the color image. Different from the existing perceptual loss using vgg19, our perceptual loss is a transferal perceptual loss which uses our texture feature extraction network as the texture feature extractor; some unpleasant artifact noise would be introduced if vgg19 were used as our perceptual loss. The perceptual loss is designed as:

Lper = Σ_{i=1}^{Ci} (1/(Hi Wi)) ‖ϕi(X) − ϕi(G(yi))‖2² (32)

where ϕi(·) denotes the texture feature map extracted at the i-th layer of the texture extraction network, and Hi and Wi are the width and height of the feature map at the i-th layer. This perceptual loss ensures that the predicted reconstructed image has texture features similar to the plain image, which makes our approach more effective in transmitting texture details.

The overall loss function of the reconstruction network is:

Lrecall = λrec Lrec + λadv Ladv + λper Lper (33)

where λrec, λadv and λper are the weight coefficients of the reconstruction loss, adversarial loss and perceptual loss respectively.
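A compact PyTorch sketch of the generator-side composite loss of Eq. (33) is shown below. The weight values are illustrative assumptions (the paper does not state them here), and the gradient-penalty term of Eq. (31) belongs to the discriminator update, so it is omitted from this generator loss.

import torch

def composite_loss(x, g_y, d_fake, feats_real, feats_fake,
                   w_rec=1.0, w_adv=1e-3, w_per=1e-2):
    # x: plain images; g_y: reconstructions G(y); d_fake: D scores of G(y);
    # feats_*: texture feature maps from the (frozen) texture extractor.
    l_rec = torch.mean(torch.abs(x - g_y))            # l1 loss, Eq. (29)
    l_adv = -torch.mean(d_fake)                       # WGAN generator term, Eq. (30)
    l_per = sum(torch.mean((fr - ff) ** 2)            # feature-space loss, Eq. (32)
                for fr, ff in zip(feats_real, feats_fake))
    return w_rec * l_rec + w_adv * l_adv + w_per * l_per

x, g_y = torch.rand(2, 3, 96, 96), torch.rand(2, 3, 96, 96)
loss = composite_loss(x, g_y,
                      d_fake=torch.rand(2, 1),
                      feats_real=[torch.rand(2, 32, 48, 48)],
                      feats_fake=[torch.rand(2, 32, 48, 48)])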


Table 1 than the optimized traditional algorithms DWT, TVAL3, MH and GSR.
The average PSNR and SSIM under different measurement rates on Set11 and Therefore, we only compare with the CS methods based on deep
BSD68 datasets of different methods. learning.
Dataset PSNR(dB)/SSIM In this part, we compare the image reconstruction quality and visual
Method MR = 0.25 MR = 0.1 MR = 0.04 MR = 0.01 performance with other comparison algorithms on gray images. Both
Set11 ReconNet+ 27.10 / 23.39 / 19.64 / 16.65 / subjective and objective evaluations are employed to evaluate the
0.821 0.698 0.535 0.372 reconstruction quality. The objective evaluation includes Peak Signal-to-
ISTA-Net 31.78 / 26.26 / 21.14 / 17.36 / Noise Ratio (PSNR) and Structural Similarity Index (SSIM) (Yao et al.,
0.916 0.796 0.596 0.408
2020). The PSNR formula is as follows:
DR2-Net 28.66 / 24.24 / 20.79 / 17.41 /
0.851 0.726 0.579 0.404
1 ∑m− 1 ∑
n− 1
SCGAN 20.32 / 11.45 / 18.05 / 13.11 / MSE = [I(i, j) − KK(i, j)]2 (34)
0.691 0.345 0.628 0.299 mn m=0 n=0
SCSNet –/– 28.51/ 24.29/ 21.04/
0.861 0.632 0.55 ( )
MAXI2
DPA-Net 31.47/0.923 26.99/ 21.50/ 18.05/ PSNR = 10log10 (35)
0.803 0.720 0.501 MSE
CSENMT 33.25/ 28.28/ 24.62/ 21.95/
0.942 0.860 0.712 0.616 where for image KK of size m × n, KK(i, j) represents the plain image
BSD68 ReconNet+ 26.26/0.732 23.35/ 20.93/ 18.46/ pixel, I(i, j) is the reconstructed image pixel. MAX is the maximum value
0.668 0.511 0.397 of the image pixel, MSE represents the mean square deviation. The
ISTA-Net 29.070.844 25.23/ 22.06/ 19.11/
0.695 0.54 0.41
higher PSNR is, the better the reconstructed image quality is.
DR2-Net 27.38/0.782 24.14/ 21.76/ 19.23/ SSIM value is calculated by:
0.702 0.535 0.401
SCGAN 25.91/0.696 12.23/ 20.24/ 13.44/ (2μx μy + c1 )(2σ xy + c2 )
SSIM(x, y) = (36)
0.373 0.526 0.392 (μ2x + μ2y + c1 )(σ 2x + σ2y + c2 )
SCSNet –/– 27.28/ 24.62/ 22.37/
0.77 0.656 0.523 SSIM is a number between 0 and 1. The larger SSIM value is, the
DPA-Net –/– 25.57/ 23.27/ –/– more similar the structure of the reconstructed image is to the plain
0.727 0.610
image.
CSENMT 30.44/ 26.83/ 24.12/ 22.33/
0.906 0.781 0.641 0.529 The test datasets of gray image are the same as the previous deep
learning reconstruction algorithms, which are Set11and BSD68. The
reconstruction results are analyzed at four measurement rates, including
4.2. Compression performance analysis 0.01, 0.04, 0.1 and 0.25. The results of the comparison algorithms
ReconNet+, ISTA-Net and SCSNet are obtained by running the pre-
4.2.1. Comparison on gray images training model published by the original author, and the results of
The CS reconstruction algorithms we compared include ReconNet+ DPA-Net are gotten from the original paper, while the results of SCGAN
(Lohit et al., 2018), ISTA-Net (Zhang & Ghanem, 2018), DR2-Net (Yao are generated by retraining with the author’s code.
et al., 2019), SCGAN (Sun et al., 2020a) SCSNet (Shi et al., 2019b) and Table 1 shows the average PSNR and SSIM of different methods on
DPA-Net (Sun et al., 2020b). These deep learning reconstruction algo­ Set 11 and BSD68 with different CS ratios of 0.25, 0.1, 0.04 and 0.01.
rithms succeed better image reconstruction quality than the traditional The best results are marked with bold font. From this table, our method
ones. Previous studies also demonstrate that the deep learning method gains the highest average PSNR at three sampling ratios on the Set11
achieves better reconstruction quality under each measurement rate

Fig. 8. Comparison of Barbara image reconstruction between CSENMT and other algorithms under 0.1 measurement rate.

10
Table 2
The average PSNR and SSIM (PSNR(dB)/SSIM) under different measurement rates on multiple color image datasets for different methods.

Dataset    Method      MR = 0.25      MR = 0.1       MR = 0.04      MR = 0.01
Set5       ReconNet+   30.646/0.890   27.97/0.812    24.283/0.676   19.96/0.494
           ISTA-Net    36.807/0.934   31.631/0.861   26.007/0.721   20.24/0.506
           SCGAN       28.38/0.740    15.31/0.286    26.53/0.722    21.32/0.421
           SCSNet      –/–            33.01/0.912    –/–            25.20/0.690
           Ours        41.17/0.972    34.98/0.935    30.70/0.869    25.91/0.726
Set14      ReconNet+   26.96/0.805    23.885/0.682   21.06/0.536    18.11/0.388
           ISTA-Net    31.01/0.874    26.41/0.742    22.25/0.570    18.46/0.403
           SCGAN       25.61/0.706    15.232/0.257   23.282/0.619   19.21/0.341
           SCSNet      –/–            28.43/0.832    –/–            21.89/0.529
           Ours        34.59/0.927    29.91/0.868    26.39/0.771    22.50/0.573
BSD100     ReconNet+   24.782/0.769   22.53/0.631    20.52/0.502    18.40/0.381
           ISTA-Net    29.00/0.843    25.14/0.686    22.01/0.529    19.11/0.394
           SCGAN       25.589/0.707   16.238/0.252   23.227/0.602   20.411/0.377
           SCSNet      –/–            25.37/0.748    –/–            22.26/0.501
           Ours        34.88/0.967    29.46/0.878    26.21/0.755    22.68/0.544
Sun80      ReconNet+   29.28/0.844    26.291/0.735   23.619/0.610   20.957/0.492
           ISTA-Net    32.10/0.895    27.845/0.770   24.453/0.634   21.15/0.503
           SCGAN       27.207/0.720   15.90/0.249    24.90/0.651    20.69/0.335
           SCSNet      –/–            30.01/–        –/–            24.41/–
           Ours        39.53/0.980    32.73/0.923    28.88/0.825    25.00/0.640
Urban100   ReconNet+   25.36/0.796    21.701/0.643   18.813/0.469   16.313/0.315
           ISTA-Net    29.29/0.899    23.58/0.731    19.45/0.510    16.39/0.329
           SCGAN       24.00/0.718    15.117/0.260   21.09/0.598    17.34/0.305
           SCSNet      –/–            25.337/0.820   –/–            19.13/0.442
           Ours        33.49/0.959    27.76/0.888    23.89/0.764    19.91/0.512

4.2.2. Comparison on color images
Besides gray image reconstruction, CSENMT can also achieve good reconstruction quality on color images. As the comparison algorithms are only suitable for gray images, we run their codes channel by channel to recover the color images. In this section, we also use PSNR and SSIM as quantitative indicators. Since SSIM is only applicable to gray images, we decompose the color images into the YCbCr color space and use SSIM to evaluate the reconstruction results on the Y channel. Since the previous deep learning CS algorithms were only tested on gray image datasets, we selected several color image datasets widely used in the domain of super-resolution as color comparison datasets, including Set5, Set14, BSD100, Sun80 and Urban100.
zontal line. From Fig. 10, it may be seen that the textures of the
Table 2 lists the PSNR and SSIM of CSENMT and the comparison algorithms, including ReconNet+, ISTA-Net, SCGAN and SCSNet; the best results are marked in bold. From the comparison results, CSENMT significantly outperforms the state-of-the-art CS methods on all five testing datasets at multiple measurement rates. With the increase of the measurement rate, the average PSNR of our method remains much higher than that of ISTA-Net, which means that our algorithm is clearly better than the other algorithms for color image reconstruction. At the measurement rate of 0.25, the average PSNR of our algorithm is 4.3 dB, 3.5 dB, 5.8 dB, 7.4 dB and 4.2 dB higher than that of ISTA-Net on the Set5, Set14, BSD100, Sun80 and Urban100 datasets, which indicates that our method achieves the best performance at high measurement rates. At the measurement rate of 0.1, compared with the optimal comparison method SCSNet, the PSNR gains are 1.9 dB, 1.5 dB, 4.1 dB, 2.7 dB and 2.4 dB on the five datasets, respectively. The PSNR and SSIM results on all datasets demonstrate the superiority of CSENMT over the other CS algorithms.

Fig. 9 shows the reconstruction results at 0.1 measurement rate compared with the other methods, including ReconNet+, ISTA-Net, SCGAN and SCSNet. As can be seen from this figure, the proposed algorithm achieves superior visual quality compared to the other four algorithms, and it can recover more textures. Even when the measurement rate is low, our method can still extract finer textures from local regions and transfer these textures into the reconstructed image to enhance the visual performance. Both the qualitative and quantitative evaluations clearly demonstrate the effectiveness of the proposed model.

To better show the advantages of our model in reconstructing detailed textures, we compared it with these methods by reconstructing four images with rich texture features from the Urban100 dataset. The results are displayed in Fig. 10. The four selected images contain texture information from different scenes, including a roof, a wall, floor tiles and horizontal lines. From Fig. 10, it may be seen that the textures recovered by the comparison algorithms are poor, especially when the texture details are rich. In contrast, the reconstruction results of our method have more ideal texture information and better visual quality, which further demonstrates the advantages of our algorithm in texture reconstruction.

4.2.3. Ablation experiment
In this subsection, we validate the effectiveness of the self-learning texture extraction module and the perceptual loss.

Self-learning texture extraction module: As discussed above, the texture features extracted from the gray image can effectively increase the reconstruction quality, and this improvement is reflected not only in gray images but also in color images. Fig. 11 displays the reconstruction results of color and gray images at 0.1 sampling ratio, where (a), (c) are the reconstruction results with the self-learning texture extraction module and (b), (d) are the reconstruction results without it. Obviously, for the Barbara image, the texture module can successfully recover the texture detail in the headscarf, and the module can produce sharper and more natural Baboon whiskers. In other words, the texture extraction module can effectively extract the texture features in the image and enhance the network's ability to recover the texture details.


Fig. 9. The results at 0.1 measurement rate on multiple datasets of different algorithms.

Fig. 10. The reconstruction results of our algorithm and the comparison algorithm under the measurement rate of 0.1 on the Urban100 dataset.

Fig. 11. Ablation experiment of self-learning texture extraction module.

To show the effectiveness of the texture extraction module proposed in this paper, Fig. 12 displays the reconstructed images of CSENMT and the corresponding texture images at the 0.1 measurement rate. From the left column, these are the Pepper, Flinstons and Img_087 reconstruction results and texture images. As seen from Fig. 12, the texture information of the three reconstructed images is successfully extracted by our texture extraction module. For color images, the module can also reconstruct the texture details with the textures transferred from the Y-channel texture information. Even when the amount of texture information increases, the texture extraction module can still capture the textures well. The results show that the proposed texture extraction module can effectively recover the details of the image.


Transmission perceptual loss: Since the prediction object of the perceptual loss in this paper is the Y-channel image, using the pre-trained VGG19 as the perceptual loss would extract unsatisfactory texture information, and the reconstruction results would contain artifact noise. In this section, we replace the transmission perceptual loss with the pre-trained VGG19 perceptual loss and name this variant CSENMT-v2. Fig. 13 shows the results of CSENMT and CSENMT-v2 at 0.25 measurement rate, where (a), (b) are the results of CSENMT, and (c), (d) are the reconstructed images of CSENMT-v2. It may be seen that there is some artifact noise in the edge areas of both the gray image and the color image, which reduces the visual quality of the results. In contrast, the reconstructed images of CSENMT present excellent reconstruction quality and texture features without artifact noise. These results indicate the superiority of the transmission perceptual loss.
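For reference, the following is a minimal sketch of the kind of pre-trained VGG19 perceptual loss used for the CSENMT-v2 baseline; it assumes PyTorch and torchvision, the truncation layer is an illustrative choice, and input normalization is left to the caller. It is not the proposed transmission perceptual loss, whose feature extractor is the texture extraction module.

```python
# Hypothetical VGG19 perceptual loss (baseline style, not the proposed loss).
import torch
import torch.nn as nn
from torchvision.models import vgg19, VGG19_Weights

class VGGPerceptualLoss(nn.Module):
    def __init__(self, cut: int = 18):  # features[:18] ends at relu3_4
        super().__init__()
        feats = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features[:cut]
        for p in feats.parameters():
            p.requires_grad = False     # the feature extractor stays frozen
        self.feats = feats.eval()
        self.mse = nn.MSELoss()

    def forward(self, reconstructed: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # Both inputs are N x 3 x H x W tensors (assumed already normalized).
        return self.mse(self.feats(reconstructed), self.feats(target))
```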

In addition, we test a large number of pictures from the gray image dataset BSD68 and the color image dataset Urban100 at 0.25 measurement rate. We arrange the PSNR of each image reconstructed by CSENMT and by CSENMT-v2 into data pairs and take each PSNR pair as a coordinate point. All data pairs are shown in Fig. 14. If the reconstruction PSNR value of CSENMT is higher than that of CSENMT-v2, the point lies above the diagonal line; otherwise it lies below the line. We can see that all the points are located above the diagonal line on both BSD68 and Urban100. Especially on the Urban100 dataset, the data points lie much higher than the line, verifying that the reconstruction PSNR values of CSENMT are much higher than those of CSENMT-v2. From Fig. 13 and Fig. 14, we can see that the transmission perceptual loss proposed in this paper plays a significant role in improving the reconstruction quality.

Fig. 12. Reconstruction image and texture image of CSENMT at 0.1 measurement rate.

4.3. Comparison with other CS encryption schemes

To further validate the advantages of CSENMT, we compare it with some CS encryption algorithms. Fig. 15 shows the reconstructed gray images of CSENMT and the comparison algorithms, where (a)-(d) are the results of the comparison methods and (e)-(h) are our results. From the comparison results, it is obvious that the reconstructed image quality of our encryption algorithm is significantly higher than that of the traditional algorithms when the measurement rate is low. Meanwhile, the texture information is richer in our reconstruction of the Baboon image, while the comparison image is blurred.

In addition, we select two 512 × 512 × 3 images, Lena512 and Baboon, as plain images and sample them at 0.25 measurement rate. The reconstruction results of our method and the comparison algorithms are shown in Fig. 16, where (a) and (d) are the reconstructed images of our method, (b) and (e) are the reconstructed results of (Chai et al., 2021), and (c) is the result of (Gan et al., 2021). Fig. 16 illustrates that the reconstruction quality of our algorithm is significantly higher than that of the comparison algorithms: CSENMT is superior in terms of both overall image quality and detail information. Table 3 lists the reconstruction PSNR values of our method and the comparison methods. From Table 3, we can see that the PSNR and SSIM values of CSENMT on all color test images are higher than those of the comparison algorithms. Even on the Baboon image, where the difference is smallest, the proposed CSENMT improves the PSNR by 2.56 dB compared to (Chai et al., 2021). On the Lena512 image, the PSNR value of our method reaches 33.33 dB, which is 7.9 dB and 9.3 dB higher than that of (Chai et al., 2021) and (Gan et al., 2021), respectively. The SSIM value of our method on Lena512 also reaches 0.862, showing better performance than the comparison algorithms. It can be observed from Fig. 16 and Table 3 that CSENMT has better reconstruction quality than these compressed sensing encryption methods.

Fig. 13. Reconstruction images of the CSENMT and CSENMT-v2 at measurement rate 0.25.

Fig. 14. Comparison of CSENMT and CSENMT-v2 at measurement rate 0.25 on the BSD68 (left sub-figure) and Urban100 (right sub-figure) datasets.


Fig. 15. Comparison with traditional CS encryption algorithms on gray image.

Fig. 16. Comparison with some compressed sensing encryption algorithms on color image.

Table 3
The PSNR/SSIM comparisons on the Lena512 and Baboon images.

Algorithm            Lena512         Baboon
(Chai et al., 2021)  25.46 dB/0.638  21.68 dB/–
(Gan et al., 2021)   24.05 dB/0.453  N/A
CSENMT               33.33 dB/0.862  24.24 dB/0.795

Table 4
Comparison of running time with each CS method (s).

Method     MR = 0.25  MR = 0.1  MR = 0.04  MR = 0.01
ReconNet+  0.081      0.078     0.075      0.072
ISTA-Net   0.054      0.051     0.054      0.045
SCGAN      0.21       0.18      0.153      0.15
SCSNet     –          0.31      –          0.399
SWDGAN     0.022      0.025     0.021      0.022
Ours       0.139      0.134     0.132      0.131

Table 5
Comparison results of key spaces with other algorithms.

Algorithm   Ours    (Wang et al., 2021c)  (Zhu et al., 2020)  (Patel et al., 2021)  (Zhou et al., 2020a)
Key space   >2^210  10^64                 2^197               10^128                2^200

4.4. Running efficiency

4.4.1. Comparison of running speed with deep CS methods
Table 4 presents the average running time on 256 × 256 × 3 images for our method and other deep learning-based CS methods. The comparison methods reconstruct images channel by channel, so their running time for color image reconstruction reported in this table is three times their average running time on a 256 × 256 gray image. The results of our algorithm are the average running time of jointly reconstructing color images of size 256 × 256 × 3. From Table 4, we can see that the running time of our method is close to 0.13 s, which is similar to the other deep learning-based methods. Meanwhile, our method obtains better reconstruction quality at a similar speed.
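As a rough illustration of this benchmarking protocol, the average per-image inference time can be measured as sketched below; this is an assumed setup, where `model` stands for any trained reconstruction network, and it times CPU inference only.

```python
# Hypothetical timing harness for the average per-image running time.
import time
import torch

@torch.no_grad()
def average_runtime(model: torch.nn.Module, n_trials: int = 100) -> float:
    model.eval()
    x = torch.rand(1, 3, 256, 256)   # one 256 x 256 x 3 test image
    model(x)                         # warm-up pass, excluded from timing
    start = time.perf_counter()
    for _ in range(n_trials):
        model(x)
    return (time.perf_counter() - start) / n_trials
```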


Fig. 17. Encryption and decryption time of Lena (left sub-figure) and Baboon (right sub-figure).

4.4.2. Comparison of running speed with traditional CS encryption methods
In this section, we compare the running times of encryption and decryption with several traditional CS encryption methods. Similar to Section 4.4.1, the test images of the comparison methods are gray images, and color images are recovered by channel-by-channel reconstruction. Therefore, the running time reported for each comparison method is three times its running time for a 256 × 256 gray image, while the time of our algorithm is still the average running time on a 256 × 256 × 3 image.

Fig. 17 shows the encryption and decryption times of CSENMT and several comparison algorithms. We can find that the encryption and decryption times of CSENMT and (Chai et al., 2022c) are approximately the same. One may also find that the encryption and decryption times of CSENMT on the two images are for the most part much less than those of the comparison algorithms. Especially on the Lena image, the encryption time of CSENMT is close to 0, while (Chai et al., 2018) and (Chai et al., 2020) need at least 1 s. The encryption time of (Luo et al., 2019) on the Lena image even exceeds 1.5 s, which is 75 times that of CSENMT. On the Baboon image, the decryption time of our method is less than 0.15 s, while the decryption times of (Luo et al., 2019) and (Chai et al., 2018) are 6 s and 2.8 s, respectively. This is because the traditional encryption algorithms solve the compressed sensing reconstruction problem through complex iterative operations, which take a lot of time. In contrast, our algorithm spends most of its time in the training process; during sampling and reconstruction, we can directly run the trained model to obtain the desired results, which greatly shortens the running time. Therefore, the running efficiency of the proposed CSENMT is much better than that of the traditional CS encryption methods.

4.5. Security analysis

4.5.1. Key space analysis
A large key space can effectively increase the complexity of the encryption system and resist brute-force attacks. The keys of our encryption algorithm include: (1) the external keys t1, t2, t3 and t4; (2) the initial setting parameter k. When the computing precision of the computer is 10^-14, the key space of the key t1 is 10^14. Similarly, t2, t3, t4 and k each have the same key space as t1. Therefore, the key space of our algorithm is (10^14)^5 = 10^70 > 2^210, which is much larger than the theoretically secure key space of 2^100.
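This bound is easy to verify in log2 units, as the short check below illustrates:

```python
# Sanity check of the key-space bound: (10**14)**5 = 10**70, expressed in bits.
import math

key_space_bits = 5 * 14 * math.log2(10)  # log2(10**70) ≈ 232.5 bits
print(key_space_bits > 210)              # True: 10**70 > 2**210
print(key_space_bits > 100)              # True: well above the 2**100 threshold
```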

Fig. 18. Histogram results of different plain images and their cipher images.


Fig. 19. Correlation of the adjacent pixel of plain and encrypted images.

The comparison of the key space of our scheme with those of other encryption algorithms is listed in Table 5. From Table 5, we can see that the key space of the proposed algorithm is larger than those in (Zhu et al., 2020) and (Zhou et al., 2020a). According to the results, our proposed encryption algorithm has a large enough key space to resist any exhaustive attack. Besides, the network parameters of our network could also be used as part of the decryption key, which would make the key space even larger. However, considering that the existing key is already sufficient to resist brute-force attacks, we do not use the network parameters as keys, which reduces the network transmission pressure and the complexity of key management.

4.5.2. Histogram analysis
The histogram distribution of the encrypted image is one of the basic standards for evaluating the security and effectiveness of an encryption system. The histograms of cipher images should present a similar distribution even if the plain images are different, and the histogram of a cipher image should be significantly different from that of its plain image.

Fig. 18 shows the histograms of the plain and cipher images of three 512 × 512 × 3 color images. Among them, (a)-(c) are the histograms of the plain images, and (d)-(f) are the histograms of the corresponding cipher images at the measurement rate of 0.25. From Fig. 18, it may be seen that the histogram of each plain image has an obvious pixel distribution, with clear differences among the three RGB channels. In contrast, the histograms of the cipher images are almost identical, and the pixel distributions of the three color channels nearly coincide; in particular, the distributions of the three channels are almost hidden within the histogram of the R channel. This result shows that the proposed encryption method can hide the pixel distribution of the plain image, resist statistical analysis attacks, and address the energy leakage issue well.
4.5.3. Correlation analysis of adjacent pixels
The correlation between adjacent pixels can be evaluated by calculating the correlation coefficient. For a plain image, there is always a strong correlation between adjacent pixels, and the correlation coefficient is usually close to 1. The correlation coefficient of the encrypted image is expected to be close to 0, indicating that the correlation between adjacent pixels has been largely eliminated. The correlation coefficient of adjacent pixels is calculated as

$$ r_{x,y} = \frac{E\left[(x - E(x))(y - E(y))\right]}{\sqrt{D(x)\,D(y)}} \tag{37} $$

where $E(x) = \frac{1}{N}\sum_{i=1}^{N} x_i$ and $D(x) = \frac{1}{N}\sum_{i=1}^{N} (x_i - E(x))^2$ are the mean and variance of $x$, respectively, $r_{x,y}$ represents the correlation coefficient between pixels $x$ and $y$, and $N$ is the total number of randomly selected pixels.

In order to assess the ability of the proposed algorithm to resist statistical attacks, we analyze the adjacent-pixel distributions of gray images and color images.
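As an illustration, Eq. (37) can be evaluated on randomly sampled adjacent pixel pairs with a few lines of NumPy; this is an assumed re-implementation of the test, not our experiment code.

```python
# Hypothetical adjacent-pixel correlation test implementing Eq. (37).
import numpy as np

def adjacent_correlation(img: np.ndarray, direction: str = "horizontal",
                         n_pairs: int = 5000, seed: int = 0) -> float:
    """r_{x,y} over n_pairs random adjacent pixels of a gray image/channel."""
    rng = np.random.default_rng(seed)
    h, w = img.shape[:2]
    di, dj = {"horizontal": (0, 1), "vertical": (1, 0), "diagonal": (1, 1)}[direction]
    rows = rng.integers(0, h - di, n_pairs)
    cols = rng.integers(0, w - dj, n_pairs)
    x = img[rows, cols].astype(np.float64)
    y = img[rows + di, cols + dj].astype(np.float64)
    # r = E[(x - E(x))(y - E(y))] / sqrt(D(x) D(y)), cf. Eq. (37)
    return float(np.mean((x - x.mean()) * (y - y.mean()))
                 / np.sqrt(x.var() * y.var()))
```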
Adjacent pixel correlation of gray images. The pixel distributions of the plain images Barbara and Parrots, of size 256 × 256, and of their cipher images are shown in Fig. 19, where (a), (b) are the adjacent-pixel correlations of Barbara's plain image and cipher image, respectively, and (c), (d) represent the adjacent-pixel correlations of Parrots' plain image and cipher image correspondingly. It can be seen from Fig. 19 that the adjacent pixels of the plain images have strong correlation in the horizontal, vertical and diagonal directions, while the correlation in the corresponding cipher images is weak.

Table 6 illustrates the correlation coefficients of our method and the comparison encryption methods. As the results indicate, the correlation of the plain images is strong, almost always greater than 0.90. However, the correlations of our encrypted images are extremely close to 0, which proves the ability of CSENMT to destroy the correlation of plain images. Compared with the other encryption methods, the correlation of CSENMT is comparable, and it achieves a good scrambling effect.

Table 6
Comparison with other algorithms on correlation coefficients of gray images.

Image       Algorithm              Horizontal  Vertical   Diagonal
Lena        Plain image            0.9697      0.9438     0.9187
            CSENMT                 0.0003      −0.0136    0.0024
            (Luo et al., 2019)     0.0069      −0.0028    −0.0047
            (Wang et al., 2021c)   −0.0135     0.0227     0.0037
            (Zhang, 2021)          0.0134      0.0134     0.0134
            (Wang et al., 2023)    0.0130      0.0141     −0.0258
            (Yang et al., 2023)    −0.0107     −0.0079    −0.0014
            (Haider et al., 2023)  0.0012      0.0006     0.0021
            (Ye et al., 2022)      −0.0319     −0.0140    0.0065
House       Plain image            0.9639      0.9797     0.9493
            CSENMT                 0.0077      −0.0046    −0.0014
            (Luo et al., 2019)     0.0065      0.0072     −0.0044
            (Wang et al., 2023c)   −0.0209     −0.0176    0.0168
            (Yang et al., 2023)    −0.0086     −0.0123    0.0022
            (Zhou et al., 2023)    −0.0009     0.0007     0.0008
Cameraman   Plain image            0.9596      0.9391     0.9185
            CSENMT                 −0.0024     0.03017    −0.0038
            (Luo et al., 2019)     −0.0044     −0.0054    0.0025
            (Gao et al., 2022)     0.0014      0.0039     0.0098
            (Ye et al., 2022)      −0.0130     −0.0120    −0.0076
            (Wang & Du, 2022)      −0.0014     −0.0046    −0.0032
Peppers     Plain image            0.952       0.9436     0.916
            CSENMT                 0.001       −0.0089    0.0165
            (Luo et al., 2019)     0.0074      0.0035     0.0041
            (Wang et al., 2023)    0.0069      0.0259     −0.0060
            (Zhang et al., 2023)   −0.0562     −0.0106    −0.0166
            (Chen et al., 2022)    0.0101      0.0196     −0.0013

Adjacent pixel correlation of color images. Since our algorithm is also applicable to color images, the adjacent-pixel distributions of two color plain images and their cipher images are presented in Fig. 20. Sub-images (a), (b), (c), (d) are the Baboon plain image and its adjacent-pixel distributions in three directions on the RGB components, and (e), (f), (g), (h) are the corresponding cipher image and its adjacent-pixel distributions on the RGB components. Similarly, (i)-(p) are the plain image, cipher image and pixel distributions of Lena512. It is observed that the adjacent-pixel distributions in the three directions of the plain images are close to an inclined straight line on the RGB channels, whereas the pixel distributions of the cipher images show weak correlation, which demonstrates the security of our encryption method.


Fig. 20. Adjacent pixel correlation of different plain color images and encryption results.

Furthermore, Table 7 lists the correlation coefficients of adjacent pixels in three directions. It may be found that the correlation coefficients of the color plain images are greater than 0.9, which shows the strong correlation of adjacent pixels. By contrast, CSENMT yields correlation coefficients of the encrypted images comparable to those of the existing advanced encryption algorithms, which means weak correlation. This shows the high security level of CSENMT.

Table 7
Correlation coefficients of different color images and corresponding encrypted images.

Image    Algorithm              Channel  Horizontal  Vertical    Diagonal
Lena512  Plain image            R        0.9854      0.9746      0.9648
                                G        0.9798      0.9675      0.9498
                                B        0.9555      0.9324      0.9148
         CSENMT                 R        −0.0065     0.0056      −0.0094
                                G        0.0015      0.0072      0.002
                                B        −0.003      −0.0142     0.0055
         (Xue & Zi, 2020)       R        0.0092      0.0203      −0.0073
                                G        0.0002      −0.0025     −0.0131
                                B        0.0076      0.0006      0.0111
         (Zhou et al., 2020b)   R        0.0083      −0.0049     −0.0095
                                G        −0.0054     0.01        −0.0017
                                B        −0.0010     0.0124      −0.0042
         (Zhang et al., 2020b)  R        0.001365    0.004776    0.000232
                                G        0.003294    −0.000579   0.004807
                                B        0.00206     0.000194    −0.004043
         (Xin et al., 2023)     R        0.0006      −0.0012     0.0008
                                G        −0.0004     −0.0007     0.0007
                                B        0.0001      0.0005      0.0006
Peppers  Plain image            R        0.9668      0.9595      0.955
                                G        0.9826      0.9818      0.9698
                                B        0.9714      0.96351     0.9472
         CSENMT                 R        0.0064      −0.0252     0.0018
                                G        −0.006      0.0206      0.0052
                                B        −0.0228     0.0004      0.001
         (Xue & Zi, 2020)       R        0.0049      0.0006      0.0051
                                G        0.0075      −0.0012     0.0343
                                B        0.0182      0.0025      0.0074
         (Xin et al., 2023)     R        −0.0005     0.0004      0.0007
                                G        −0.0004     0.0002      0.0004
                                B        −0.0006     0.0003      0.0006
Baboon   Plain image            R        0.9019      0.9409      0.888
                                G        0.8359      0.8878      0.7933
                                B        0.91132     0.926913    0.8748
         CSENMT                 R        0.0062      0.0053      −0.0026
                                G        0.0058      0.0139      0.0217
                                B        −0.007      −0.0129     0.0031
         (Zhou et al., 2020b)   R        −0.0023     0.0014      0.0155
                                G        −0.0115     −0.0178     0.0044
                                B        0.0066      −0.0089     −0.0132
         (Zhang et al., 2020b)  R        0.0013      0.0046      0.0003
                                G        −0.0081     0.0008      0.0053
                                B        −0.0088     0.0001      0.0017
         (Li et al., 2023)      R        0.0002      0.0012      0.0003
                                G        0.0013      −0.0008     −0.0011
                                B        −0.0019     −0.0015     −0.0010

4.5.4. Effectiveness of APCP
In order to prove the effectiveness of APCP compared to channel-by-channel scrambling, we investigate the performance of APCP in terms of scrambling degree. In addition to showing the scrambled images of some color images, we introduce the gray difference degree (GDD) to quantify the gray difference between the plain image and the cipher image. The value of GDD lies between −1 and 1; the closer the GDD of the cipher image is to 1, the better the scrambling effect.
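For completeness, the sketch below shows one way to compute GDD, assuming the definition commonly used in the scrambling literature (the formula itself is not restated in this section): GDD = (E' − E)/(E' + E), where E and E' are the mean 4-neighborhood squared gray differences of the plain and scrambled images.

```python
# Hypothetical GDD computation under the assumed common definition; np.roll
# wraps at the borders, a simplification of the usual boundary handling.
import numpy as np

def mean_gray_difference(img: np.ndarray) -> float:
    g = img.astype(np.float64)
    diffs = [(g - np.roll(g, (di, dj), axis=(0, 1))) ** 2
             for di, dj in ((0, 1), (0, -1), (1, 0), (-1, 0))]
    return float(np.mean(np.mean(diffs, axis=0)))

def gdd(plain: np.ndarray, scrambled: np.ndarray) -> float:
    e, e_s = mean_gray_difference(plain), mean_gray_difference(scrambled)
    return (e_s - e) / (e_s + e)
```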


Fig. 21 shows the scrambling results of APCP and of channel-by-channel scrambling. It is obvious from Fig. 21 that the color distribution of the RGB channels can still be seen in the cipher image produced by channel-by-channel scrambling. Because APCP adds an inter-channel permutation, the pixels between the channels of the color image are fully scrambled, and the visual appearance of the cipher image is more uniform than that of channel-by-channel scrambling. Meanwhile, the three cipher images are almost the same, which indicates that APCP can mask the distributions of different cipher images. The GDD results of the two scrambling methods are shown in Fig. 22. The GDD values of APCP are higher than those of channel-by-channel scrambling; especially for the Lena image, the GDD value increases from 0.93 to 0.96, which shows the effectiveness of APCP. The experimental results show that the permutation effect of the adaptive scrambling proposed in this paper is better than that of the channel-by-channel scrambling methods.

Fig. 21. Scrambling results of channel-by-channel scrambling and APCP.

Fig. 22. Test results of GDD.
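To make the difference between the two permutation strategies concrete, the toy sketch below contrasts channel-by-channel shuffling with a cross-channel permutation. It uses a random permutation purely for illustration; APCP itself derives its permutation from the chaotic system and the plain image.

```python
# Illustrative contrast of the two scrambling strategies (not APCP itself).
import numpy as np

rng = np.random.default_rng(42)

def permute_per_channel(img: np.ndarray) -> np.ndarray:
    out = np.empty_like(img)
    for c in range(img.shape[2]):      # each channel shuffled on its own:
        flat = img[..., c].ravel()     # R stays R, G stays G, B stays B
        out[..., c] = rng.permutation(flat).reshape(img.shape[:2])
    return out

def permute_cross_channel(img: np.ndarray) -> np.ndarray:
    flat = rng.permutation(img.ravel())  # pixels may move between channels
    return flat.reshape(img.shape)
```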
5. Conclusion

Based on compressed sensing and a chaotic system, a deep image compressed sensing encryption network using multi-color space and texture features is proposed. CSENMT takes full advantage of the differences and inter-correlations between the RGB channels and of the texture information of the Y channel, which effectively improves the reconstruction quality of both color and gray images. Meanwhile, the proposed texture extraction module can focus on extracting the texture features of the Y-channel image and improve the texture detail quality of the gray and color reconstructed images. The transmission perceptual loss computes its features by means of the proposed texture extraction module, which effectively reduces the artifacts and noise that may be generated by the traditional perceptual loss. In addition, CSENMT is suitable for both channel-by-channel CS and cross-channel CS. The APCP realizes an inter-channel permutation instead of a channel-by-channel permutation, so the scrambling degree of the color image is improved. Moreover, the plain information is used to generate the chaotic sequence, so that our algorithm is highly related to the plain image and may resist known-plaintext and chosen-plaintext attacks. The experimental results illustrate that the proposed image encryption method achieves satisfactory results in security, reconstruction accuracy, visual effect and operation efficiency on both gray images and color images.
As for the disadvantages, CSENMT uses a sampling network to obtain the measurements, while most traditional CS algorithms still use a conventional measurement matrix, such as a random Gaussian matrix or a Bernoulli matrix. Therefore, designing a blind reconstruction network that is suitable for the reconstruction tasks of different measurement matrices, rather than being limited to the adaptive measurement matrix, is also a topic worthy of research.

Conflict of interest

The authors declare that they have no conflict of interest.

CRediT authorship contribution statement

Xiuli Chai: Data curation, Validation, Writing – review & editing. Shiping Song: Data curation, Writing – original draft. Zhihua Gan: Conceptualization, Methodology, Software. Guoqiang Long: Data curation, Writing – review & editing. Ye Tian: Data curation, Writing – original draft. Xin He: Writing – review & editing.


Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

Acknowledgments

All the authors are deeply grateful to the editors for smooth and fast handling of the manuscript. The authors would also like to thank the anonymous referees for their valuable suggestions to improve the quality of this paper. This work is supported by the National Natural Science Foundation of China (Grant Nos. 61802111, 61872125), the Science and Technology Project of Henan Province (Grant Nos. 232102210109, 232102210096, 232102211089, 232102211056), the Open Foundation of Henan Key Laboratory of Cyberspace Situation Awareness (Grant No. HNTS2022019), the Pre-research Project of SongShan Laboratory (Grant No. YYJC012022011), the Key Scientific Research Projects of Colleges and Universities of Henan Province (Grant Nos. 23A520011, 24A520003) and the Graduate Talent Program of Henan University (Grant Nos. SYLYC2022193 and SYLAL2023020).

References

Agustsson, E., & Timofte, R. (2017). Ntire 2017 challenge on single image super-resolution: Dataset and study. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 126–135. https://doi.org/10.1109/CVPRW.2017.150
Bevilacqua, M., Roumy, A., Guillemot, C., & Alberi-Morel, M. L. (2012). Low-complexity single-image super-resolution based on nonnegative neighbor embedding. Proceedings of the 23rd British Machine Vision Conference (BMVC), 135.1-135.10. https://doi.org/10.5244/C.26.135
Cambareri, V., Mangia, M., Pareschi, F., Rovatti, R., & Setti, G. (2015). Low-complexity multiclass encryption by compressed sensing. IEEE Transactions on Signal Processing, 63(9), 2183–2195. https://doi.org/10.1109/TSP.2015.2407315
Candès, E. J. (2006). Compressive sampling. Proceedings of the International Congress of Mathematicians, 3, 1433–1452.
Candès, E. J., & Tao, T. (2006). Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Transactions on Information Theory, 52(12), 5406–5425. https://doi.org/10.1109/TIT.2006.885507
Chai, X., Zheng, X., Gan, Z., Han, D., & Chen, Y. (2018). An image encryption algorithm based on chaotic system and compressive sensing. Signal Processing, 148, 124–144. https://doi.org/10.1016/j.sigpro.2018.02.007
Chai, X., Bi, J., Gan, Z., Liu, X., Zhang, Y., & Chen, Y. (2020). Color image compression and encryption scheme based on compressive sensing and double random encryption strategy. Signal Processing, 176, Article 107684. https://doi.org/10.1016/j.sigpro.2020.107684
Chai, X., Wu, H., Gan, Z., Han, D., Zhang, Y., & Chen, Y. (2021). An efficient approach for encrypting double color images into a visually meaningful cipher image using 2D compressive sensing. Information Sciences, 556, 305–340. https://doi.org/10.1016/j.ins.2020.10.007
Chai, X., Wang, Y., Gan, Z., Chen, X., & Zhang, Y. (2022). Preserving privacy while revealing thumbnail for content-based encrypted image retrieval in the cloud. Information Sciences, 604, 115–141. https://doi.org/10.1016/j.ins.2022.05.008
Chai, X., Wang, Y., Chen, X., Gan, Z., & Zhang, Y. (2022). TPE-GAN: Thumbnail Preserving Encryption Based on GAN With Key. IEEE Signal Processing Letters, 29, 972–976. https://doi.org/10.1109/LSP.2022.3163685
Chai, X., Tian, Y., Gan, Z., Lu, Y., Wu, X., & Long, G. (2022). A robust compressed sensing image encryption algorithm based on GAN and CNN. Journal of Modern Optics, 69(2), 103–120. https://doi.org/10.1080/09500340.2021.2002450
Chen, H., Bai, E., Jiang, X., & Wu, Y. (2022). A Fast Image Encryption Algorithm Based on Improved 6-D Hyper-Chaotic System. IEEE Access, 10, 116031–116044. https://doi.org/10.1109/ACCESS.2022.3218668
Gan, Z., Bi, J., Ding, W., & Chai, X. (2021). Exploiting 2D compressed sensing and information entropy for secure color image compression and encryption. Neural Computing and Applications, 33, 12845–12867. https://doi.org/10.1007/s00521-021-05937-4
Gao, X., Mou, J., Xiong, L., Sha, Y., Yan, H., & Cao, Y. (2022). A fast and efficient multiple images encryption based on single-channel encryption and chaotic system. Nonlinear Dynamics, 108(1), 613–636. https://doi.org/10.1007/s11071-021-07192-7
Haider, M. I., Shah, T., Ali, A., Shah, D., & Khalid, I. (2023). An Innovative approach towards image encryption by using novel PRNs and S-boxes Modeling techniques. Mathematics and Computers in Simulation, 209, 153–168. https://doi.org/10.1016/j.matcom.2023.01.036
Hu, Y., Zhang, X., Chen, D., Yan, Z., Shen, X., Yan, G., et al. (2022). Spatiotemporal Flexible Sparse Reconstruction for Rapid Dynamic Contrast-Enhanced MRI. IEEE Transactions on Biomedical Engineering, 69(1), 229–243. https://doi.org/10.1109/TBME.2021.3091881
Huang, J., Singh, A., & Ahuja, N. (2015). Single image super-resolution from transformed self-exemplars. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5197–5206. https://doi.org/10.1109/cvpr.2015.7299156
Johnson, J., Alahi, A., & Fei-Fei, L. (2016). Perceptual losses for real-time style transfer and super-resolution. Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part II 14, 694–711. https://doi.org/10.1007/978-3-319-46475-6_43
Kulkarni, K., Lohit, S., Turaga, P., Kerviche, R., & Ashok, A. (2016). Reconnet: Non-iterative reconstruction of images from compressively sensed measurements. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 449–458. https://doi.org/10.1109/CVPR.2016.55
Li, D., Li, J., Di, X., & Li, B. (2023). Design of cross-plane colour image encryption based on a new 2D chaotic map and combination of ECIES framework. Nonlinear Dynamics, 111(3), 2917–2942. https://doi.org/10.1007/s11071-022-07949-8
Lohit, S., Kulkarni, K., Kerviche, R., Turaga, P., & Ashok, A. (2018). Convolutional neural networks for noniterative reconstruction of compressively sensed images. IEEE Transactions on Computational Imaging, 4(3), 326–340. https://doi.org/10.1109/TCI.2018.2846413
Lu, L., Li, W., Tao, X., Lu, J., & Jia, J. (2021). Masa-sr: Matching acceleration and spatial adaptation for reference-based image super-resolution. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6364–6373. https://doi.org/10.1109/CVPR46437.2021.00630
Luo, Y., Lin, J., Liu, J., Wei, D., Cao, L., Zhou, R., et al. (2019). A robust image encryption algorithm based on Chua's circuit and compressive sensing. Signal Processing, 161, 227–247. https://doi.org/10.1016/j.sigpro.2019.03.022
Martin, D., Fowlkes, C., Tal, D., & Malik, J. (2001). A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, 2, 416–423 vol.2. https://doi.org/10.1109/ICCV.2001.937655
Mun, S., & Fowler, J. E. (2009). Block compressed sensing of images using directional transforms. 2009 16th IEEE International Conference on Image Processing (ICIP), 3021–3024. https://doi.org/10.1109/ICIP.2009.5414429
Nagesh, P., & Li, B. (2009). Compressive imaging of color images. 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 1261–1264. https://doi.org/10.1109/ICASSP.2009.4959820
Njitacke, Z. T., Tsafack, N., Ramakrishnan, B., Rajagopal, K., Kengne, J., & Awrejcewicz, J. (2021). Complex dynamics from heterogeneous coupling and electromagnetic effect on two neurons: Application in images encryption. Chaos, Solitons & Fractals, 153, Article 111577. https://doi.org/10.1016/j.chaos.2021.111577
Patel, S., Thanikaiselvan, V., Pelusi, D., Nagaraj, B., Arunkumar, R., & Amirtharajan, R. (2021). Colour image encryption based on customized neural network and DNA encoding. Neural Computing and Applications, 33(21), 14533–14550. https://doi.org/10.1007/s00521-021-06096-2
Ren, Y., Ren, H., Shi, C., Zhang, X., Wu, X., Li, X., et al. (2022). Multistage semantic-aware image inpainting with stacked generator networks. International Journal of Intelligent Systems, 37(2), 1599–1617. https://doi.org/10.1002/int.22687
Shi, W., Jiang, F., Zhang, S., & Zhao, D. (2017). Deep networks for compressed image sensing. IEEE International Conference on Multimedia and Expo (ICME), 2017, 877–882. https://doi.org/10.1109/ICME.2017.8019428
Shi, W., Jiang, F., Liu, S., & Zhao, D. (2019). Image compressed sensing using convolutional neural network. IEEE Transactions on Image Processing, 29, 375–388. https://doi.org/10.1109/TIP.2019.2928136
Shi, W., Jiang, F., & Liu, S. (2019). Scalable convolutional neural network for image compressed sensing. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 12290-12299. https://doi.org/10.1109/CVPR.2019.01257
Su, Q., Sun, Y., Zhang, X., Wang, H., Wang, G., & Yao, T. (2022). A watermarking scheme for dual-color images based on URV decomposition and image correction. International Journal of Intelligent Systems, 37(10), 7548–7570. https://doi.org/10.1002/int.22893
Su, Y., & Lian, Q. (2020). iPiano-Net: Nonconvex optimization inspired multi-scale reconstruction network for compressed sensing. Signal Processing: Image Communication, 89, Article 115989. https://doi.org/10.1016/j.image.2020.115989
Sun, L., & Hays, J. (2012). Super-resolution from internet-scale scene matching. IEEE International Conference on Computational Photography (ICCP), 2012, 1–12. https://doi.org/10.1109/ICCPhot.2012.6215221
Sun, Y., Chen, J., Liu, Q., & Liu, G. (2020). Learning image compressed sensing with sub-pixel convolutional generative adversarial network. Pattern Recognition, 98, Article 107051. https://doi.org/10.1016/j.patcog.2019.107051
Sun, Y., Chen, J., Liu, Q., Liu, B., & Guo, G. (2020). Dual-path attention network for compressed sensing image reconstruction. IEEE Transactions on Image Processing, 29, 9482–9495. https://doi.org/10.1109/TIP.2020.3023629
Tian, Y., Chai, X., Gan, Z., Lu, Y., Zhang, Y., & Song, S. (2021). SWDGAN: GAN-based sampling and whole image denoising network for compressed sensing image reconstruction. Journal of Electronic Imaging, 30(6), 63017. https://doi.org/10.1117/1.JEI.30.6.063017
Wang, J., Wang, W., & Chen, J. (2022). Adaptive Rate Block Compressive Sensing Based on Statistical Characteristics Estimation. IEEE Transactions on Image Processing, 31, 734–747. https://doi.org/10.1109/TIP.2021.3135476
Wang, K., Wu, X., & Gao, T. (2021). Double color images compression–encryption via compressive sensing. Neural Computing and Applications, 33(19), 12755–12776. https://doi.org/10.1007/s00521-021-05921-y
Wang, M., Xiao, D., & Xiang, Y. (2021). Low-Cost and Confidentiality-Preserving Multi-Image Compressed Acquisition and Separate Reconstruction for Internet of Multimedia Things. IEEE Internet of Things Journal, 8(3), 1662–1673. https://doi.org/10.1109/JIOT.2020.3015237
Wang, X. Y., Ren, Q., & Jiang, D. H. (2021). An adjustable visual image cryptosystem based on 6D hyperchaotic system and compressive sensing. Nonlinear Dynamics, 104(4), 4543–4567. https://doi.org/10.1007/s11071-021-06488-y
Wang, X., & Du, X. (2022). Pixel-level and bit-level image encryption method based on Logistic-Chebyshev dynamic coupled map lattices. Chaos, Solitons & Fractals, 155, Article 111629. https://doi.org/10.1016/j.chaos.2021.111629
Wang, X., Zhao, M., Feng, S., & Chen, X. (2023). An image encryption scheme using bit-plane cross-diffusion and spatiotemporal chaos system with nonlinear perturbation. Soft Computing, 27(3), 1223–1240. https://doi.org/10.1007/s00500-022-07706-4
Xin, J., Hu, H., & Zheng, J. (2023). 3D variable-structure chaotic system and its application in color image encryption with new Rubik's Cube-like permutation. Nonlinear Dynamics, 111(8), 7859–7882. https://doi.org/10.1007/s11071-023-08230-2
Xue, K., & Zi, G. (2020). A new color image encryption scheme based on DNA encoding and spatiotemporal chaotic system. Signal Processing: Image Communication, 80, Article 115670. https://doi.org/10.1016/j.image.2019.115670
Yang, Y., Wang, B., Zhou, Y., Shi, W., & Liao, X. (2023). Efficient color image encryption by color-grayscale conversion based on steganography. Multimedia Tools and Applications, 82(7), 10835–10866. https://doi.org/10.1007/s11042-022-13689-z
Yao, H., Dai, F., Zhang, S., Zhang, Y., Tian, Q., & Xu, C. (2019). Dr2-net: Deep residual reconstruction network for image compressive sensing. Neurocomputing, 359, 483–493. https://doi.org/10.1016/j.neucom.2019.05.006
Yao, X., Wu, Q., Zhang, P., & Bao, F. (2020). Weighted adaptive image super-resolution scheme based on local fractal feature and image roughness. IEEE Transactions on Multimedia, 23, 1426–1441. https://doi.org/10.1109/TMM.2020.2997126
Ye, G., Pan, C., Dong, Y., Shi, Y., & Huang, X. (2020). Image encryption and hiding algorithm based on compressive sensing and random numbers insertion. Signal Processing, 172, Article 107563. https://doi.org/10.1016/j.sigpro.2020.107563
Ye, G., Wu, H., Liu, M., & Shi, Y. (2022). Image encryption scheme based on blind signature and an improved Lorenz system. Expert Systems with Applications, 205, Article 117709. https://doi.org/10.1016/j.eswa.2022.117709
Zeyde, R., Elad, M., & Protter, M. (2012). On Single Image Scale-Up Using Sparse-Representations. In: Boissonnat, J.-D., et al. (Eds.), Curves and Surfaces 2010. Lecture Notes in Computer Science, 6920, 711–730. https://doi.org/10.1007/978-3-642-27413-8_47
Zhang, B., Xiao, D., & Xiang, Y. (2021). Robust Coding of Encrypted Images via 2D Compressed Sensing. IEEE Transactions on Multimedia, 23, 2656–2671. https://doi.org/10.1109/TMM.2020.3014489
Zhang, J., Zhao, D., & Gao, W. (2014). Group-Based Sparse Representation for Image Restoration. IEEE Transactions on Image Processing, 23(8), 3336–3351. https://doi.org/10.1109/TIP.2014.2323127
Zhang, J., & Ghanem, B. (2018). ISTA-Net: Interpretable optimization-inspired deep network for image compressive sensing. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1828–1837.
Zhang, Y., Zhang, L. Y., Zhou, J., Liu, L., Chen, F., & He, X. (2016). A review of compressive sensing in information security field. IEEE Access, 4, 2507–2519. https://doi.org/10.1109/ACCESS.2016.2569421
Zhang, Y., Wang, P., Huang, H., Zhu, Y., Xiao, D., & Xiang, Y. (2020). Privacy-assured FogCS: Chaotic compressive sensing for secure industrial big image data processing in fog computing. IEEE Transactions on Industrial Informatics, 17(5), 3401–3411. https://doi.org/10.1109/TII.2020.3008914
Zhang, Y., He, Y., Li, P., & Wang, X. (2020). A new color image encryption scheme based on 2DNLCML system and genetic operations. Optics and Lasers in Engineering, 128, Article 106040. https://doi.org/10.1016/j.optlaseng.2020.106040
Zhang, Y. (2021). A new unified image encryption algorithm based on a lifting transformation and chaos. Information Sciences, 547, 307–327. https://doi.org/10.1016/j.ins.2020.07.058
Zhang, Y., Chen, A., & Chen, W. (2023). The unified image cryptography algorithm based on finite group. Expert Systems with Applications, 212, Article 118655. https://doi.org/10.1016/j.eswa.2022.118655
Zhou, J., Zhou, N., & Gong, L. (2020). Fast color image encryption scheme based on 3D orthogonal Latin squares and matching matrix. Optics & Laser Technology, 131, Article 106437. https://doi.org/10.1016/j.optlastec.2020.106437
Zhou, K., Fan, J., Fan, H., & Li, M. (2020). Secure image encryption scheme using double random-phase encoding and compressed sensing. Optics & Laser Technology, 121, Article 105769. https://doi.org/10.1016/j.optlastec.2019.105769
Zhou, S., Wang, X., & Zhang, Y. (2023). Novel image encryption scheme based on chaotic signals with finite-precision error. Information Sciences, 621, 782–798. https://doi.org/10.1016/j.ins.2022.11.104
Zhu, L., Song, H., Zhang, X., Yan, M., Zhang, T., Wang, X., et al. (2020). A robust meaningful image encryption scheme based on block compressive sensing and SVD embedding. Signal Processing, 175, Article 107629. https://doi.org/10.1016/j.sigpro.2020.107629