2024_MAN–C_A Masked Autoencoder Neural Cryptography Based Encryption Scheme for CT Scan Images

The document presents MAN–C, a Masked Autoencoder Neural Cryptography based encryption scheme designed for securely sharing CT scan images. This method combines masked autoencoders with neural cryptography, specifically utilizing a Tree Parity Machine and Shamir's Secret Sharing for effective image encryption and decryption. Evaluated on a dataset from The Cancer Imaging Archive, MAN–C demonstrates improved performance in preserving image quality during the encryption process compared to existing techniques.

MethodsX 12 (2024) 102738

Contents lists available at ScienceDirect

MethodsX
journal homepage: www.elsevier.com/locate/methodsx

MAN–C: A masked autoencoder neural cryptography based encryption scheme for CT scan images

Kishore Kumar a,∗, Sarvesh Tanwar a, Shishir Kumar b
a Amity Institute of Information Technology, Amity University, Noida, Uttar Pradesh, India
b School of Information Science and Technology, Babasaheb Bhimrao Ambedkar University, Lucknow, Uttar Pradesh, India

Article info

Method name:
MAN–C: a Masked Autoencoder Neural Cryptography based Encryption Scheme for CT Scan Images

Keywords:
Secret sharing
Auto encoder
Tree parity machine
Hebbian learning
Image encryption
Masked transformer
Neural cryptography

Abstract

Sharing medical images securely is very important for keeping patients' data confidential. In this paper we propose MAN–C: a Masked Autoencoder Neural Cryptography based encryption scheme for sharing medical images. The proposed technique builds upon recently proposed masked autoencoders. In the original paper, masked autoencoders are used as scalable self-supervised learners for computer vision which reconstruct portions of originally patched images. Here, the facility to obfuscate portions of the input image and the ability to reconstruct original images are used as an encryption-decryption scheme. In the final form, masked autoencoders are combined with neural cryptography consisting of a tree parity machine and the Shamir Scheme for secret image sharing. The proposed technique MAN–C helps to recover the loss in the image due to noise during secret sharing of the image.

• Uses recently proposed masked autoencoders, originally designed as scalable self-supervised learners for computer vision, in an encryption-decryption setup.
• Combines autoencoders with neural cryptography. The advantage our proposed approach offers over existing techniques is that (i) neural cryptography is a new type of public key cryptography that is not based on number theory, requires less computing time and memory, and is non-deterministic in nature, and (ii) masked auto-encoders provide an additional level of obfuscation through their deep learning architecture.
• The proposed scheme was evaluated on a dataset consisting of CT scans made public by The Cancer Imaging Archive (TCIA). The proposed method produces better RMSE values between the input and the encrypted image, and comparable correlation values between the input and the output image, with respect to the existing techniques.

Specifications Table
Subject area: Computer Science
More specific subject area: Cryptography, Deep Learning
Name of your method: MAN–C: a Masked Autoencoder Neural Cryptography based Encryption Scheme for CT Scan Images
Name and reference of original method: Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick; Masked Autoencoders Are Scalable Vision Learners, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 16000–16009. https://ieeexplore.ieee.org/document/9879206
Resource availability: https://imaging.cancer.gov/informatics/cancer_imaging_archive.htm
https://github.com/facebookresearch/mae
https://github.com/oke-aditya/neural_encryption_networks


Corresponding author.
E-mail address: [email protected] (K. Kumar).

https://doi.org/10.1016/j.mex.2024.102738
Received 1 February 2024; Accepted 27 April 2024
Available online 28 April 2024
2215-0161/© 2024 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/)
K. Kumar, S. Tanwar and S. Kumar MethodsX 12 (2024) 102738

Background

One of the main goals in the modern world is to protect the confidentiality of information from unauthorized access. This has led to the development of numerous data encryption techniques [1–3]. Symmetric key encryption and asymmetric key encryption are the two broad classes of encryption techniques. The symmetric key encryption technique [4] uses a shared key that is known only to the sender and the receiver. On the other hand, in the asymmetric key encryption technique [5], the receiver has two keys: a private key and a public key. The private key is known only to the receiver, while the public key is known to everyone. The sender uses the receiver's public key to encrypt the message, and the receiver uses the private key to decrypt it.
Medical diagnosis involves a multitude of digital imaging technologies, such as ultrasound, magnetic resonance imaging, computed tomography, and positron emission tomography. The diagnostic images are communicated and stored extensively for a variety of specialised functions, including feature selection, data hiding, image denoising, compression, and segmentation [6–8]. Additionally, a lot of private patient information is frequently included when medical images are distributed online or via a hospital intranet. However, hospital intranets lack significant security tools, and the internet also faces significant problems like malicious interference and data leakage [9]. Encryption of medical images is an effective way to protect them from these threats [10].
Neural cryptography is a new type of public key cryptography that is not based on number theory and requires less computing time and memory [11–14]. It creates key exchange protocols based on the synchronisation phenomena in neural networks rather than on conventional number theory [15]. Furthermore, even if an attacker is aware of the specifics of the algorithm and has access to the communication channel, neural cryptography ensures that the key cannot be deduced. Both parties involved in the key exchange protocol can share a secret key by synchronising their neural networks (each referred to as a Tree Parity Machine, or TPM).
In standard neural cryptography [16], the parties generate their own TPMs using common parameters and random initial weight values. After that, they create a common random input vector and compute their own output values by feeding it to their TPMs. They exchange the output values and update their own weights using a specific learning algorithm. The initial values assigned to the weight vectors must be kept secret, as in other public key cryptography systems. The input and output values, however, are discoverable by anyone, including adversaries. These steps are repeated until the weight vectors are fully synchronised. After full synchronisation, the synchronised weight vectors are used as the shared secret key. The exchanged keys can be utilised in several applications, much like the keys of other public key cryptography systems. For instance, the key produced by neural cryptography may be utilised in block ciphers like SDES, AES [17], and Rijndael for encryption [18].
Autoencoders are a subset of the neural network family used in deep learning [19,20]. They are mostly employed in computer vision [21] and natural language processing (NLP) [22,23]. They perform unsupervised learning to efficiently learn codings of unlabelled data. An autoencoder contains an encoder, a neural network that learns how to represent a collection of data, and the encoding is validated by regenerating the encoder's input from it. A primary use of this form of unsupervised learning is dimensionality reduction. The intention is to train the network to reject irrelevant data representations by employing an autoencoder [24].
Data masking is a strategy used to obscure non-relevant information from the model so that it can be trained on the information more relevant to a given task. A masked autoencoder can be considered as the use of masked data with autoencoders [24]. When used with data encryption, masked autoencoders can provide an additional level of data obfuscation by masking certain portions of the image, thereby making the encryption stronger. In this paper, we propose a neural cryptography-based encryption scheme for sharing medical images. The proposed technique is based on a tree parity machine combined with masked auto-encoders and the Shamir Scheme for secret image sharing. It helps to recover the loss in the image due to noise during secret sharing. We evaluated our proposed scheme on a dataset consisting of CT scans made public by The Cancer Imaging Archive (TCIA) [25].

Method details

In this section, we discuss the proposed methodology and the benchmark dataset used for the training and evaluation of the model.

Dataset description

The benchmark dataset of CT scan images used in this work has been taken from The Cancer Imaging Archive (a public repository) [25]. The description of the dataset is presented in Table 1. There are a total of 889,324 CT scan images, which occupy 728.5 GB of storage. Each image has three channels (RGB) and a dimension of 48 × 48. The dataset was split into training and validation sets in a 70:30 ratio. Further, 1% of the validation set is used for evaluation of the proposed model.
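The split arithmetic can be checked with a short sketch. The rounding choices below (rounding the training share to the nearest image and taking the floor of the 1% test share) are assumptions made for illustration; they happen to reproduce the counts listed in Table 1, but the text does not state how the authors rounded.

```python
TOTAL_IMAGES = 889_324

# 70% of the images form the training set (rounded to the nearest image; assumed).
train = round(TOTAL_IMAGES * 0.70)

# The remaining 30% is held out; 1% of it is kept for testing (floor; assumed).
held_out = TOTAL_IMAGES - train
test = held_out // 100
validation = held_out - test

print(train, validation, test)
```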

Proposed architecture

Here, we discuss the proposed neural network based end-to-end architecture shown in Fig. 1. It has two main parts: (i) the Encryption phase and (ii) the Decryption phase. The Encryption phase consists of (a) the input image, (b) the TPM-based generated symmetric key, (c) the Masked Auto-encoder, and (d) encryption using the Shamir Scheme. Further, the Decryption phase consists of (a) image reconstruction from 𝑘 shares, (b) the masked image decoder, and (c) the output image. These two phases are described next.

Table 1
Dataset description.

S. No.  Stats             Values
1       Total Image       889,324
2       Training Image    622,527
3       Validation Image  264,130
4       Test Image        2,667
5       Image Size        48 × 48 × 3

Fig. 1. Proposed end-to-end framework.

1. Encryption Phase:

This phase encrypts a CT Scan image. The steps followed to encrypt an image are given below.

a. Input image is taken.
b. Symmetric encryption key is generated using TPM.
c. Masked image is generated using deep learning based masked auto-encoder.
d. Shares are generated using masked image.
e. Generated shares are encrypted using the Symmetric encryption key generated using TPM.

These steps are described next.

a. Input Image

This is the starting unit of the proposed architecture, as shown in Fig. 1. Here, an image of size 48 × 48 × 3 is given as input. Preprocessing steps such as image resizing and augmentation are not required, as the images are of uniform size and the number of data samples is sufficient for training and testing of the model. The only step required is to convert the image into pixel values.

b. TPM-based Symmetric Key generation

Tree Parity Machine (TPM) is a key sharing mechanism built on a neural network. Both entities use the same network architecture, and the key exchange proceeds by synchronizing the two networks. The TPM architecture is shown in Fig. 2, and the algorithm for weight update is described below.

1. Here, '𝑘' neurons are considered for the hidden layer and, for each of the '𝑘' hidden neurons, '𝑛' neurons are considered at the input layer. These input neurons are initialized with random values generated between −l and +l.
2. Then, a random weight matrix of size 𝑘 × 𝑛 is generated for the network.
3. A row-wise sum is calculated using the weight values and the random input matrix '𝑋', which is given as input to both machines. The signum function is then applied to each row sum. The resulting output is sigma, of size [𝑘 × 1].
4. The tau value is calculated from sigma. The expected values can only be −1, 0, +1 for sigma, and the same holds for tau.

Next, we used the Hebbian update rule for calculating the weights of the TPM. The following steps outline the weight update strategy:

1. A separate TPM is used for each of the two entities.
2. After obtaining the output on each side, we check whether both sides produced the same value.
3. If both values are equal, the Hebbian update rule (discussed next) is applied to update the weights.
4. After updating, the weight matrices of both machines are compared. If they are equal, a synchronization state has been achieved and this weight matrix can be used as the key.

Fig. 2. Tree Parity Machine.

Fig. 3. Mask Autoencoder architecture. Hyperparameters are {optimizer: AdamW, learning rate: 0.015, weight decay: 0.05, optimizer momentum: 𝛽1, 𝛽2 = 0.9, 0.95, batch size: 256, dropout rate: 0.2}. Encoder and decoder depths and architectures are the same as in [24].

The Hebbian update rule for the weight matrix is given as:

For 𝑖 in [1, 𝑘] and 𝑗 in [1, 𝑛]:
    if ((𝜎[𝑖] == 𝜏1) && (𝜏1 == 𝜏2))
        𝑊[𝑖, 𝑗] += 𝑋[𝑖, 𝑗] ∗ 𝜏1
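The synchronisation loop and the update rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it follows the commonly used TPM formulation in which inputs are ±1, weights are bounded integers in [−L, L] (clipping after each update is assumed), tau is the product of the hidden outputs, and sign(0) is mapped to −1. The parameter values and the names `TPM` and `synchronise` are hypothetical.

```python
import random

K, N, L = 3, 4, 3  # hidden neurons, inputs per neuron, weight bound (illustrative values)


def sign(v):
    # Assumed convention: a zero field is mapped to -1 so sigma is always +1 or -1.
    return 1 if v > 0 else -1


class TPM:
    def __init__(self, rng):
        # Secret random initial weights, one row of N weights per hidden neuron.
        self.w = [[rng.randint(-L, L) for _ in range(N)] for _ in range(K)]
        self.sigma = [0] * K

    def output(self, x):
        # sigma[i] is the sign of the row-wise weighted sum; tau is their product.
        self.sigma = [sign(sum(w * xi for w, xi in zip(self.w[i], x[i])))
                      for i in range(K)]
        tau = 1
        for s in self.sigma:
            tau *= s
        return tau

    def hebbian_update(self, x, tau_self, tau_other):
        # Update only when both machines agree, and only rows with sigma == tau.
        if tau_self != tau_other:
            return
        for i in range(K):
            if self.sigma[i] == tau_self:
                for j in range(N):
                    new_w = self.w[i][j] + x[i][j] * tau_self
                    self.w[i][j] = max(-L, min(L, new_w))  # keep weights bounded


def synchronise(seed=0, max_rounds=100_000):
    rng = random.Random(seed)
    a, b = TPM(rng), TPM(rng)
    for rounds in range(1, max_rounds + 1):
        # Public random input shared by both machines.
        x = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(K)]
        tau_a, tau_b = a.output(x), b.output(x)
        a.hebbian_update(x, tau_a, tau_b)
        b.hebbian_update(x, tau_b, tau_a)
        if a.w == b.w:
            return a.w, rounds
    raise RuntimeError("no synchronisation within max_rounds")


key, rounds = synchronise()
print("synchronised after", rounds, "rounds:", key)
```

In this sketch the two weight matrices are compared directly to detect synchronisation, which is convenient for a simulation; in a real exchange the weights are secret, so the parties would need some other way to confirm that they agree.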

c. Masked Auto-encoder

This is the main contribution of the proposed end-to-end framework. Here, a deep learning vision transformer-based architecture [24,26] is used to create the masked image from the original input image. The architecture of the autoencoder is given in Fig. 3. Masking hides some portions of the image and lets the model learn to generate the masked content. Masking preserves the integrity of the data within the image, and the masked autoencoder helps in reconstructing original images from the masked images. In this architecture, the input image is first converted into nine patches. The patch embeddings, together with position embeddings, are then given as input to the transformer encoder. The encoder consists of multi-head attention as presented by Vaswani et al. [27]. The features extracted by the encoder are then given to the decoder for the generation of the original image, as shown in Fig. 3. The masked images produced by this autoencoder are encrypted using the Shamir scheme discussed next.
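As a rough illustration of the patching and masking step, the sketch below splits a 48 × 48 single-channel image into nine 16 × 16 patches and hides a random subset of them. The patch size follows from the nine-patch description; the mask ratio of 0.75, the single channel, and the helper names `patchify` and `random_mask` are assumptions for illustration only, and no transformer is involved here.

```python
import random

IMAGE_SIZE = 48
PATCH_SIZE = 16  # 48 / 16 = 3 patches per side, nine patches in total


def patchify(image):
    """Split a square 2-D image (list of rows) into non-overlapping patches."""
    patches = []
    for top in range(0, IMAGE_SIZE, PATCH_SIZE):
        for left in range(0, IMAGE_SIZE, PATCH_SIZE):
            patches.append([row[left:left + PATCH_SIZE]
                            for row in image[top:top + PATCH_SIZE]])
    return patches


def random_mask(num_patches, mask_ratio, rng):
    """Return the sorted indices of the patches hidden from the encoder."""
    num_masked = int(num_patches * mask_ratio)
    return sorted(rng.sample(range(num_patches), num_masked))


rng = random.Random(0)
# Hypothetical single-channel image with random 8-bit pixel values.
image = [[rng.randrange(256) for _ in range(IMAGE_SIZE)] for _ in range(IMAGE_SIZE)]

patches = patchify(image)
masked = random_mask(len(patches), mask_ratio=0.75, rng=rng)  # assumed ratio
visible = [p for i, p in enumerate(patches) if i not in masked]

print(len(patches), "patches,", len(masked), "masked,", len(visible), "visible")
```

Only the visible patches would be embedded and passed to the encoder; the decoder is then responsible for regenerating the hidden ones.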

Fig. 4. Shamir secret key sharing scheme.

d. Encryption using Shamir Scheme

Shamir's Secret Sharing [28] is one of the first secret sharing schemes in cryptography and is based on polynomial interpolation over finite fields. It works in a distributed manner, securing a secret by splitting it into multiple parts called shares. The architecture of this scheme is presented in Fig. 4. As shown, the image is split into shares, and the shares are encrypted using the corresponding keys generated by the TPM. For each image, '𝑘' such shares are generated and passed to the other side (entity).
This algorithm's primary goal is to split a secret into multiple distinct pieces. Let 𝑆 denote the secret message that needs to be protected. The secret is broken up into multiple pieces (𝑆1, 𝑆2, 𝑆3, ..., 𝑆𝑁) called shares, which are distributed to different people. A threshold 𝐾 is chosen beforehand, which decides how many shares are required to unlock the secret, much like needing a certain number of keys to open a safe. With fewer than 𝐾 shares the secret remains hidden, whereas with 𝐾 or more shares the secret message can be retrieved through polynomial interpolation. This is known as a (𝐾, 𝑁) threshold scheme.
To illustrate Shamir's scheme, consider a confidential message (𝑆) that we want to distribute securely among a group of 𝑁 people, of whom at least 𝐾 must collaborate to reveal the message. Shamir's Secret Sharing achieves this as follows:
1. Encoding the Message: We create a special equation (polynomial) that hides the message (S) within its constant term. We can
think of it as disguising the message with other random numbers like coefficients in an algebraic expression.
2. Generating Shares: This equation is then used to generate unique pieces of information (shares) for each person (𝑁 points on
the polynomial). Each share contains some information about the message, but not the entire thing on its own.
3. Minimum to Decode: The key feature is that anyone with at least 𝐾 shares (enough information) can use a specific mathematical
technique (Lagrange interpolation) to reconstruct the original equation (polynomial) and extract the hidden message (𝑆). This
technique cleverly combines the information from the shares.
4. Enhanced Security: Even if someone obtains fewer than 𝐾 shares (incomplete information), they won’t be able to decode the
message (𝑆) because it’s cleverly hidden within the complex structure of the polynomial.
Example:
• Confidential message (𝑆) = Top-secret launch code (code = 65)
• Sharing with 4 people (𝑁)
• Minimum needed to launch (𝐾) = 2 people
We create a polynomial whose constant term holds the secret code. Since 𝐾 = 2, a polynomial of degree 1 is sufficient. We then use this polynomial to generate 4 unique shares, one for each person. Anyone with 2 or more shares can use Lagrange interpolation to recreate the original polynomial and recover the hidden launch code (65).
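The launch-code example can be reproduced with a small sketch of a (2, 4) threshold scheme. The field modulus 257, the seeded random generator, and the function names `make_shares` and `recover` are illustrative assumptions; the text does not specify which finite field the authors use.

```python
import random

PRIME = 257  # assumed field modulus; it only needs to be a prime larger than the secret


def make_shares(secret, k, n, rng):
    """Evaluate a random degree-(k-1) polynomial with constant term `secret` at x = 1..n."""
    coeffs = [secret] + [rng.randrange(1, PRIME) for _ in range(k - 1)]

    def f(x):
        return sum(c * pow(x, e, PRIME) for e, c in enumerate(coeffs)) % PRIME

    return [(x, f(x)) for x in range(1, n + 1)]


def recover(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = make_shares(secret=65, k=2, n=4, rng=random.Random(1))
print(recover(shares[:2]))                # any two shares give back 65
print(recover([shares[1], shares[3]]))    # a different pair works as well
```

Calling `recover` with a single share simply returns that share's value, which differs from 65 here because the random coefficient is non-zero.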
2. Decryption Phase:

After receiving the shares of the image from the other side, each share is decrypted and the shares are combined to reproduce the original image. The steps followed are:

1. Decrypt shares using symmetric keys: The shares of the image are decrypted using the symmetric key generated by the TPM, as shown in Fig. 4.
2. Generate masked image using k shares: The image is reconstructed from the decrypted shares through the Lagrange basis polynomials, which keep track of the sequence of the shares and the corresponding keys, as shown in Fig. 4. As output, this step provides the decrypted masked image, which is given to the masked image decoder module.
3. Apply masked decoder to get the original image: The masked decoder is a part of the autoencoder. Its capability to generate the original image from the masked image is due to its weights, trained in the end-to-end framework. The output of this step is the reconstructed image, which is approximately equivalent to the original image.

Method validation

In this section, we discuss the experimental results and analysis of the proposed end-to-end framework. All the experiments were performed on an Intel(R) CPU v4 @ 2.20 GHz machine with 64 GB RAM, using Anaconda with Python 3.9. The proposed framework can be used with text, numbers, and images. For the experimental analysis, the RGB images described in the Dataset description section are used. For detailed analysis, commonly used evaluation metrics are computed on the original, intermediate, and reproduced images. The evaluation metrics are discussed next.

a. Correlation

It measures how two images (the secret image and the recovered image) are related to each other. Correlation values lie between −1 and +1, where −1 indicates that the two images are opposite to each other and +1 indicates that both images are equivalent. The correlation coefficient '𝑟' is given as:

$$ r = \frac{\sum_{m} \sum_{n} \left(A_{mn} - \bar{A}\right)\left(B_{mn} - \bar{B}\right)}{\sqrt{\left(\sum_{m} \sum_{n} \left(A_{mn} - \bar{A}\right)^{2}\right)\left(\sum_{m} \sum_{n} \left(B_{mn} - \bar{B}\right)^{2}\right)}} \tag{1} $$

where 𝐴 denotes the secret image matrix, 𝐵 denotes the recovered image matrix, and 𝐴̄ and 𝐵̄ are the means of 𝐴 and 𝐵, respectively.

b. RMSE

Root mean square error is the square root of the mean of the squared errors. It measures the difference in quality between two images: the lower the RMSE, the more similar the images. It can be calculated as:

$$ RMSE = \sqrt{\frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \left(X(i,j) - Y(i,j)\right)^{2}} \tag{2} $$

where 𝑚 × 𝑛 is the dimension of the image, 𝑋(𝑖, 𝑗) is the pixel value at (𝑖, 𝑗) in the first image, and 𝑌(𝑖, 𝑗) is the pixel value at (𝑖, 𝑗) in the second image.

c. PSNR

Peak Signal to Noise Ratio (PSNR) is used to measure the quality of images and is measured in decibels (dB). PSNR >= 20 dB indicates good quality. The higher the PSNR, the lower the error. PSNR is given as:

$$ PSNR = 10 \log_{10} \frac{N^{2}}{MSE} \tag{3} $$

where 𝑁 is the maximum fluctuation in the input data and MSE is the mean squared error, i.e., the square of the RMSE in Eq. (2).
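The three metrics can be written out directly from Eqs. (1)–(3). The sketch below uses plain Python lists for clarity and assumes 8-bit images, so the peak value 𝑁 defaults to 255; the function names and the tiny 2 × 2 example images are hypothetical. With this peak value, the RMSE reported in Table 3 (9.36445394) corresponds to the PSNR reported there (about 28.70 dB), which suggests the same convention was used.

```python
import math


def _flatten(image):
    return [v for row in image for v in row]


def correlation(a, b):
    """Correlation coefficient of Eq. (1) between two equally sized images."""
    fa, fb = _flatten(a), _flatten(b)
    mean_a, mean_b = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in fa)
                    * sum((y - mean_b) ** 2 for y in fb))
    return num / den


def rmse(a, b):
    """Root mean square error of Eq. (2)."""
    fa, fb = _flatten(a), _flatten(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)) / len(fa))


def psnr_from_rmse(rmse_value, peak=255):
    """PSNR of Eq. (3), using MSE = RMSE squared; peak=255 assumes 8-bit pixels."""
    return 10 * math.log10(peak ** 2 / rmse_value ** 2)


def psnr(a, b, peak=255):
    return psnr_from_rmse(rmse(a, b), peak)


# Hypothetical 2 x 2 images: the second is the first with every pixel shifted by 10.
original = [[0, 50], [100, 150]]
recovered = [[10, 60], [110, 160]]

print(correlation(original, recovered))
print(rmse(original, recovered))
print(psnr(original, recovered))
print(psnr_from_rmse(9.36445394))  # RMSE value taken from Table 3
```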
An input image to be encrypted using our proposed approach is shown in Fig. 5(a). The shares S1, S2, S3, S4, S5, and S6 are depicted in Figs. 5(b)–(g) and were created using Shamir's (𝑘, 𝑛) threshold technique with 𝑘 = 6 and 𝑛 = 6. These shares must now be distributed among several people, and this is where the neural key exchange mechanism enters the picture: we used the neural key exchange protocol to generate keys and encrypted the created shares with those keys.
Using the neural key protocol as described, a unique synchronised key is created between the distributor and each individual. The distributor uses the key created during the TPM synchronisation procedure with each individual to encrypt the share that is to be sent to that individual. A Boolean XOR operation is used to perform the encryption. The encrypted shares can then be sent safely to the relevant individuals. Depending on the secret sharing policy, each person will have one or more of these encrypted shares. By once more performing a Boolean XOR operation on the encrypted share and the appropriate key, they are able to decrypt the shares. The encoded image fed to the masked auto-encoder decoder layers is shown in Fig. 5(h). The decrypted and decoded output image generated by the masked auto-encoder decoder layers is shown in Fig. 5(i), which is the final output of the proposed framework. Lagrange interpolation was used to recreate the image from these shares.
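The XOR step described above is its own inverse, which a short sketch makes concrete. The byte values of the share and the key are hypothetical, and repeating the key when it is shorter than the share is an assumption; the text does not say how the synchronised TPM weights are converted into key bytes.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every data byte with the key, repeating the key if it is shorter (assumed)."""
    return bytes(d ^ key[i % len(key)] for i, d in enumerate(data))


share = bytes([12, 200, 7, 255, 64, 33])  # hypothetical share pixel values
key = bytes([5, 17, 250])                 # hypothetical key derived from TPM weights

encrypted = xor_bytes(share, key)      # distributor side
decrypted = xor_bytes(encrypted, key)  # recipient side, same operation and key

print(list(encrypted))
print(decrypted == share)
```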

Quantitative analysis original image, intermediate image and output image

Here, we compare the images used and generated at different stages by calculating the metrics discussed earlier. Table 2 shows the analysis of the input image and the encrypted shares, and Table 3 shows the analysis of the input image and the output image. As seen in Table 2, the correlation values are almost zero, which shows that the input image and the shares are different and that the input image cannot be traced from the intermediate stage even when the TPM symmetric key is available. So, the proposed framework provides an additional level of security. Table 2 also shows negative correlation values for all the shares of the image, which signifies that the encrypted shares are totally different from the original. Moreover, the PSNR values are less than eight, which indicates a lower quality of the shares in comparison with the original image; the same is signified by the higher RMSE values. Additionally, Table 3 shows the analysis between the input and output images. The values of the different metrics show the equivalence between the two images, which is also visible in Fig. 5. The correlation value is almost equal to 1 and the PSNR is greater than 20, whereas the RMSE is low (less than 10), which signifies the comparable quality of the output image.

Fig. 5. Experimental result of the proposed scheme on randomly selected image I_1: (a) Secret image. (b)–(g) Shares generated using Shamir's secret sharing. (h) Encoded image being passed to the masked autoencoder decoder. (i) Decrypted image.

Table 2
Quantitative analysis of input image and encrypted shares for Image 1.

Shares  PSNR         RMSE         Correlation
1       7.846555514  103.3266632  −0.00617044
2       7.846622072  103.3258715  −0.02698148
3       7.846612877  103.3259808  −0.01170099
4       7.846694404  103.325011   −0.00440447
5       7.84662472   103.32584    −0.00794402
6       7.846868173  103.3229439  −0.00149419

Table 3
Quantitative analysis of input image and output image.

Input Image  PSNR         RMSE        Correlation
1            28.70115444  9.36445394  0.95760909

Fig. 6. Quantitative analysis on 10,000 samples: average PSNR values between the input image and the encrypted shares. The red line denotes the mean value.
In spite of this, we accept that our results may not be acceptable for the most critical cases, where absolutely lossless encryption is a must. To this end, we would like to bring the following to the readers' notice. Our approach combines neural cryptography-based Shamir's scheme and masked auto-encoders. Each of these approaches can be used as an encryption scheme by itself. In particular, the second part, masked autoencoders, being a soft computing deep learning approach, is not guaranteed to reproduce the original image exactly. Other encryption schemes are typically deterministic in nature, guaranteeing to reproduce the original input as it is. However, the advantage our proposed approach offers is that (i) neural cryptography is a new type of public key cryptography that is not based on number theory and requires less computing time and memory, and (ii) masked auto-encoders provide an additional level of obfuscation through their deep learning architecture. This is evident from the values in Table 2, which presents the quantitative analysis of the input image and the encrypted shares: the RMSE between the input image and each encrypted share is very high and the correlation is extremely low. This demonstrates the utility of our proposed approach. Even if the approach is not perfect in terms of delivering absolutely lossless encryption, it exploits the advantage of neural cryptography and lays the foundation for further research in exploiting neural cryptography and masked auto-encoders for image encryption.
We extended the above analysis to 10,000 images randomly selected from the test dataset. Figs. 6–8 show the PSNR, RMSE, and correlation values, respectively, between the input image and the shares, averaged over the shares of each of the 10,000 images. In each figure, the line in the middle denotes the average value. High RMSE values and almost zero correlation values indicate that the input and encoded images are different and that the input image cannot be traced from the intermediate stage even when the TPM symmetric key is available. So, the proposed framework provides an additional level of security.
Figs. 9–11 show the PSNR, RMSE, and correlation values, respectively, between the input image and the reconstructed output image on the 10,000 images. The values of the different metrics show the equivalence between the images, as also illustrated in Fig. 5. The correlation value is almost equal to 1 and the PSNR is greater than 20, whereas the RMSE is low (less than 10), which signifies the comparable quality of the output image.

Fig. 7. Quantitative analysis on 10,000 samples: average RMSE values between the input image and the encrypted shares. The blue line denotes the mean value.

Fig. 8. Quantitative analysis on 10,000 samples: average correlation values between the input image and the encrypted shares. The orange line denotes the mean value.

Comparison with existing works

Here we compare our proposed approach with two existing techniques [29,30]. Chen et al. [30] presented a method for splitting a black and white secret image into multiple shares that reveal the original image when a certain number (𝑘) of them are overlapped. This technique relies on randomly dividing the secret image into grids. A program then picks random grids one by one and hides a single black pixel within each chosen grid. The key feature lies in how these black pixels are hidden: only by stacking enough grids (𝑘 or more) will the black pixels from each grid overlap and disclose the secret image. Any fewer than 𝑘 grids will provide no information about the hidden image. Gupta et al. [29] present a secure image sharing scheme that utilizes Shamir's secret sharing and neural cryptography for key generation. Neural cryptography offers a computationally efficient alternative to traditional public key cryptography, making it suitable for resource-constrained environments. Their method ensures secure image sharing without leakage of secret information during transmission over a public channel.

Fig. 9. Quantitative analysis on 10,000 samples: PSNR values between the input image and the reconstructed output image. The red line denotes the mean value.

Fig. 10. Quantitative analysis on 10,000 samples: RMSE values between the input image and the reconstructed output image. The blue line denotes the mean value.

Table 4
Comparison with existing techniques. Quantitative analysis between the input image and the encrypted shares. Reported values were averaged over the set of 10,000 images.

Method             PSNR     RMSE      Correlation
Chen et al. [30]   7.1302   83.4388   0.00005
Gupta et al. [29]  12.1501  91.6422   0.00340
Proposed           8.0031   109.8524  0.00027
The comparison was done on the above set of 10,000 images randomly selected from the test dataset. The comparison was done
on (i) average PSNR, RMSE, and correlation values between input image and the encrypted shares (see Table 4), and (ii) PSNR, RMSE,
and correlation values between the input image and the reconstructed output image (see Table 5). It can be seen that in same cases,

10
K. Kumar, S. Tanwar and S. Kumar MethodsX 12 (2024) 102738

Fig. 11. Quantitative analysis on 10,000 samples: correlation values between the input image and the reconstructed output image. The orange line
denotes the mean value.

Table 5
Comparison with existing techniques. Quantitative analysis between the input image and
reconstructed output image. Reported values were averaged over the set of 10,000 images.

Method PSNR RMSE Correlation

Chen et al. [25] 51.1153 9.7302 97.8289


Gupta et al. [29] 28.1202 7.5482 98.3477
Proposed 26.8841 8.8987 97.0009

the proposed method does better than the other techniques, e.g., in Table 4 it has a lower correlation than Gupta et al.’s [30] method
and a higher RMSE than both Gupta et al.’s [30] and Chen et al.’s [29] methods when comparing the input image with the encrypted
shares. Overall, the proposed method does not outperform existing approaches on every metric. However,
other encryption schemes are typically deterministic in nature, guaranteeing that the original input is reproduced exactly. In contrast,
the proposed approach offers the following advantages: (i) neural cryptography is a new type of public key cryptography that is not based
on number theory, requires less computing time and memory, and is non-deterministic in nature; and (ii) masked auto-encoders provide
an additional level of obfuscation through their deep learning architecture. The proposed approach exploits the advantages of neural
cryptography and lays the foundation for further research on neural cryptography and masked auto-encoders for image
encryption.
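For readers who wish to reproduce this kind of comparison, the three metrics can be sketched as below. This is a minimal illustration under stated assumptions, not the evaluation code used for Tables 4 and 5: images are taken to be flat sequences of 8-bit pixel intensities (peak value 255), PSNR is computed as 20 log10(peak / RMSE), and correlation is the Pearson coefficient. The function names are hypothetical, and the exact definitions and scaling behind the reported values may differ.

```python
import math


def rmse(a, b):
    """Root-mean-square error between two equally sized pixel sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB (assumes 8-bit images by default)."""
    err = rmse(a, b)
    return math.inf if err == 0 else 20 * math.log10(peak / err)


def correlation(a, b):
    """Pearson correlation coefficient (undefined for constant images)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Toy 4-pixel "images" used only to exercise the functions.
original = [10, 50, 90, 130]
reconstructed = [12, 48, 92, 128]

print(rmse(original, reconstructed))         # 2.0
print(psnr(original, reconstructed))
print(correlation(original, reconstructed))
```

For encrypted shares, a high RMSE and a correlation near zero are desirable, whereas for reconstructed images a low RMSE and a correlation near one are desirable.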
Next, we analyze the security of the proposed technique. Parts of this analysis have been adapted from [30].

Security analysis of Shamir’s scheme

Let us take $T = 25$ as the security text, i.e., the text we want to encrypt. Next, we split the security text $T = 25$ into $p = 6$ pieces
(or shares). This split is such that any subset of $k = 3$ shares is sufficient to reconstruct the security text $T$. Next, we choose two random
numbers, say $n_1 = 20$ and $n_2 = 30$. Then, our security polynomial is:
$$g(y) = n_0 + n_1 y + n_2 y^2 \tag{4}$$
where $n_0 = T = 25$ is the security text. Then:
$$g(y) = 25 + 20y + 30y^2 \tag{5}$$
Next, we build $p = 6$ points $G_{y-1} = (y, g(y))$ from the polynomial:
$$G_0 = (1, 75) \tag{6}$$
$$G_1 = (2, 185) \tag{7}$$
$$G_2 = (3, 355) \tag{8}$$
$$G_3 = (4, 585) \tag{9}$$
$$G_4 = (5, 875) \tag{10}$$
$$G_5 = (6, 1225) \tag{11}$$
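The share construction above, and the reconstruction from any three shares, can be sketched as follows. This is a minimal illustration that mirrors the integer arithmetic of the worked example; the function names `make_shares` and `reconstruct_secret` are hypothetical, and practical deployments of Shamir's scheme usually work over a finite field rather than the integers.

```python
from fractions import Fraction


def make_shares(coeffs, num_shares):
    """Evaluate g(y) = coeffs[0] + coeffs[1]*y + ... at y = 1..num_shares."""
    return [
        (y, sum(c * y ** i for i, c in enumerate(coeffs)))
        for y in range(1, num_shares + 1)
    ]


def reconstruct_secret(shares):
    """Lagrange interpolation at y = 0, which yields the constant term n_0 = T."""
    secret = Fraction(0)
    for j, (yj, gj) in enumerate(shares):
        term = Fraction(gj)
        for m, (ym, _) in enumerate(shares):
            if m != j:
                term *= Fraction(-ym, yj - ym)  # basis polynomial evaluated at 0
        secret += term
    return int(secret)


# n_0 = T = 25, n_1 = 20, n_2 = 30, p = 6 shares, as in the example above.
shares = make_shares([25, 20, 30], num_shares=6)
print(shares)  # [(1, 75), (2, 185), (3, 355), (4, 585), (5, 875), (6, 1225)]

# Any k = 3 shares recover the security text.
print(reconstruct_secret(shares[:3]))                         # 25
print(reconstruct_secret([shares[1], shares[3], shares[5]]))  # 25
```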

Now, the more of these shares an attacker finds, the more information he has to reconstruct the security text $T$. Let
us consider that the attacker is able to crack two of these shares, i.e., $G_0 = (1, 75)$ and $G_1 = (2, 185)$, but fails to find a third
share, so that he holds fewer than the $k = 3$ shares that are sufficient to reconstruct the security text $T$. Nonetheless, the attacker
tries to combine the information from these two points with the information in the public domain and computes as follows:

$$p = 6, \quad k = 3$$
$$g(y) = n_0 + n_1 y + \cdots + n_{k-1} y^{k-1}$$
$$n_0 = T = 25, \quad n_i \in \mathbb{N} \tag{12}$$

1. Write $g(y)$ in terms of $T$ and the value of $k$:

$$g(y) = T + n_1 y + \cdots + n_{3-1} y^{3-1}$$
$$g(y) = T + n_1 y + n_2 y^2 \tag{13}$$

2. Using the value of $G_0$, the following equation is produced:

$$75 = T + n_1 \cdot 1 + n_2 \cdot 1^2$$
$$75 = T + n_1 + n_2 \tag{14}$$

3. Using the value of $G_1$, the following equation is produced:

$$185 = T + n_1 \cdot 2 + n_2 \cdot 2^2$$
$$185 = T + 2n_1 + 4n_2 \tag{15}$$

4. Subtracting Eq. (14) from Eq. (15):

$$(185 - 75) = (T - T) + (2n_1 - n_1) + (4n_2 - n_2)$$
$$110 = n_1 + 3n_2 \tag{16}$$

This can be rewritten as:

$$n_1 = 110 - 3n_2 \tag{17}$$

5. The attacker is aware of the fact that $n_2 \in \mathbb{N}$. The attacker therefore substitutes all possible values of $n_2$, i.e.,
0, 1, 2, 3, …, in Eq. (17) to compute the corresponding possible values of $n_1$:

$$\begin{aligned}
\text{for } n_2 = 0:&\quad n_1 = 110 - 3 \times 0 = 110 \\
\text{for } n_2 = 1:&\quad n_1 = 110 - 3 \times 1 = 107 \\
\text{for } n_2 = 2:&\quad n_1 = 110 - 3 \times 2 = 104 \\
&\quad \vdots \\
\text{for } n_2 = 35:&\quad n_1 = 110 - 3 \times 35 = 5 \\
\text{for } n_2 = 36:&\quad n_1 = 110 - 3 \times 36 = 2
\end{aligned} \tag{18}$$

6. It must be noted that if the attacker continues as per the above and considers $n_2 = 37$, a negative value will be obtained for $n_1$
($110 - 3 \times 37 = -1$). This is not possible because $n_1 \in \mathbb{N}$. Therefore, the attacker must stop and he concludes that
$n_2 \in [0, 1, \ldots, 35, 36]$ must be true.
7. Next, the attacker can substitute $n_1$ from Eq. (17) into Eq. (14) to get the following:

$$75 = T + (110 - 3n_2) + n_2$$
$$\text{or, } T = 2n_2 - 35 \tag{19}$$

8. Using the possible values of $n_2$ and $n_1$ from Eq. (18), the attacker computes the possible values of $T$ as follows:

$$T \in [-35 + 2 \times 0, \; -35 + 2 \times 1, \; \ldots, \; -35 + 2 \times 35, \; -35 + 2 \times 36] \tag{20}$$

The above gives:

$$T \in [-35, -33, \ldots, 35, 37] \tag{21}$$

With the above, only 37 candidate values remain to be considered for $T$, rather than the infinite set of natural numbers.
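The attacker's enumeration in steps 1 to 8 can be reproduced with a short script. This is a minimal sketch under the assumptions of the worked example (a degree-2 polynomial with natural-number coefficients, where 0 is counted as a natural number as in Eq. (18)); the function name `candidate_secrets` is hypothetical.

```python
def candidate_secrets(share_a, share_b):
    """List every (n1, n2, T) consistent with two shares of
    g(y) = T + n1*y + n2*y^2, where n1 and n2 are natural numbers.
    Assumes 0 < y_a < y_b."""
    (ya, ga), (yb, gb) = share_a, share_b
    candidates = []
    n2 = 0
    while True:
        # Subtracting the two share equations eliminates T:
        # gb - ga = n1*(yb - ya) + n2*(yb^2 - ya^2)
        remainder = (gb - ga) - n2 * (yb ** 2 - ya ** 2)
        if remainder < 0:
            break  # n1 would be negative, so the attacker stops here
        if remainder % (yb - ya) == 0:
            n1 = remainder // (yb - ya)
            secret = ga - n1 * ya - n2 * ya ** 2
            candidates.append((n1, n2, secret))
        n2 += 1
    return candidates


candidates = candidate_secrets((1, 75), (2, 185))
secrets = [t for _, _, t in candidates]

print(len(candidates))             # 37
print(secrets[0], secrets[-1])     # -35 37
print((20, 30, 25) in candidates)  # True: the actual coefficients are among them
```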


Security analysis of neural key exchange protocol (NKEP)

In a brute force attack on NKEP, the attacker attempts to guess all possible weights. The weights are analogous to keys in other
encryption systems. Successful decryption requires the attacker to discover all the weights. If the attacker possesses TPMs identical
to those used by the sender and receiver, an attack may be feasible. The attacker could synchronize their TPM weights with those of
the sender and receiver. In the subsequent discussion, let 𝑆, 𝑅, and 𝐴 denote a sender, receiver and an attacker, respectively.

1. If the outputs of sender (S) and receiver (R) do not match, the weights in their respective TPMs remain unchanged.
2. In contrast, if the outputs of S, R, and attacker (A) are identical, all three parties update the weights in their TPMs.
3. However, if the outputs of S and R match but A’s output differs, only S and R update their TPMs. This scenario arises when the
learning rates of S and R exceed that of A.
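
The three update rules above can be sketched as a small simulation. This is a minimal illustration, not the authors' implementation: the class `TreeParityMachine`, the function `synchronize`, the Hebbian learning rule, the parameter values (K = 3 hidden units, N = 10 inputs per unit, synaptic depth L = 3), and the convention that a zero local field maps to −1 are all assumptions made for the example.

```python
import random


class TreeParityMachine:
    """Minimal TPM: k hidden units, n inputs per unit, integer weights in [-l, l]."""

    def __init__(self, k, n, l, rng):
        self.k, self.n, self.l = k, n, l
        self.weights = [[rng.randint(-l, l) for _ in range(n)] for _ in range(k)]
        self.sigma = [1] * k
        self.tau = 1

    def output(self, x):
        """Compute the hidden signs sigma_i and the overall output tau."""
        self.sigma = []
        for w_row, x_row in zip(self.weights, x):
            field = sum(w * xi for w, xi in zip(w_row, x_row))
            self.sigma.append(1 if field > 0 else -1)  # zero field mapped to -1
        self.tau = 1
        for s in self.sigma:
            self.tau *= s
        return self.tau

    def update(self, x):
        """Hebbian step: units that agree with tau move, then clip to [-l, l]."""
        for i in range(self.k):
            if self.sigma[i] == self.tau:
                for j in range(self.n):
                    w = self.weights[i][j] + x[i][j] * self.tau
                    self.weights[i][j] = max(-self.l, min(self.l, w))


def synchronize(k=3, n=10, l=3, seed=0, max_steps=100_000):
    """Run S, R, and A on common random inputs until S and R share weights."""
    rng = random.Random(seed)
    s, r, a = (TreeParityMachine(k, n, l, rng) for _ in range(3))
    for step in range(1, max_steps + 1):
        x = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(k)]
        tau_s, tau_r, tau_a = s.output(x), r.output(x), a.output(x)
        if tau_s == tau_r:        # rule 1: if S and R disagree, nobody updates
            s.update(x)
            r.update(x)
            if tau_a == tau_s:    # rule 2: A updates only when all three agree
                a.update(x)       # rule 3: otherwise only S and R have moved
        if s.weights == r.weights:
            return step, s, r, a
    raise RuntimeError("no synchronization within max_steps")


steps, sender, receiver, attacker = synchronize()
print("S and R synchronized after", steps, "steps")
print("attacker holds the same weights:", attacker.weights == sender.weights)
```

The number of steps and whether the attacker happens to match depend on the seed and on the chosen parameters, so no particular outcome should be read into a single run.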

A study in [30] revealed that an attacker can only modify 16% of the weights in NKEP on average. This percentage can be further
reduced by increasing the synaptic depth of the neural network. However, increasing synaptic depth directly increases computational
costs. Therefore, breaking the security of NKEP is computationally challenging due to its classification as an NP-hard problem.

Security analysis of proposed technique

The proposed technique employs Shamir’s Secret Sharing Scheme to generate secret shares using integer arithmetic. These shares
are then distributed securely using the Neural Key Exchange Protocol (NKEP). The generated shares are further encrypted with a
Neural Key. Now, consider one of these secret shares, and let’s assume it needs to be transmitted securely between two parties, 𝐴 and
𝐵. 𝐴 and 𝐵 establish a shared Neural Key by synchronizing their weights. However, during this synchronization process, two types
of attacks are possible:

a. Brute force attack

In a brute-force attack, the attacker must exhaustively test all possible weight combinations. Since our key is represented by a
$k \times n$ matrix, and after synchronization, each matrix element can only assume two values, $+L$ or $-L$, the total number of possible
combinations is $2^{kn}$, resulting in non-polynomial time complexity.
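
The size of this search space can be checked by direct enumeration for very small matrices. The sketch below is illustrative only: the function name `candidate_keys` is hypothetical and the sizes k = 2, n = 3, L = 3 are toy values chosen so that the enumeration finishes instantly, not parameters of the proposed scheme.

```python
from itertools import product


def candidate_keys(k, n, l):
    """Yield every k x n matrix whose entries are +l or -l."""
    for flat in product((l, -l), repeat=k * n):
        yield [list(flat[row * n:(row + 1) * n]) for row in range(k)]


k, n, l = 2, 3, 3  # toy sizes for illustration only
count = sum(1 for _ in candidate_keys(k, n, l))
print(count, 2 ** (k * n))  # 64 64
```

Each additional matrix element doubles the number of candidates, which is why exhaustive search becomes impractical as $k$ and $n$ grow.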

b. Attempt to sync with A and B by a third party using its TPM

Since the Tree Parity Machines (TPMs) of parties 𝐴 and 𝐵 generate binary outputs (𝜏 ∈ {−1, 1}), they publicly share their outputs,
compare them, and synchronize their weights accordingly. Now, suppose a third party, 𝐶, also generates an output and attempts to
synchronize their TPM with those of 𝐴 and 𝐵. 𝐴 and 𝐵 will synchronize their weights twice as fast as 𝐶, rendering 𝐶 unable to obtain
the Neural Key.
To illustrate this, consider the eight possible outcomes from the three TPMs (𝐴, 𝐵, and 𝐶): (1, 1, 1), (1, 1, −1), (1, −1, −1), (1, −1, 1),
(−1, −1, −1), (−1, −1, 1), (−1, 1, 1), and (−1, 1, −1). 𝐶 can only synchronize at two of these outcomes: (1, 1, 1) and (−1, −1, −1), with
a probability of 1/4. In contrast, 𝐴 and 𝐵 have four possible outcomes: (1, 1), (1, −1), (−1, 1), and (−1, −1), out of which they can
synchronize at two: (1, 1) and (−1, −1), with a probability of 1/2.
Therefore, 𝐴 and 𝐵’s shared learning process is significantly faster than 𝐶’s individual learning process, effectively preventing 𝐶
from gaining access to the Neural Key.
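
The counting argument above can be verified by enumerating the outcomes. This is a minimal sketch that assumes, as the argument implicitly does, that the three outputs are independent and that every outcome is equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Every combination of outputs (tau_A, tau_B, tau_C).
outcomes = list(product((1, -1), repeat=3))

# Steps at which A and B can update together.
ab_agree = [o for o in outcomes if o[0] == o[1]]
# Steps at which C can update together with A and B.
all_agree = [o for o in outcomes if o[0] == o[1] == o[2]]

p_ab = Fraction(len(ab_agree), len(outcomes))
p_abc = Fraction(len(all_agree), len(outcomes))
print(p_ab, p_abc)  # 1/2 1/4
```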

Conclusions

In order to protect patient data, it is crucial to share medical images securely. In this work, we present a neural cryptography-based
encryption scheme for exchanging medical images. The proposed method for secret image sharing is based on a tree parity machine
paired with masked auto-encoders, which help restore image quality lost to noise when images are shared secretly. Neural cryptography,
a novel kind of public key cryptography, is used here. It is less memory- and processing-intensive since it does not rely
on number theory. The dataset used to evaluate the proposed scheme consisted of CT scans made available to the public by
The Cancer Imaging Archive (TCIA) [25]. The experimental findings were highly encouraging. In future, we intend to evaluate the
proposed framework on larger datasets and compare it with contemporary methods.

Ethics statements

The experiments of this work do not involve human subjects or animals. Further, no data was collected from social media platforms.
The proposed scheme was evaluated on the dataset consisting of CT scans made public by The Cancer Imaging Archive (TCIA) [25].

Declaration of generative AI and AI-assisted technologies in the writing process

During the revision of this manuscript the authors used Google Gemini in order to grammatically correct and rephrase certain
parts of the text for better understanding. After using this tool/service, the authors reviewed and edited the content as needed and
take full responsibility for the content of the publication.


Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to
influence the work reported in this paper.

CRediT authorship contribution statement

Kishore Kumar: Conceptualization, Methodology, Software, Writing – original draft. Sarvesh Tanwar: Supervision, Writing –
review & editing. Shishir Kumar: Supervision, Writing – review & editing.

Acknowledgments

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References

[1] V. Shaik, K. Dr. Natarajan, Flexible and cost-effective cryptographic encryption algorithm for securing unencrypted database files at rest and in transit, MethodsX
9 (2022) 101924.
[2] B. Zolfaghari, T. Koshiba, The dichotomy of neural networks and cryptography: war and peace, Appl. Syst. Innov. 5 (2022) 61.
[3] G. Ravikumar, K. Venkatachalam, M.A. AlZain, M. Masud, M. Abouhawwash, Neural Cryptography with Fog Computing Network for Health Monitoring Using
IoMT, Comput. Syst. Sci. Eng. 44 (2023) 945–959.
[4] M.U. Bokhari, Q.M. Shallal, A review on symmetric key encryption techniques in cryptography, Int. J. Comput. Appl. 147 (2016).
[5] S. Chandra, S. Paira, S.S. Alam, G. Sanyal, A comparative survey of symmetric and asymmetric key cryptography, 2014 international conference on electronics,
communication and computational engineering (ICECCE), 2014.
[6] S. Mitra, B.U. Shankar, Medical image analysis for cancer management in natural computing framework, Inf. Sci. (Ny) 306 (2015) 111–131.
[7] M. Kidoh, K. Shinoda, M. Kitajima, K. Isogawa, M. Nambu, H. Uetani, K. Morita, T. Nakaura, M. Tateishi, Y. Yamashita, others, Deep learning based noise
reduction for brain MR imaging: tests on phantoms and healthy volunteers, Magnetic Res. Med. Sci. 19 (2020) 195.
[8] X. Liu, L. Song, S. Liu, Y. Zhang, A review of deep-learning-based medical image segmentation methods, Sustainability. 13 (2021) 1224.
[9] Y. Chen, C. Tang, R. Ye, Cryptanalysis and improvement of medical image encryption using high-speed scrambling and pixel adaptive diffusion, Signal Process.
167 (2020) 107286.
[10] Y. Wu, L. Zhang, S. Berretti, S. Wan, Medical image encryption by content-aware DNA computing for secure healthcare, IEEE Trans. Ind. Inform. 19 (2022)
2089–2098.
[11] T. Dong, T. Huang, Neural cryptography based on complex-valued neural network, IEEE Trans. Neural Netw. Learn. Syst. 31 (2019) 4999–5004.
[12] S. Jeong, C. Park, D. Hong, C. Seo, N. Jho, Neural cryptography based on generalized tree parity machine for real-life systems, Security and communication
networks 2021 (2021) 1–12.
[13] L.K.L. Ng, S.S.M. Chow, Sok: cryptographic neural-network computation, 2023 IEEE Symposium on Security and Privacy (SP), 2023.
[14] F. Aliabadi, M.-H. Majidi, S. Khorashadizadeh, Chaos synchronization using adaptive quantum neural networks and its application in secure communication and
cryptography, Neural computing and applications 34 (2022) 6521–6533.
[15] I. Kanter, W. Kinzel, E. Kanter, Secure exchange of information by synchronization of neural networks, Europhys. Lett. 57 (2002) 141.
[16] J. Wu, W. Xia, G. Zhu, H. Liu, L. Ma, J. Xiong, Image encryption based on adversarial neural cryptography and SHA controlled chaos, J. Mod. Opt. 68 (2021)
409–418.
[17] S.S. Dhanda, P. Jindal, B. Singh, D. Panwar, A compact and efficient AES-32GF for encryption in small IoT devices, MethodsX. 11 (2023) 102491.
[18] J. Daemen, V. Rijmen, AES proposal: Rijndael, 1999.
[19] A.M. Shiddiqi, E.D. Yogatama, D.A. Navastara, Resource-aware video streaming (RAViS) framework for object detection system using deep learning algorithm,
MethodsX. 11 (2023) 102285.
[20] P. Kamat, S. Kumar, S. Patil, K. Kotecha, Anomaly-informed remaining useful life estimation (AIRULE) of bearing machinery using deep learning framework,
MethodsX. 12 (2024) 102555.
[21] V. Agrawal, J. Jagtap, S. Patil, K. Kotecha, Performance analysis of hybrid deep learning framework using a vision transformer and convolutional neural network
for handwritten digit recognition, MethodsX. 12 (2024) 102554.
[22] C.-Y. Liou, W.-C. Cheng, J.-W. Liou, D.-R. Liou, Autoencoder for words, Neurocomputing. 139 (2014) 84–96.
[23] C.W. Ko, J. Huh, J.-W. Park, Deep learning program to predict protein functions based on sequence information, MethodsX. 9 (2022) 101622.
[24] K. He, X. Chen, S. Xie, Y. Li, P. Dollár, R. Girshick, Masked autoencoders are scalable vision learners, in: Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition, 2022.
[25] K. Clark, B. Vendt, K. Smith, J. Freymann, J. Kirby, P. Koppel, S. Moore, S. Phillips, D. Maffitt, M. Pringle, others, The Cancer Imaging Archive (TCIA): maintaining
and operating a public information repository, J. Digit. Imaging 26 (2013) 1045–1057.
[26] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, others, An image is worth
16x16 words: transformers for image recognition at scale, arXiv preprint arXiv:2010.11929, 2020.
[27] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is all you need, Adv. Neural Inf. Process. Syst. 30
(2017).
[28] A. Shamir, How to share a secret, Commun ACM 22 (1979) 612–613.
[29] T.-H. Chen, Y.-S. Lee, W.-L. Huang, J.S.-T. Juan, Y.-Y. Chen, M.-J. Li, Quality-adaptive visual secret sharing by random grids, J. Syst. Software 86 (2013)
1267–1274.
[30] M. Gupta, M. Gupta, M. Deshmukh, Single secret image sharing scheme using neural cryptography, Multimed. Tools. Appl. 79 (2020) 12183–12204.
