MAN–C: A Masked Autoencoder Neural Cryptography Based Encryption Scheme for CT Scan Images
MethodsX
journal homepage: www.elsevier.com/locate/methodsx
Article info
Method name: MAN–C: a Masked Autoencoder Neural Cryptography based Encryption Scheme for CT Scan Images
Keywords: Secret sharing; Auto encoder; Tree parity machine; Hebbian learning; Image encryption; Masked transformer; Neural cryptography
Abstract
Sharing medical images securely is very important for keeping patients’ data confidential. In this paper we propose MAN–C: a Masked Autoencoder Neural Cryptography based encryption scheme for sharing medical images. The proposed technique builds upon recently proposed masked autoencoders. In the original paper, masked autoencoders are used as scalable self-supervised learners for computer vision which reconstruct portions of originally patched images. Here, the facility to obfuscate portions of the input image and the ability to reconstruct original images are used as an encryption-decryption scheme. In the final form, masked autoencoders are combined with neural cryptography consisting of a tree parity machine and the Shamir scheme for secret image sharing. The proposed technique MAN–C helps to recover the loss in the image due to noise during secret sharing.
• Uses recently proposed masked autoencoders, originally designed as scalable self-supervised learners for computer vision, in an encryption-decryption setup.
• Combines autoencoders with neural cryptography. The advantage our proposed approach offers over existing techniques is that (i) neural cryptography is a new type of public key cryptography that is not based on number theory, requires less computing time and memory and is non-deterministic in nature, and (ii) masked auto-encoders provide an additional level of obfuscation through their deep learning architecture.
• The proposed scheme was evaluated on a dataset consisting of CT scans made public by The Cancer Imaging Archive (TCIA). The proposed method produces better RMSE values between the input and the encrypted image and comparable correlation values between the input and the output image with respect to the existing techniques.
Specifications Table
Subject area: Computer Science
More specific subject area: Cryptography, Deep Learning
Name of your method: MAN–C: a Masked Autoencoder Neural Cryptography based Encryption Scheme for CT Scan Images
Name and reference of original method: Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick; Masked Autoencoders Are Scalable Vision Learners, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 16000–16009. https://ieeexplore.ieee.org/document/9879206
Resource availability: https://imaging.cancer.gov/informatics/cancer_imaging_archive.htm
https://github.com/facebookresearch/mae
https://github.com/oke-aditya/neural_encryption_networks
∗ Corresponding author.
E-mail address: [email protected] (K. Kumar).
https://doi.org/10.1016/j.mex.2024.102738
Received 1 February 2024; Accepted 27 April 2024
Available online 28 April 2024
2215-0161/© 2024 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/)
K. Kumar, S. Tanwar and S. Kumar MethodsX 12 (2024) 102738
Background
One of the main goals in the modern world is to protect the confidentiality of information from unauthorized access. This has led to the development of numerous data encryption techniques [1–3]. Encryption techniques fall into two broad classes: symmetric key encryption and asymmetric key encryption. Symmetric key encryption [4] uses a shared key that is known only to the sender and the receiver. In asymmetric key encryption [5], on the other hand, the receiver has two keys: a private key and a public key. The private key is known only to the receiver, while the public key is known to everyone. The sender uses the receiver’s public key to encrypt the message, and the receiver uses the private key to decrypt it.
Medical diagnosis involves a multitude of digital imaging technologies, such as ultrasound, magnetic resonance imaging, computed tomography, and positron emission tomography. The diagnostic images are communicated and stored extensively for a variety of specialised functions, including feature selection, data hiding, image denoising, compression, and segmentation [6–8]. Additionally, a lot of private patient information is frequently included when medical images are distributed online or via a hospital intranet. However, hospital intranets lack significant security tools, and the internet also faces significant problems such as malicious interference and data leakage [9]. Encrypting medical images is an effective way to protect them from these threats [10].
Neural cryptography is a new type of public key cryptography that is not based on number theory and requires less computing time and memory [11–14]. It builds key exchange protocols on the synchronisation phenomenon in neural networks rather than on conventional number theory [15]. Furthermore, even if an attacker is aware of the specifics of the algorithm and has access to the communication channel, neural cryptography ensures that the key cannot be deduced. Both parties involved in the key exchange protocol obtain a secret key by synchronising neural networks of the same structure, referred to as Tree Parity Machines (TPMs).
In standard neural cryptography, the parties generate their own TPMs using common parameters and random starting weight values [16]. They then create a common random input vector and compute their own output values by feeding it to their TPMs. They exchange the output values and update their own weights using a specific learning algorithm. The initial values assigned to the weight vectors must be kept secret, as in other public key cryptography systems. The input and output values, however, are discoverable by anyone, including adversaries. These steps are repeated until the weight vectors are fully synchronised. After full synchronisation, the synchronised weight vectors are used as the shared secret key. The exchanged keys can be used in several applications, much like those of other public key cryptography systems. For instance, the key produced by neural cryptography may be used in block ciphers such as SDES, AES [17], and Rijndael for encryption [18].
Autoencoders are a subset of the neural network family used in deep learning [19,20]. They are mostly employed in computer vision [21] and natural language processing (NLP) [22,23]. They perform unsupervised learning to efficiently learn codings for unlabelled data. In an autoencoder, the encoding is validated by regenerating the encoder’s input. The encoder is a neural network that learns how to represent a collection of data. The primary goal of this unsupervised learning is dimensionality reduction: the intention is to train the network to reject irrelevant data representations [24].
Data masking is a strategy used to obscure non-relevant information from the model so that it can be trained on information more relevant to a given task. A masked autoencoder can be considered as the use of masked data with autoencoders [24]. When used with data encryption, masked autoencoders can provide an additional level of data obfuscation by masking certain portions of the image, thereby making the encryption stronger. In this paper, we propose a neural cryptography-based encryption scheme for sharing medical images. The proposed technique is based on a tree parity machine combined with masked auto-encoders and the Shamir scheme for secret image sharing. It helps to recover the loss in the image due to noise during secret sharing. We evaluated the proposed scheme on a dataset consisting of CT scans made public by The Cancer Imaging Archive (TCIA) [25].
Method details
In this section, we discuss the proposed methodology and the benchmark dataset used for training and evaluating the model.
Dataset description
The benchmark dataset of CT scan images used in this work was taken from The Cancer Imaging Archive, a public repository [25]. The dataset is described in Table 1. It contains a total of 889,324 CT scan images, which occupy 728.5 GB of storage. Each image has three channels (RGB) and a dimension of 48 × 48. The dataset was split into training and validation sets in a 70:30 ratio. Further, 1% of the validation set was used for evaluating the proposed model.
Proposed architecture
The proposed neural-network-based end-to-end architecture is shown in Fig. 1. It has two main parts: (i) the encryption phase and (ii) the decryption phase. The encryption phase consists of (a) the input image, (b) TPM-based symmetric key generation, (c) the masked auto-encoder, and (d) encryption using the Shamir scheme. The decryption phase consists of (a) image reconstruction from 𝑘 shares, (b) the masked image decoder, and (c) the output image. These two phases are described next.
Table 1
Dataset description.
1. Encryption Phase:
This phase encrypts a CT scan image. The steps followed to encrypt an image are given below.
a. Input Image
This is the starting unit of the proposed architecture, as shown in Fig. 1. An image of size 48 × 48 × 3 is given as input. No preprocessing steps such as image resizing or augmentation are required, as the images are of uniform size and the number of data samples is sufficient for training and testing the model. The only step required is to convert the image into pixel values.
b. TPM-based symmetric key generation
A Tree Parity Machine (TPM) is a neural-network-based way to share keys. Both entities use the same neural network structure and synchronise it to carry out the key exchange. The TPM architecture is shown in Fig. 2, and the algorithm for the weight update is described below.
1. ‘𝑘’ neurons are considered for the hidden layer and, for each of the ‘𝑘’ hidden neurons, ‘𝑛’ neurons are considered at the input layer. The input neurons are initialised with random values between −l and +l.
2. Then, a random weight matrix of size 𝑘 × 𝑛 is generated for the network.
3. The row-wise sum is calculated from the random values and the randomised matrix ‘𝑋’, which is given as input to both machines. After the weighted sums are calculated, the signum function is applied. The final output is given as sigma [𝑘 × 1].
4. The tau value is calculated from the final output. The expected value of sigma can only be −1, 0, or +1, and the same holds for tau.
Next, we use the Hebbian update rule to calculate the weights of the TPM. The following steps outline the weight update strategy:
2. After obtaining 1 as output, we check for the same value on both sides.
3. If both values are equal, the Hebbian update rule is applied to update the weights.
4. After updating, the weight matrices of both machines are compared. If they are equal, a synchronisation state has been achieved and the weight vector can be used as the key.
Fig. 3. Masked autoencoder architecture. Hyperparameters are {optimizer: AdamW, learning rate: 0.015, weight decay: 0.05, optimizer momentum: 𝛽1, 𝛽2 = 0.9, 0.95, batch size: 256, dropout rate: 0.2}. Encoder and decoder depths and architectures are the same as in [24].
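The synchronisation loop described above can be sketched in a few lines of Python. This is a minimal illustration of the textbook TPM formulation, not the authors' implementation: it assumes inputs drawn from {−1, +1}, integer weights bounded by a synaptic depth `l`, a signum that maps 0 to −1, and a Hebbian update applied only to hidden units whose output equals the common `tau`. The class and function names are hypothetical.

```python
import numpy as np


class TreeParityMachine:
    """Toy TPM with k hidden units, n inputs per unit, weights in [-l, l]."""

    def __init__(self, k, n, l, rng):
        self.k, self.n, self.l = k, n, l
        # Secret initial weights: integers drawn uniformly from [-l, l].
        self.w = rng.integers(-l, l + 1, size=(k, n))

    def output(self, x):
        # sigma: sign of each hidden unit's weighted sum (0 is mapped to -1).
        h = np.sum(self.w * x, axis=1)
        self.sigma = np.where(h > 0, 1, -1)
        # tau: product of the hidden-unit outputs.
        self.tau = int(np.prod(self.sigma))
        return self.tau

    def hebbian_update(self, x, other_tau):
        # Update only when both machines produced the same output, and only
        # the hidden units whose sigma agrees with that common output.
        if self.tau != other_tau:
            return
        for i in range(self.k):
            if self.sigma[i] == self.tau:
                self.w[i] = np.clip(self.w[i] + x[i] * self.tau, -self.l, self.l)


def synchronise(k=3, n=4, l=3, seed=0, max_steps=100_000):
    """Run mutual learning until both weight matrices are identical."""
    rng = np.random.default_rng(seed)
    a = TreeParityMachine(k, n, l, rng)
    b = TreeParityMachine(k, n, l, rng)
    for step in range(1, max_steps + 1):
        x = rng.choice([-1, 1], size=(k, n))  # public random input
        tau_a, tau_b = a.output(x), b.output(x)  # outputs are exchanged publicly
        a.hebbian_update(x, tau_b)
        b.hebbian_update(x, tau_a)
        if np.array_equal(a.w, b.w):
            return a.w, step  # the synchronised weights act as the shared key
    raise RuntimeError("no synchronisation within max_steps")


if __name__ == "__main__":
    key, steps = synchronise()
    print("synchronised after", steps, "steps")
    print(key)
```

The direct comparison of the two weight matrices is only possible in a simulation; in a real exchange the parties would have to infer synchronisation indirectly, for example from a long run of matching outputs.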
c. Masked Auto-encoder
This is the main contribution of the proposed end-to-end framework. A deep learning, vision-transformer-based architecture [24,26] is used to create the masked image from the original input image. The architecture of the autoencoder is given in Fig. 3. Masking hides some portion of the image and lets the model learn to generate the masked content. Masking preserves the integrity of the data within the image. The masked autoencoder helps in reconstructing original images from the masked images. The input image is first converted into nine patches. The patch embeddings, together with position embeddings, are then given as input to the transformer encoder. The encoder consists of multi-head attention as presented by Vaswani et al. [27]. The features extracted by the network are then given to the decoder for the generation of the original image, as shown in Fig. 3. The masked images produced by this autoencoder are encrypted using the Shamir scheme discussed next.
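To make the patching and masking step concrete, the following NumPy sketch splits a 48 × 48 × 3 image into nine 16 × 16 patches and zeroes out a random subset of them. It only illustrates the input side of a masked autoencoder; the transformer encoder and decoder are not modelled. The function names are hypothetical, and the 0.75 mask ratio is an assumption taken from the default in the original masked autoencoder paper [24], since the ratio used here is not stated.

```python
import numpy as np


def patchify(image, grid=3):
    """Split an (H, W, C) image into grid*grid non-overlapping patches."""
    h, w, c = image.shape
    ph, pw = h // grid, w // grid
    # (grid, ph, grid, pw, c) -> (grid, grid, ph, pw, c) -> (grid*grid, ph, pw, c)
    patches = image.reshape(grid, ph, grid, pw, c).transpose(0, 2, 1, 3, 4)
    return patches.reshape(grid * grid, ph, pw, c)


def random_mask(patches, mask_ratio=0.75, seed=0):
    """Zero out a random subset of patches; return masked copy and masked indices."""
    rng = np.random.default_rng(seed)
    n = patches.shape[0]
    n_masked = int(round(n * mask_ratio))
    masked_idx = rng.permutation(n)[:n_masked]
    masked = patches.copy()
    masked[masked_idx] = 0  # hidden patches carry no pixel information
    return masked, np.sort(masked_idx)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    image = rng.integers(0, 256, size=(48, 48, 3), dtype=np.uint8)  # stand-in for a CT slice
    patches = patchify(image)
    masked, idx = random_mask(patches)
    print(patches.shape)  # (9, 16, 16, 3)
    print("masked patch indices:", idx)
```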
2. Generate masked image using k-shares: The masked decoder used for generating the image is part of the autoencoder. Its capability to generate the original image from the masked image is due to its weights, trained in the end-to-end framework. The output of this layer is the reconstructed original image.
3. Apply masked decoder to get the original image: At this step, the output image is produced, which is equivalent to the original image.
Method validation
In this section, we discuss the experimental results and analysis of the proposed end-to-end framework. All the experiments were performed on an Intel(R) CPU v4 @ 2.20 GHz, 64 GB RAM machine using Anaconda with Python 3.9. The proposed framework can be used with text, numbers, and images. For the experimental analysis, the RGB images described in the Dataset description section are used. For detailed analysis, commonly used evaluation metrics are computed on the original, intermediate and reproduced images. The evaluation metrics are discussed next.
a. Correlation
It measures how two images (the secret image and the recovered image) are related to each other. The correlation values lie between −1 and +1, where −1 indicates that the two images are opposite to each other and +1 indicates that both images are equivalent. The correlation coefficient $r$ is given as:
$$r = \frac{\sum_{m} \sum_{n} \left( A_{mn} - \bar{A} \right) \left( B_{mn} - \bar{B} \right)}{\sqrt{\left( \sum_{m} \sum_{n} \left( A_{mn} - \bar{A} \right)^2 \right) \left( \sum_{m} \sum_{n} \left( B_{mn} - \bar{B} \right)^2 \right)}} \tag{1}$$
where $A$ denotes the secret image matrix, $B$ denotes the recovered image matrix, and $\bar{A}$ and $\bar{B}$ are the means of $A$ and $B$, respectively.
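As a rough illustration, Eq. (1) can be computed directly with NumPy. The function name `correlation` is hypothetical, and the sketch assumes both images have the same shape and are not constant (otherwise the denominator is zero).

```python
import numpy as np


def correlation(a, b):
    """Correlation coefficient r between two equally sized images, as in Eq. (1)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    da, db = a - a.mean(), b - b.mean()  # deviations from the image means
    return float((da * db).sum() / np.sqrt((da ** 2).sum() * (db ** 2).sum()))


if __name__ == "__main__":
    img = np.array([[1, 2], [3, 4]])
    print(correlation(img, img))        # identical images give r close to +1
    print(correlation(img, 255 - img))  # inverted image gives r close to -1
```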
b. RMSE
Root mean square error is the square root of the mean of the squared errors. It measures the difference in quality between two images. The lower the RMSE, the more similar the images. It can be calculated as:
$$RMSE = \sqrt{\frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \left( X(i,j) - Y(i,j) \right)^2} \tag{2}$$
where $m \times n$ is the dimension of the image, $X(i,j)$ is the pixel value at $(i,j)$ in the first image, and $Y(i,j)$ is the pixel value at $(i,j)$ in the second image.
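A short NumPy sketch of the RMSE computation follows. It assumes the mean is taken over all pixels of the image (and over all channels for an RGB image); the function name `rmse` is hypothetical.

```python
import numpy as np


def rmse(x, y):
    """Root mean square error between two equally sized images."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return float(np.sqrt(np.mean((x - y) ** 2)))


if __name__ == "__main__":
    first = np.zeros((2, 2))
    second = np.full((2, 2), 3.0)
    print(rmse(first, second))  # every pixel differs by 3, so the RMSE is 3.0
```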
c. PSNR
Peak Signal-to-Noise Ratio is used to measure the quality of images and is measured in decibels (dB). PSNR ≥ 20 dB indicates good quality. The higher the PSNR, the lower the error. PSNR is given as:
$$PSNR = 10 \log_{10} \frac{N^2}{MSE} \tag{3}$$
where $N$ is the maximum fluctuation in the input data and $MSE$ is the mean squared error, i.e., the square of the RMSE in Eq. (2).
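The PSNR formula can be sketched as below. The default `peak=255.0` is an assumption that holds for 8-bit images; the function name `psnr` is hypothetical, and identical images are reported as infinite PSNR because the mean squared error is zero.

```python
import math

import numpy as np


def psnr(x, y, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak is the maximum fluctuation N."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    mse = float(np.mean((x - y) ** 2))
    if mse == 0.0:
        return math.inf  # identical images: no error at all
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    black = np.zeros((4, 4))
    white = np.full((4, 4), 255.0)
    print(psnr(black, white))  # worst case for 8-bit data: MSE = 255^2, PSNR = 0 dB
    print(psnr(black, black))  # inf
```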
An input image to be encrypted using our proposed approach is shown in Fig. 5(a). The shares S1, S2, S3, S4, S5, and S6 are depicted in Figs. 5(b)–(g) and were created using Shamir’s (𝑘, 𝑛) threshold technique with 𝑘 = 6 and 𝑛 = 6. These shares must now be distributed among several people, and this is where the proposed neural key exchange protocol enters the picture: the created shares are encrypted using the keys it produces.
Using the neural key protocol as described, a unique synchronised key is created between the distributor and each individual. The distributor uses the key created during the TPM synchronisation procedure with an individual to encrypt the share to be sent to that individual. A Boolean XOR operation is used to perform the encryption. The encrypted shares can then be sent safely to the relevant individual. Depending on the secret sharing policy, each person will hold one or more of these encrypted shares. By once more performing a Boolean XOR operation on the encrypted share with the appropriate key, they are able to decrypt the shares. Lagrange interpolation is used to recreate the image from these shares. The encoded image fed to the decoder layers of the masked auto-encoder is shown in Fig. 5(h). The decrypted and decoded output image generated by the decoder layers is shown in Fig. 5(i), which is the final output of the proposed framework.
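The XOR step can be illustrated as follows. The paper does not specify how the synchronised TPM weights are turned into key bytes, so the `key_stream` helper below is purely hypothetical: it shifts the weights from [−l, l] to [0, 2l] and repeats them to the length of the share. Repeating a short key like this is for illustration only and would not be adequate in practice.

```python
import numpy as np


def key_stream(weights, length, l):
    """Hypothetical key derivation: shift weights to [0, 2l] and tile to `length` bytes."""
    flat = (np.asarray(weights).ravel() + l).astype(np.uint8)
    reps = -(-length // flat.size)  # ceiling division
    return np.tile(flat, reps)[:length]


def xor_share(share, key):
    """Encrypt or decrypt a uint8 share; XOR is its own inverse."""
    return np.bitwise_xor(share, key.reshape(share.shape))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    share = rng.integers(0, 256, size=(48, 48, 3), dtype=np.uint8)  # stand-in for a Shamir share
    weights = rng.integers(-3, 4, size=(3, 4))                      # stand-in for synchronised TPM weights
    key = key_stream(weights, share.size, l=3)
    encrypted = xor_share(share, key)
    decrypted = xor_share(encrypted, key)
    print(np.array_equal(decrypted, share))  # True
```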
In this section we compare the images used and generated at different stages, using the metrics discussed earlier. Table 2 shows the analysis of the input image and the encrypted shares, and Table 3 shows the analysis of the input image and the output image. In Table 2, the correlation values are almost zero, which shows that the two images are different and cannot be traced between the layers even with the TPM symmetric key. So, the proposed framework provides an additional level of security. Table 2 shows negative correlation values for all the shares of the image, which signifies that the encrypted shares are totally different from the original. Moreover, the PSNR values are less than eight, which indicates lower image quality in comparison to the original image, and the same is signified by the higher RMSE values. Additionally, Table 3 shows the analysis between the input and output images. The values of the different metrics show the equivalence between the images, as also shown in Fig. 4. The correlation value is almost equal to 1 and the PSNR is greater than 20, whereas the RMSE is low (less than 10), which signifies the comparable quality of the output image.
Fig. 5. Experimental result of the proposed scheme on randomly selected image I_1: (a) Secret image. (b)–(g) Shares generated using Shamir’s secret sharing. (h) Encoded image being passed to the masked autoencoder decoder. (i) Decrypted image.
Table 2
Quantitative analysis of input image and encrypted shares for Image 1.
Table 3
Quantitative analysis of input image and output image.
Fig. 6. Quantitative analysis on 10,000 samples: average PSNR values between the input image and the encrypted shares. The red line denotes the mean value.
In spite of this, we accept that our results may not be acceptable for the most critical cases, where absolutely lossless encryption is a must. To this end, we would like to bring the following to the readers’ notice. Our approach combines neural cryptography-based Shamir’s scheme and masked auto-encoders. Each of these approaches can be used as an encryption scheme by itself. In particular, the second part, masked autoencoders, is a soft computing deep learning approach and does not always guarantee 100% accuracy. That is, exact reproduction of the original image is not always guaranteed. Other encryption schemes are typically deterministic in nature, guaranteeing to reproduce the original input as it is. However, the advantage our proposed approach offers is that (i) neural cryptography is a new type of public key cryptography that is not based on number theory and requires less computing time and memory, and (ii) masked auto-encoders provide an additional level of obfuscation through their deep learning architecture. This is evident from the values in Table 2, which show the quantitative analysis of the input image and the encrypted shares: the RMSE between the input image and each encrypted share is very high and the correlation is extremely low. This demonstrates the utility of our proposed approach. Even if the approach is not perfect in terms of delivering absolutely lossless encryption, it exploits the advantage of neural cryptography and lays down a foundation for further research in exploiting neural cryptography and masked auto-encoders for image encryption.
We extended the above analysis to 10,000 images randomly selected from the test dataset. Figs. 6–8 show the PSNR, RMSE, and correlation values, respectively, averaged between the input image and the shares of the 10,000 images. In each figure, the line in the middle denotes the average value. High RMSE values and almost zero correlation values indicate that the input and encoded images are different and cannot be traced between the layers even with the TPM symmetric key. So, the proposed framework provides an additional level of security.
Fig. 7. Quantitative analysis on 10,000 samples: average RMSE values between the input image and the encrypted shares. The blue line denotes the mean value.
Fig. 8. Quantitative analysis on 10,000 samples: average correlation values between the input image and the encrypted shares. The orange line denotes the mean value.
Figs. 9–11 show the PSNR, RMSE, and correlation values, respectively, between the input image and the reconstructed output image on the 10,000 images. The values of the different metrics show the equivalence between the images, as also shown in Fig. 4. The correlation value is almost equal to 1 and the PSNR is greater than 20, whereas the RMSE is low (less than 10), which signifies the comparable quality of the output image.
Here we compare our proposed approach with two existing techniques [29,30]. Chen et al. [30] presented a method for splitting a black and white secret image into multiple shares that, when a certain number (𝑘) of them are overlapped, reveal the original image. This technique relies on randomly dividing the secret image into grids. A special program then picks random grids one by one and hides a single black pixel within each chosen grid. The key feature lies in how these black pixels are hidden. Only by stacking enough grids (𝑘 or more) will the black pixels from each grid overlap and disclose the secret image. Any fewer than 𝑘 grids will provide no information about the hidden image. Gupta et al. [29] present a secure image sharing scheme that utilizes Shamir’s secret sharing and neural cryptography for key generation. Neural cryptography offers a computationally efficient alternative to traditional public key cryptography, making it suitable for resource-constrained environments. Their method ensures secure image sharing without leakage of secret information during transmission over a public channel.
Fig. 9. Quantitative analysis on 10,000 samples: PSNR values between the input image and the reconstructed output image. The red line denotes the mean value.
Fig. 10. Quantitative analysis on 10,000 samples: RMSE values between the input image and the reconstructed output image. The blue line denotes the mean value.
Table 4
Comparison with existing techniques. Quantitative analysis between the input image and the encrypted shares. Reported values were averaged over the set of 10,000 images.
The comparison was done on the above set of 10,000 images randomly selected from the test dataset. It covers (i) average PSNR, RMSE, and correlation values between the input image and the encrypted shares (see Table 4), and (ii) PSNR, RMSE, and correlation values between the input image and the reconstructed output image (see Table 5). It can be seen that in some cases the proposed method does better than the other techniques, e.g., in Table 4 it has a lower correlation than Gupta et al.’s [29] method and a higher RMSE than both Gupta et al.’s [29] and Chen et al.’s [30] methods when comparing the input image with the encrypted shares. Overall, the proposed method does not manage to outperform existing approaches with respect to all the metrics. However, other encryption schemes are typically deterministic in nature, guaranteeing to reproduce the original input as it is. The advantage our proposed approach offers is that (i) neural cryptography is a new type of public key cryptography that is not based on number theory, requires less computing time and memory and is non-deterministic in nature, and (ii) masked auto-encoders provide an additional level of obfuscation through their deep learning architecture. The proposed approach exploits the advantage of neural cryptography and lays down a foundation for further research in exploiting neural cryptography and masked auto-encoders for image encryption.
Fig. 11. Quantitative analysis on 10,000 samples: correlation values between the input image and the reconstructed output image. The orange line denotes the mean value.
Table 5
Comparison with existing techniques. Quantitative analysis between the input image and reconstructed output image. Reported values were averaged over the set of 10,000 images.
Next, we present a security analysis of the proposed technique. Parts of this analysis are taken from [30].
Let us take $T = 25$ as the security text, i.e., the text we want to encrypt. We split $T$ into $p = 6$ pieces (or shares) such that any subset of $k = 3$ shares is sufficient to reconstruct $T$. We choose two random numbers, say $n_1 = 20$ and $n_2 = 30$. Then, our security polynomial is:
$$g(y) = n_0 + n_1 y + n_2 y^2 \tag{4}$$
where $n_0 = T = 25$ is the security text. Then:
$$g(y) = 25 + 20y + 30y^2 \tag{5}$$
Next, we build $p = 6$ points $G_{y-1} = (y, g(y))$ from the polynomial:
$$G_0 = (1, 75) \tag{6}$$
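The share construction above, and the recovery of the security text by Lagrange interpolation, can be reproduced with a few lines of Python. This is a sketch of the integer-arithmetic worked example only (Shamir's scheme is normally defined over a finite field); the function names are hypothetical.

```python
from fractions import Fraction


def make_shares(coeffs, p):
    """Evaluate g(y) = n0 + n1*y + n2*y^2 + ... at y = 1..p."""
    return [(y, sum(c * y ** i for i, c in enumerate(coeffs))) for y in range(1, p + 1)]


def reconstruct_secret(points):
    """Lagrange interpolation evaluated at y = 0, which recovers n0 = T."""
    secret = Fraction(0)
    for j, (yj, gj) in enumerate(points):
        term = Fraction(gj)
        for m, (ym, _) in enumerate(points):
            if m != j:
                term *= Fraction(-ym, yj - ym)  # basis polynomial factor at y = 0
        secret += term
    return int(secret)


if __name__ == "__main__":
    shares = make_shares([25, 20, 30], p=6)
    print(shares)
    # [(1, 75), (2, 185), (3, 355), (4, 585), (5, 875), (6, 1225)]
    print(reconstruct_secret([shares[1], shares[3], shares[4]]))  # 25
```

Any three of the six points give the same result, which is what the threshold $k = 3$ means in this example.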
Now, the more of these shares an attacker finds, the more information they have to reconstruct the security text $T$. Suppose the attacker is able to crack two of these shares, $G_0 = (1, 75)$ and $G_1 = (2, 185)$, but fails to find a third point (share), so the threshold $k = 3$ needed to reconstruct $T$ is not reached. Nonetheless, the attacker tries to combine the information from these two points with the information in the public domain and computes as follows:
$$p = 6, \quad k = 3$$
$$g(y) = n_0 + n_1 y + \cdots + n_{k-1} y^{k-1}$$
$$n_0 = T = 25, \quad n_i \in \mathbb{N} \tag{12}$$
$$75 = T + n_1 \cdot 1 + n_2 \cdot 1^2$$
$$75 = T + n_1 + n_2 \tag{14}$$
$$185 = T + n_1 \cdot 2 + n_2 \cdot 2^2$$
$$185 = T + 2n_1 + 4n_2 \tag{15}$$
5. The attacker knows that $n_2 \in \mathbb{N}$ and tries substituting all possible values of $n_2$, i.e., 0, 1, 2, 3, …, in Eq. (14) to compute the corresponding possible values of $n_1$:
6. If the attacker continues in this way beyond $n_2 = 36$, a negative value is obtained for $n_1$ (for example, $n_2 = 37$ gives $n_1 = 110 - 111 = -1$). This is not possible because $n_1 \in \mathbb{N}$. Therefore, the attacker must stop and concludes that $n_2 \in [0, 1, \ldots, 35, 36]$ must be true.
7. Next, the attacker substitutes $n_1 = 110 - 3n_2$ from Eq. (17), which follows from subtracting Eq. (14) from Eq. (15), into Eq. (14) to get the following:
$$75 = T + \left( 110 - 3n_2 \right) + n_2$$
$$\text{or,} \quad T = 2n_2 - 35 \tag{19}$$
8. Using the possible values of $n_2$ and $n_1$ from Eq. (18), the attacker computes the possible values of $T$ as follows:
$$T \in [-35 + 2 \times 0,\ -35 + 2 \times 1,\ \ldots,\ -35 + 2 \times 35,\ -35 + 2 \times 36] \tag{20}$$
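The attacker's enumeration can be checked numerically. The sketch below hard-codes the two cracked shares of this example, uses $n_1 = 110 - 3n_2$ (the difference of the two share equations) and $T = 75 - n_1 - n_2$, and stops as soon as $n_1$ would become negative. The function name is hypothetical.

```python
def attacker_candidates():
    """Enumerate (n2, n1, T) triples consistent with shares (1, 75) and (2, 185)."""
    candidates = []
    n2 = 0
    while 110 - 3 * n2 >= 0:       # n1 must stay a natural number
        n1 = 110 - 3 * n2          # from 185 - 75 = n1 + 3*n2
        t = 75 - n1 - n2           # from 75 = T + n1 + n2
        candidates.append((n2, n1, t))
        n2 += 1
    return candidates


if __name__ == "__main__":
    cands = attacker_candidates()
    print(len(cands))                 # 37 candidate values of n2 (0..36)
    print(cands[0], cands[-1])        # (0, 110, -35) (36, 2, 37)
    print((30, 20, 25) in cands)      # True: the real coefficients are among them
```

With only two shares the attacker narrows $T$ down to a list of candidates but cannot single out the true value $T = 25$.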
In a brute force attack on the Neural Key Exchange Protocol (NKEP), the attacker attempts to guess all possible weights. The weights are analogous to keys in other encryption systems. Successful decryption requires the attacker to discover all the weights. If the attacker possesses TPMs identical to those used by the sender and receiver, an attack may be feasible: the attacker could synchronise their TPM weights with those of the sender and receiver. In the subsequent discussion, let 𝑆, 𝑅, and 𝐴 denote a sender, a receiver and an attacker, respectively.
1. If the outputs of sender (S) and receiver (R) do not match, the weights in their respective TPMs remain unchanged.
2. In contrast, if the outputs of S, R, and attacker (A) are identical, all three parties update the weights in their TPMs.
3. However, if the outputs of S and R match but A’s output differs, only S and R update their TPMs. This scenario arises when the
learning rates of S and R exceed that of A.
A study in [30] revealed that an attacker can only modify 16% of the weights in NKEP on average. This percentage can be further
reduced by increasing the synaptic depth of the neural network. However, increasing synaptic depth directly increases computational
costs. Therefore, breaking the security of NKEP is computationally challenging due to its classification as an NP-hard problem.
The proposed technique employs Shamir’s Secret Sharing Scheme to generate secret shares using integer arithmetic. These shares
are then distributed securely using the Neural Key Exchange Protocol (NKEP). The generated shares are further encrypted with a
Neural Key. Now, consider one of these secret shares, and let’s assume it needs to be transmitted securely between two parties, 𝐴 and
𝐵. 𝐴 and 𝐵 establish a shared Neural Key by synchronizing their weights. However, during this synchronization process, two types
of attacks are possible:
In a brute-force attack, the attacker must exhaustively test all possible weight combinations. Since our key is represented by a $k \times n$ matrix and, after synchronization, each matrix element can only assume two values, $+L$ or $-L$, the total number of possible combinations is $2^{kn}$, resulting in non-polynomial time complexity.
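To give a feel for how quickly this search space grows, the following snippet evaluates $2^{kn}$ for a few matrix sizes. The sizes are arbitrary illustrative choices, not parameters reported for the proposed scheme, and the function name is hypothetical.

```python
def key_space_size(k, n):
    """Number of k x n matrices whose entries are each +L or -L."""
    return 2 ** (k * n)


if __name__ == "__main__":
    for k, n in [(3, 4), (3, 16), (3, 100)]:
        size = key_space_size(k, n)
        # bit_length() - 1 recovers the exponent k*n for an exact power of two
        print(f"k={k}, n={n}: 2^{size.bit_length() - 1} combinations")
```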
Since the Tree Parity Machines (TPMs) of parties 𝐴 and 𝐵 generate binary outputs ($\tau \in \{-1, 1\}$), they publicly share their outputs, compare them, and synchronize their weights accordingly. Now, suppose a third party, 𝐶, also generates an output and attempts to synchronize their TPM with those of 𝐴 and 𝐵. 𝐴 and 𝐵 will synchronize their weights twice as fast as 𝐶, rendering 𝐶 unable to obtain the Neural Key.
To illustrate this, consider the eight possible outcomes from the three TPMs (𝐴, 𝐵, and 𝐶): (1, 1, 1), (1, 1, −1), (1, −1, −1), (1, −1, 1), (−1, −1, −1), (−1, −1, 1), (−1, 1, 1), and (−1, 1, −1). 𝐶 can only synchronize at two of these outcomes, (1, 1, 1) and (−1, −1, −1), i.e., with a probability of $\frac{1}{4}$. In contrast, 𝐴 and 𝐵 have four possible outcomes, (1, 1), (1, −1), (−1, 1), and (−1, −1), out of which they can synchronize at two, (1, 1) and (−1, −1), i.e., with a probability of $\frac{1}{2}$.
Therefore, 𝐴 and 𝐵’s shared learning process is significantly faster than 𝐶’s individual learning process, effectively preventing 𝐶
from gaining access to the Neural Key.
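The counting argument above can be verified by enumerating the outcomes. The sketch assumes, as the argument itself does, that each TPM output is equally likely to be +1 or −1 and that the outputs are independent.

```python
from fractions import Fraction
from itertools import product

# All eight (A, B, C) output combinations and the ones where all three agree.
triples = list(product([1, -1], repeat=3))
all_agree = [t for t in triples if t[0] == t[1] == t[2]]
p_c = Fraction(len(all_agree), len(triples))

# All four (A, B) output combinations and the ones where A and B agree.
pairs = list(product([1, -1], repeat=2))
ab_agree = [p for p in pairs if p[0] == p[1]]
p_ab = Fraction(len(ab_agree), len(pairs))

if __name__ == "__main__":
    print("C can update together with A and B with probability", p_c)   # 1/4
    print("A and B can update together with probability", p_ab)         # 1/2
```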
Conclusions
To protect patient data, it is crucial to share medical images securely. In this work, we present a neural cryptography-based encryption scheme for sharing medical images. The proposed method for secret image sharing is based on a tree parity machine combined with masked auto-encoders. It helps to recover the loss in the image caused by noise during secret sharing. Neural cryptography, a new type of public key cryptography, is used here. It requires less memory and computing time since it does not rely on number theory. The proposed scheme was evaluated on a dataset of CT scans made public by The Cancer Imaging Archive (TCIA) [25]. The experimental findings were highly encouraging. In future, we intend to evaluate the proposed framework on larger datasets and compare it with contemporary methods.
Ethics statements
The experiments of this work do not involve human subjects or animals. Further, no data was collected from social media platforms.
The proposed scheme was evaluated on the dataset consisting of CT scans made public by The Cancer Imaging Archive (TCIA) [25].
During the revision of this manuscript the authors used Google Gemini in order to grammatically correct and rephrase certain
parts of the text for better understanding. After using this tool/service, the authors reviewed and edited the content as needed and
take full responsibility for the content of the publication.
K. Kumar, S. Tanwar and S. Kumar MethodsX 12 (2024) 102738
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to
influence the work reported in this paper.
Kishore Kumar: Conceptualization, Methodology, Software, Writing – original draft. Sarvesh Tanwar: Supervision, Writing –
review & editing. Shishir Kumar: Supervision, Writing – review & editing.
Acknowledgments
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
References
[1] V. Shaik, K. Natarajan, Flexible and cost-effective cryptographic encryption algorithm for securing unencrypted database files at rest and in transit, MethodsX
9 (2022) 101924.
[2] B. Zolfaghari, T. Koshiba, The dichotomy of neural networks and cryptography: war and peace, Appl. Syst. Innov. 5 (2022) 61.
[3] G. Ravikumar, K. Venkatachalam, M.A. AlZain, M. Masud, M. Abouhawwash, Neural Cryptography with Fog Computing Network for Health Monitoring Using
IoMT, Comput. Syst. Sci. Eng. 44 (2023) 945–959.
[4] M.U. Bokhari, Q.M. Shallal, A review on symmetric key encryption techniques in cryptography, Int. J. Comput. Appl. 147 (2016).
[5] S. Chandra, S. Paira, S.S. Alam, G. Sanyal, A comparative survey of symmetric and asymmetric key cryptography, in: 2014 International Conference on Electronics,
Communication and Computational Engineering (ICECCE), 2014.
[6] S. Mitra, B.U. Shankar, Medical image analysis for cancer management in natural computing framework, Inf. Sci. 306 (2015) 111–131.
[7] M. Kidoh, K. Shinoda, M. Kitajima, K. Isogawa, M. Nambu, H. Uetani, K. Morita, T. Nakaura, M. Tateishi, Y. Yamashita, et al., Deep learning based noise
reduction for brain MR imaging: tests on phantoms and healthy volunteers, Magn. Reson. Med. Sci. 19 (2020) 195.
[8] X. Liu, L. Song, S. Liu, Y. Zhang, A review of deep-learning-based medical image segmentation methods, Sustainability 13 (2021) 1224.
[9] Y. Chen, C. Tang, R. Ye, Cryptanalysis and improvement of medical image encryption using high-speed scrambling and pixel adaptive diffusion, Signal Process.
167 (2020) 107286.
[10] Y. Wu, L. Zhang, S. Berretti, S. Wan, Medical image encryption by content-aware DNA computing for secure healthcare, IEEE Trans. Industr. Inform. 19 (2022)
2089–2098.
[11] T. Dong, T. Huang, Neural cryptography based on complex-valued neural network, IEEE Trans. Neural Netw. Learn. Syst. 31 (2019) 4999–5004.
[12] S. Jeong, C. Park, D. Hong, C. Seo, N. Jho, Neural cryptography based on generalized tree parity machine for real-life systems, Secur. Commun.
Netw. 2021 (2021) 1–12.
[13] L.K.L. Ng, S.S.M. Chow, SoK: cryptographic neural-network computation, in: 2023 IEEE Symposium on Security and Privacy (SP), 2023.
[14] F. Aliabadi, M.-H. Majidi, S. Khorashadizadeh, Chaos synchronization using adaptive quantum neural networks and its application in secure communication and
cryptography, Neural Comput. Appl. 34 (2022) 6521–6533.
[15] I. Kanter, W. Kinzel, E. Kanter, Secure exchange of information by synchronization of neural networks, Europhys. Lett. 57 (2002) 141.
[16] J. Wu, W. Xia, G. Zhu, H. Liu, L. Ma, J. Xiong, Image encryption based on adversarial neural cryptography and SHA controlled chaos, J. Mod. Opt. 68 (2021)
409–418.
[17] S.S. Dhanda, P. Jindal, B. Singh, D. Panwar, A compact and efficient AES-32GF for encryption in small IoT devices, MethodsX 11 (2023) 102491.
[18] J. Daemen, V. Rijmen, AES proposal: Rijndael, 1999.
[19] A.M. Shiddiqi, E.D. Yogatama, D.A. Navastara, Resource-aware video streaming (RAViS) framework for object detection system using deep learning algorithm,
MethodsX 11 (2023) 102285.
[20] P. Kamat, S. Kumar, S. Patil, K. Kotecha, Anomaly-informed remaining useful life estimation (AIRULE) of bearing machinery using deep learning framework,
MethodsX 12 (2024) 102555.
[21] V. Agrawal, J. Jagtap, S. Patil, K. Kotecha, Performance analysis of hybrid deep learning framework using a vision transformer and convolutional neural network
for handwritten digit recognition, MethodsX 12 (2024) 102554.
[22] C.-Y. Liou, W.-C. Cheng, J.-W. Liou, D.-R. Liou, Autoencoder for words, Neurocomputing 139 (2014) 84–96.
[23] C.W. Ko, J. Huh, J.-W. Park, Deep learning program to predict protein functions based on sequence information, MethodsX 9 (2022) 101622.
[24] K. He, X. Chen, S. Xie, Y. Li, P. Dollár, R. Girshick, Masked autoencoders are scalable vision learners, in: Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition, 2022.
[25] K. Clark, B. Vendt, K. Smith, J. Freymann, J. Kirby, P. Koppel, S. Moore, S. Phillips, D. Maffitt, M. Pringle, et al., The Cancer Imaging Archive (TCIA): maintaining
and operating a public information repository, J. Digit. Imaging 26 (2013) 1045–1057.
[26] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, et al., An image is worth
16x16 words: transformers for image recognition at scale, arXiv preprint arXiv:2010.11929, 2020.
[27] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is all you need, Adv. Neural Inf. Process. Syst. 30
(2017).
[28] A. Shamir, How to share a secret, Commun. ACM 22 (1979) 612–613.
[29] T.-H. Chen, Y.-S. Lee, W.-L. Huang, J.S.-T. Juan, Y.-Y. Chen, M.-J. Li, Quality-adaptive visual secret sharing by random grids, J. Syst. Software 86 (2013)
1267–1274.
[30] M. Gupta, M. Gupta, M. Deshmukh, Single secret image sharing scheme using neural cryptography, Multimed. Tools Appl. 79 (2020) 12183–12204.