A Review of DNA Cryptography
keys [17]. Only the one possessing the correct key can retrieve the original information. There are 2 main types of encryption algorithms: (a) Symmetric encryption algorithms, such as the advanced encryption standard (AES), data encryption standard (DES), and triple DES (3DES) algorithms, use the same key for both the encryption and decryption processes, typically offering fast operation suitable for large-scale information encryption. (b) Asymmetric encryption algorithms, such as the Rivest–Shamir–Adleman (RSA) and elliptic curve cryptography (ECC) algorithms, also known as public-key encryption, employ a pair of keys: a public key that can be shared openly for encryption and a private key that must be kept secret for decryption. This type of encryption algorithm is also commonly used for digital signatures, secure key exchange, and authentication.
Hiding conceals secret information by embedding it within nonsensitive data (cover), making it inaccessible or imperceptible to unauthorized users [18]. There are 2 main types of hiding techniques: (a) Steganography conceals secret information within other media (such as images, audio, video, or documents), making the existence of the information undetectable. This technique is typically applied in scenarios requiring high levels of secrecy, such as secret communication or hiding sensitive information. (b) Digital watermarking embeds identifying information (such as copyright information) into digital media (such as audio, video, or images) to achieve copyright protection and content authentication. Steganography focuses on imperceptible concealment, while digital watermarking demands greater robustness.
DNA cryptography is an extension of conventional cryptography into the field of life sciences, offering new mechanisms for information security by exploring the potential capabilities of biological properties and techniques. Since Adleman demonstrated the massive parallelism of DNA computation [8], researchers have turned their attention to using DNA to decipher conventional cryptographic algorithms such as DES [19,20], RSA [21,22], NTRU [23], Diffie–Hellman key exchange [24,25], and knapsack [26]. Concurrently, Clelland et al. [27] conducted a DNA cryptography experiment to hide the military secret message “June 6 Invasion: Normandy” within DNA microdots. As DNA cryptography is a nascent discipline, advancements are largely inspired by the principles of conventional cryptography and have led to methods such as DNA encryption and DNA hiding, which will be discussed in detail.
Molecular Biology Background of DNA Cryptography
To achieve the security of confidential information, DNA cryptography utilizes underlying principles and technologies involved in genetic information processing, such as the triplet codons for amino acids, alternative splicing, polymerase chain reaction (PCR) amplification, and gene manipulation operations [28,29]. In this section, we provide a basic introduction to some relevant biology concepts and biotechnologies.
Double helical structure of DNA
DNA is composed of 4 deoxyribonucleotide triphosphates (dNTPs) [30]. Each dNTP consists of a deoxyribose sugar, a phosphate group, and one nitrogenous base, adenine (A), guanine (G), cytosine (C), or thymine (T). Phosphodiester bonds link the phosphate group at the 5′ position of one nucleotide to the 3′ position of the next, while the bases adhere to a specific complement rule (A always pairs with T, and G with C). Following this principle, 2 complementary DNA strands form a stable double helix structure through hydrogen bonds (as shown in Fig. 2).
Central dogma
The central dogma elucidates the basic principles of genetic information transmission from DNA to ribonucleic acid (RNA) and ultimately to protein (as shown in Fig. 3). Initially, the double helix structure of DNA partially unwinds at specific locations. Under the guidance of transcription factors, RNA
Fig. 2. Schematic diagram of the double helical structure of DNA. On the left is shown a general schematic of the double helix, while on the right is shown the nucleotide structure, hydrogen bonds (A–T and C–G pairing), and phosphodiester bonds.
Base operations
In previous studies, researchers designed base operations as in digital computing to perform information transformation using 2 DNA sequences [36]. Table 1 shows 3 rules for base exclusive or (XOR), addition (ADD), and subtraction (SUB) operations.
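
The base XOR, ADD, and SUB operations can be illustrated with a minimal sketch. Table 1 is not reproduced in this section, so the 2-bit encoding used here (A=00, C=01, G=10, T=11) is an assumption, and the rules in [36] may differ. The function name `base_op` is hypothetical. Under this encoding, XOR is bitwise on the 2-bit values, and ADD and SUB are arithmetic modulo 4.

```python
# Assumed 2-bit encoding (not taken from Table 1): A=00, C=01, G=10, T=11.
BASES = "ACGT"
INDEX = {b: i for i, b in enumerate(BASES)}


def base_op(x: str, y: str, op: str) -> str:
    """Apply XOR, ADD, or SUB base by base to two equal-length DNA strings."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    out = []
    for a, b in zip(x, y):
        i, j = INDEX[a], INDEX[b]
        if op == "XOR":
            k = i ^ j          # bitwise XOR of the 2-bit codes
        elif op == "ADD":
            k = (i + j) % 4    # addition modulo 4
        elif op == "SUB":
            k = (i - j) % 4    # subtraction modulo 4
        else:
            raise ValueError(f"unknown operation: {op}")
        out.append(BASES[k])
    return "".join(out)


plain = "ACGTTGCA"   # illustrative DNA-encoded data
key = "GGATCCTA"     # illustrative DNA-encoded key
cipher_xor = base_op(plain, key, "XOR")
cipher_add = base_op(plain, key, "ADD")
```

With this encoding, XOR is its own inverse and SUB undoes ADD, so `base_op(cipher_xor, key, "XOR")` and `base_op(cipher_add, key, "SUB")` both return the original sequence.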
Fig. 3. Schematic diagram of the central dogma. Genetic information is transferred from DNA by transcription to pre-mRNA, spliced to form mRNA, and finally translated
into proteins by ribosomes.
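
The flow in Fig. 3 can be mimicked in software with a minimal sketch, assuming the standard genetic code and an idealized splicing step in which intron positions are given explicitly. The example gene, the intron coordinates, and the function names are hypothetical and serve only to illustrate transcription, splicing, and translation.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), AMINO)}


def transcribe(coding_strand: str) -> str:
    """Return pre-mRNA with the same sequence as the coding strand (T -> U)."""
    return coding_strand.replace("T", "U")


def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as (start, end) half-open index ranges."""
    out, pos = [], 0
    for start, end in sorted(introns):
        out.append(pre_mrna[pos:start])
        pos = end
    out.append(pre_mrna[pos:])
    return "".join(out)


def translate(mrna: str) -> str:
    """Translate codon by codon until a stop codon or the end of the mRNA."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if aa == "*":          # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical gene: exon "ATGGCC", intron "GTAAGT", exon "AAATGA".
gene = "ATGGCC" + "GTAAGT" + "AAATGA"
introns = [(6, 12)]
mrna = splice(transcribe(gene), introns)
protein = translate(mrna)
```

Here the spliced mRNA is `AUGGCCAAAUGA`, which translates to the peptide `MAK` before the stop codon. In a pseudo-DNA scheme, the codon table and the intron positions would play the role of secret parameters.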
Pseudo-DNA Cryptography
“Pseudo-DNA cryptography” refers to introducing biological complexity into conventional encryption and aims to provide enhanced security for binary data. In 2009, Kang Ning devised a text encryption method by simulating the central dogma [37]. Specifically, the plaintext is encoded into DNA sequences, then spliced and translated to produce a protein-formatted ciphertext. The secret key consists of the genetic code table used for translation and the patterns and locations of splicing. This method offers powerful resistance to brute force attacks by leveraging the complex mechanisms of the gene translation process.
In 2010, Zhang et al. [38] proposed an image encryption method (as shown in Fig. 5A) that integrates 1- and 2-dimensional logistic maps with DNA addition and base complement operations to scramble image information. This method effectively combines base operations with chaos theory, providing a promising perspective for image encryption. In 2012, Liu et al. [39] employed the piecewise linear chaotic map system to scramble a DNA-formatted image, simulating the DNA base pairing rules to modify each nucleotide. Zhang et al. [40] encoded and segmented an image into multiple short DNA sequences and utilized the Chen hyperchaotic system for selecting positions to expand, truncate, delete, or insert these sequences. Enayatifar et al. [41] encrypted images by integrating genetic algorithms with a logistic map, effectively reducing the correlation among adjacent pixels. Kalpana and Murali [42] increased encryption complexity by selecting distinct coding rules for different color channels (e.g., red, green, blue). However, static encoding rules are insufficient to defend against complex attacks. In 2019, Chai et al. [43] employed a chaotic system to dynamically select coding rules for each pixel, designing an image encryption method sensitive enough to resist known-plaintext and chosen-plaintext attacks. In 2020, Zefreh [44] introduced 2 cyclic shift algebraic operators based on DNA sequences to enhance DNA diffusion as shown in Fig. 5B. Concretely, chaotic systems convert the original image and secret key into DNA matrices. Subsequently, the image matrices are diffused at the DNA level by referencing the key matrices and applying base operations such as addition, subtraction, right-circular shift, and left-circular shift. Chidambaram et al. [45] combined the chaotic-based key generation system with DNA diffusion by base operations, ensuring the confidentiality, integrity, and availability (CIA) of medical images stored in the cloud. To further enhance encryption security, researchers explored more complex and higher-dimensional chaotic systems. In 2023, Li and Chen [46] proposed a 6-dimensional hyperchaotic system and base XOR operations to encrypt color images. In 2024, Yu et al. [47] developed a dynamic encryption network incorporating 12 types of base operations where a hyperchaotic system was used to generate random keys. Wu et al. [48] proposed a medical image encryption method consisting of random DNA encoding and content-aware DNA permutation and diffusion. The former utilizes the piecewise linear chaotic map system to select different binary-to-base encoding rules for each pixel, while the latter breaks the correlation between adjacent pixels using the hyperchaotic Lorenz system and base operations (ADD, SUB, and XOR). Liu et al. [49] proposed a medical image encryption method integrating a new spatiotemporal chaotic model with novel asymmetric base operations designed using matrix calculations. Alawida [50] designed a DNA tree algorithm and a new chaotic state machine map to encrypt images. The chaotic map integrates a finite state machine, while the DNA tree randomly generates a table that maps pixel values (0 to 255) to DNA bases (e.g., 150 to TAAA) and guides the performance of DNA substitution operations (using a substitution box).
Besides combining chaotic systems to design pseudo-DNA image encryption methods, researchers also employ pseudo-DNA operations to develop other security techniques. Sadeg et al. [51] achieved a symmetric block encryption algorithm by designing new DNA operation rules and exploiting codon degeneracy. Babu et al. [52] designed a pseudo-DNA communication model that simulated DNA transcription and cooperative communication among organelles. Thangavel and Varalakshmi [53] applied the central dogma to secure data in the cloud, providing a more randomized and prudent system in practice. Majumdar et al. [54] utilized the redundancy of alternative splicing to design a substitution box for encryption and stored related pieces of information, such as time, date, and owner ID, in introns to ensure information integrity.
Furthermore, researchers utilized the randomness of genetic information to select natural DNA sequences as secret keys. In 2015, Najaftorkaman and Kazazi [55] chose natural DNA from the National Center of Biotechnology Information (NCBI) gene library as a secret key to implement the Vigenère cipher encryption [56] at the DNA level as shown in Fig. 5C. Similarly, Grass et al. [57] used short tandem repeat segments from the
Fig. 6. The one-time pad encryption method creates a codebook of <plain, cipher> DNA word pairs to replace plaintext (P) with ciphertext (C) [61].
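
The codebook idea in Fig. 6 can be sketched in software as a per-position substitution of DNA words. This is only an illustration under stated assumptions, not the construction of [61]: the word length of 2 nt, the seeded generator, and all function names are hypothetical, and each position of the message gets its own independently shuffled <plain, cipher> codebook that is meant to be used once.

```python
import random
from itertools import product

WORD_LEN = 2  # hypothetical word length, kept small so codebooks stay readable
WORDS = ["".join(p) for p in product("ACGT", repeat=WORD_LEN)]


def make_pad(n_words, rng):
    """Build one independent random <plain, cipher> codebook per word position."""
    pad = []
    for _ in range(n_words):
        shuffled = WORDS[:]
        rng.shuffle(shuffled)
        pad.append(dict(zip(WORDS, shuffled)))
    return pad


def split_words(seq):
    """Cut a DNA string into consecutive words of WORD_LEN bases."""
    if len(seq) % WORD_LEN:
        raise ValueError("sequence length must be a multiple of WORD_LEN")
    return [seq[i:i + WORD_LEN] for i in range(0, len(seq), WORD_LEN)]


def encrypt(plain, pad):
    """Replace each plaintext word using the codebook for its position."""
    words = split_words(plain)
    if len(words) > len(pad):
        raise ValueError("pad is shorter than the message")
    return "".join(book[w] for w, book in zip(words, pad))


def decrypt(cipher, pad):
    """Invert each positional codebook to recover the plaintext words."""
    words = split_words(cipher)
    if len(words) > len(pad):
        raise ValueError("pad is shorter than the message")
    out = []
    for w, book in zip(words, pad):
        inverse = {c: p for p, c in book.items()}
        out.append(inverse[w])
    return "".join(out)


# Seeded only so the demo is repeatable; a real pad needs an unpredictable
# source such as random.SystemRandom().
rng = random.Random(2024)
plain = "ACGTTGCAAC"
pad = make_pad(len(plain) // WORD_LEN, rng)
cipher = encrypt(plain, pad)
recovered = decrypt(cipher, pad)
```

Because every position uses a fresh random bijection over all words, the ciphertext word at one position reveals nothing about the plaintext word there, provided the pad is never reused.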
Fig. 8. The schematic diagram for DNA chip encryption [84]. Two predefined sets of sequences represent “0” and “1”. A randomly selected group of probes is spotted on the
DNA chip. Users can only discriminate between these 2 types of spots using the complementary strands of the probe set of “1”.
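
The scheme in Fig. 8 can be simulated with a minimal sketch, assuming idealized hybridization in which a probe binds only its exact reverse complement. The set sizes, the number of probes per spot, and the function names are hypothetical choices for illustration and are not taken from [84].

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the strand that would hybridize perfectly with seq."""
    return seq.translate(COMPLEMENT)[::-1]


def make_probe_sets(rng, size, length=30):
    """Generate two disjoint sets of random probes standing for '0' and '1'."""
    probes, seen = [], set()
    while len(probes) < 2 * size:
        p = "".join(rng.choice("ACGT") for _ in range(length))
        if p not in seen:
            seen.add(p)
            probes.append(p)
    return probes[:size], probes[size:]


def print_chip(bits, zero_set, one_set, rng, per_spot=5):
    """Spot a random group of probes from the set matching each bit."""
    return [rng.sample(one_set if bit else zero_set, per_spot) for bit in bits]


def read_chip(chip, key_strands):
    """A spot reads as 1 if any of its probes hybridizes with a key strand."""
    targets = {reverse_complement(k) for k in key_strands}
    return [int(any(p in targets for p in spot)) for spot in chip]


rng = random.Random(7)  # seeded only for a repeatable demo
zero_set, one_set = make_probe_sets(rng, size=50)
key = [reverse_complement(p) for p in one_set]  # complements of the '1' probes
bits = [1, 0, 1, 1, 0, 0, 1, 0]
chip = print_chip(bits, zero_set, one_set, rng)
decoded = read_chip(chip, key)
```

A reader without the complementary strands sees only spots filled with random-looking 30-nt probes, while a reader holding `key` recovers `bits`. Real hybridization is noisy, so this exact-match model is a simplification.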
primer pairs and the order of these DNA segments. Considering the security risk of primer leakage, Li et al. [82] designed a pre-key mechanism by utilizing the dual cutting capabilities of CRISPR/Cas12a. True primers were mixed with decoy primers to form a pre-key. Only after the precise cutting of CRISPR/Cas12a does this mixture reveal the true key to decrypt the real information. In 2021, Fan et al. [83] utilized Pfu DNA polymerase to concatenate 2 DNA molecules of distinct chirality end to end. This method ensures that natural PCR amplifies only the misleading information contained in D-DNA. Conversely, the true information encoded in L-DNA can be accessed only through mirror-image PCR.
DNA chips
A DNA chip (or DNA microarray) is a high-throughput technology used to analyze the expression levels of thousands of genes simultaneously. It involves immobilizing short DNA sequences (probes) onto a solid surface and then hybridizing them with labeled samples of complementary DNA. The intensity of the fluorescence indicates gene expression levels. DNA chips enable researchers to study gene activity in response to conditions like diseases or drug treatments, aiding in cancer research, drug development, and personalized medicine.
Usually, it is difficult to precisely know the short probes (~30 nt) spotted on a DNA chip, especially when each spot is mixed with many different probes. In 2007, Lu et al. [84] developed a symmetric encryption method as shown in Fig. 8. First, they designed 2 sets of DNA probes representing “0” and “1”. Then, they printed a randomly selected group of probes from the corresponding set on each spot of the chip. The biological difficulty comes from the tremendous number of probes for “0” or “1” and the versatility of probes in each spot. Only when the probes for “1” are known can the receiver decode the printed ciphertext. When relaxed to nonspecific hybridization,
Fig. 9. Schematic diagram of DNA origami encryption [89]. Secret information is encoded into a tactile grid pattern and meticulously marked onto the scaffold of DNA origami.
Only with all the correct staples can the scaffold be deconstructed from a disordered state back to a state where the braille information is clearly visible.
compared with other techniques, the information density of DNA chips and DNA origami is extremely low (<<1 bit/nt) because the probes (~30 nt) in each spot only represent one bit of information.
Although various DNA cryptography methods have been proposed by researchers, they are far from being practically applied. To develop a solid and reliable cryptographic system that is compatible with biological technologies, there are several main challenges, as follows:
First, it is difficult to quantify the biological difficulties inherent in biochemical reactions, which form the essential basis for DNA cryptography. The overall complexity is a function of the number of biochemical reaction steps (n), the variability of reaction conditions (V), the uncertainty in enzyme kinetics (U), and the influence of external biological factors (F). Variations in reaction conditions can considerably impact the outcome as these factors are highly interrelated. Quantifying and determining the exact functional relationship require a combination of empirical data, advanced modeling techniques, and in-depth biological knowledge. Moreover, no standard protocol exists to test the security of the system under malicious biological attacks. Therefore, a quantitative measure of these complexities is necessary to guarantee the security and efficiency of the system and to boost the advancement of DNA cryptography.
Second, a well-established protocol (e.g., key generation, distribution, and management) has not yet been developed in DNA cryptography, which hinders its application in practical scenarios. Conventional cryptography has developed a set of standard protocols and algorithms, such as substitution, permutation, confusion, diffusion, and encryption/decryption algorithms, which form the foundation for various symmetric encryption algorithms, such as AES and DES. To achieve reliable data encryption, we should also construct biologically compatible protocols in DNA cryptography.
Third, the unique IDS noise in DNA storage can considerably impact DNA encryption. DNA encoding and PCR both