CHAPTER ONE

Introduction to DNA computing


Tarun Kumar^a and Suyel Namasudra^b
^a Department of Computer Science and Engineering, National Institute of Technology Patna, Bihar, India
^b Department of Computer Science and Engineering, National Institute of Technology Agartala, Tripura, India

Contents
1. Introduction
2. Fundamentals of DNA computing
2.1 Background of human DNA
2.2 DNA computing
2.3 Operations of DNA computing
3. Applications of DNA computing
4. Conclusions and future works
References
About the authors

Abstract
Deoxyribonucleic Acid (DNA) computing is currently considered one of the
advanced fields of the Information Technology (IT) industry. DNA computing is a
technique inspired by biological science that makes use of the DNA bases, namely
Adenine (A), Guanine (G), Thymine (T), and Cytosine (C), both for operations and
as an information carrier. L. M. Adleman first introduced this concept in 1994.
DNA computing has many advantages, such as parallel computing, large storage
capability, minimal power requirement, and molecular computation. It is now used
in different fields, including cryptography, steganography, big data storage,
quantum computing, DNA chips, and medical applications. This chapter discusses in
detail the fundamentals of human DNA and DNA computing, including the structure
of DNA, the polymerase chain reaction, and the history, advantages, disadvantages,
and operations of DNA computing. In addition, applications of DNA computing in
cryptography, steganography, big data, cloud computing, DNA chips, and medical
research are presented, which can be highly beneficial for researchers,
academicians, and other professionals working on DNA computing.

Abbreviations
A adenine
ABE attribute-based encryption
AES advanced encryption standard
ATP adenosine triphosphate
C cytosine
DNA deoxyribonucleic acid
dNTPs deoxynucleoside triphosphates
ER endoplasmic reticulum
G guanine
GA genetic algorithm
H hydrogen
HPP Hamiltonian path problem
IBE identity-based encryption
IT information technology
Mg magnesium
Mn manganese
NP nondeterministic polynomial
OH hydroxyl group
PCR polymerase chain reaction
RNA ribonucleic acid
T thymine
U uracil

Advances in Computers, Volume 129. Copyright © 2023 Elsevier Inc. All rights reserved.
ISSN 0065-2458
https://doi.org/10.1016/bs.adcom.2022.08.001

1. Introduction
DNA is a long molecule that contains most of the genetic information needed
for the reproduction and development of the body. Nearly all cells in the human
body carry the same DNA, which governs the functions of the cell. The coding
scheme of the components of the DNA molecule is mainly responsible for the
complexity and organization of all human beings. In a DNA molecule, two
biopolymer strands running in opposite directions coil around each other to form
a double helix [1]. A nucleotide is an organic molecule that is the basic
building block of DNA and RNA. Each nucleotide is made up of three distinct
parts: (1) a 5-carbon sugar, (2) a nitrogenous base, and (3) a phosphate group.
There are four nitrogenous bases in DNA, denoted A, G, T, and C. In DNA, A is
always paired with T, and C is always paired with G, as per the base-pairing
rule [2]. Combining the DNA bases in arbitrary order can produce a very large
number of distinct DNA sequences. The structure of DNA is depicted in Fig. 1.
DNA computing encodes information in DNA and draws on molecular biology,
biochemistry, and hardware. It is a form of parallel computing that takes
advantage of a large number of different DNA molecules. Experiment, theory, and
implementation are the fields of DNA computing-based research and development.
Unlike the conventional binary digits, i.e., 1 and 0, data in DNA computing are
primarily expressed using the four genetic letters, i.e., A, G, T, and C. This is
possible because short DNA molecules with arbitrary sequences can be easily
synthesized. For a given algorithm, DNA molecules with specific sequences serve
as the input, and laboratory operations act as the instructions applied to the
molecules to produce the final collection of molecules.

Fig. 1 DNA structure. The figure shows a deoxyribonucleotide (dNTP) consisting of
a base, a sugar, and a phosphate; the sugar-phosphate backbone with its 5′ and 3′
ends; and the complementary bases paired via hydrogen bonds (thymine with
adenine, guanine with cytosine).

In 1994, L. M. Adleman, a scientist at the University of Southern California,
first suggested using DNA for computation [3]. Adleman demonstrated a proof of
concept by solving a seven-vertex instance of the Hamiltonian Path Problem (HPP),
a Nondeterministic Polynomial time (NP)-complete problem, using DNA as the medium
of computation. It was quickly realized, however, that DNA computing may not be
the best solution for this problem [4]. In 1999, the computer scientists Ogihara
and Ray [5] implemented DNA computing-based Boolean circuits. DNA computing is
currently one of the trending fields of biology and computer science.
Significant knowledge of both computer engineering and DNA molecules is needed
to develop an efficient DNA computing-based algorithm. DNA computing can be used
to solve scientific problems and equations that are difficult to solve using
current data sharing and storage techniques. There are two main reasons for its
popularity: it requires (1) very little storage space, and (2) minimal power.
DNA can store information at a density of about 1 bit per cubic nanometer [6].
The high computational power obtained by combining the DNA bases, together with
the complex structure of DNA, attracts a large number of researchers to use DNA
computing in a variety of fields. One of the most critical issues in DNA
computing is how to reduce the likelihood of errors during execution. Several
studies have shown that an appropriate encoding technique can greatly improve
the efficiency of DNA computing [7].
DNA computing is now widely used in many fields, including cryptography [8],
nanotechnology [9], combinatorial optimization [10], Boolean circuit
construction [11], data storage [12], quantum computing [13], and many more [14].
Around 2.5 quintillion bytes of data are now produced every day. As most of
these data are transmitted over the internet, attackers and malicious users
constantly try to gain unauthorized access to them. Thus, cryptography is one of
the most popular applications of DNA computing. In DNA-based cryptography, data
are encrypted using the nitrogen bases, i.e., A, T, G, and C, instead of 0 and 1.
Many researchers have proposed schemes to improve data security by using DNA
computing [15]. A complete DNA encoding character set must cover letters,
numbers, and special characters, and the mapping table that assigns a DNA code
to every character must be generated automatically [16]. Otherwise, the data may
again face security issues. The term steganography refers to the practice of
concealing the presence of data. In DNA-based steganography, data are hidden
using the DNA bases. The technique is straightforward: it only transforms the
plaintext into a DNA sequence. It does not encrypt any data; it hides the DNA
sequence of the plaintext among other DNA strands. Anyone who knows the primers
can easily locate the original DNA sequence and thus the plaintext.
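The idea of an automatically generated mapping table can be illustrated with a minimal sketch. It is not the scheme of [16]: the 4-base codeword length, the use of Python's printable character set, and the function names `build_mapping_table`, `encode`, and `decode` are all assumptions made for illustration, and a seeded shuffle of this kind is a toy construction rather than a secure cipher.

```python
import itertools
import random
import string

BASES = "ACGT"

def build_mapping_table(key: int) -> dict[str, str]:
    """Assign a unique 4-base codeword to every printable ASCII character.

    Hypothetical construction: the integer key seeds a shuffle of all
    4**4 = 256 codewords, so the same key always rebuilds the same table.
    """
    codewords = ["".join(p) for p in itertools.product(BASES, repeat=4)]
    random.Random(key).shuffle(codewords)
    # string.printable covers letters, digits, punctuation, and whitespace.
    return dict(zip(string.printable, codewords))

def encode(text: str, table: dict[str, str]) -> str:
    """Replace every character by its DNA codeword."""
    return "".join(table[ch] for ch in text)

def decode(dna: str, table: dict[str, str]) -> str:
    """Invert the table and read the DNA string four bases at a time."""
    inverse = {code: ch for ch, code in table.items()}
    return "".join(inverse[dna[i:i + 4]] for i in range(0, len(dna), 4))

message = "DNA computing, 1994!"
table = build_mapping_table(key=2023)
cipher = encode(message, table)
print(cipher)
print(decode(cipher, table))
```

Because the table is derived from the key instead of being fixed, two parties sharing the key can regenerate it without transmitting the table itself.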
DNA computing is also used in quantum computing. Quantum computing aims to
build computers based on quantum theory that can exploit quantum behavior, while
DNA computing can focus on storage. Therefore, a combination of both fields has
considerable potential. DNA molecules can be used directly in quantum computers,
either through nuclear magnetic resonance or by doping the DNA molecules to
implement quantum gates. Companies such as Microsoft, IBM, and Google are
investing in quantum computing. DNA computing is also used in cloud computing.
In the IT industry, cloud computing is widely used for storing users’
confidential data. As the size of data gradually increases, storing big data
becomes a problem for users. DNA computing can be very useful for storing big
data in a cloud environment, as 1 g of DNA can store around 700 TB of data.
Thus, a small mass of DNA can hold a very large volume of data. Here, the data
owners of a cloud environment first convert the data into binary form, and the
binary form is then converted into a DNA sequence using a DNA encoding rule
followed by a complementary pair rule [17]. DNACloud is a tool designed to store
data in the form of DNA sequences [18]. It mainly performs three tasks:
(1) encoding data into DNA sequences, (2) estimating the requirements for
storing the data on DNA, and (3) decoding DNA sequences back to the original
data. Many researchers are exploring this field [19].
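The conversion from binary data to a DNA sequence described above can be sketched as follows. The particular encoding rule (00 to A, 01 to C, 10 to G, 11 to T) is only one of several possible rules and is an assumption here, as are the function names; the sketch does not reproduce the scheme of [17] or the DNACloud tool.

```python
# Assumed 2-bit DNA encoding rule; other assignments are equally valid.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}
# Watson-Crick complementary pair rule.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def bytes_to_dna(data: bytes) -> str:
    """Binary form -> DNA bases (encoding rule) -> complementary bases."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bases = "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))
    return "".join(COMPLEMENT[b] for b in bases)

def dna_to_bytes(dna: str) -> bytes:
    """Undo the complementary rule, then the encoding rule."""
    bases = "".join(COMPLEMENT[b] for b in dna)
    bits = "".join(BASE_TO_BITS[b] for b in bases)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

print(bytes_to_dna(b"A"))                    # 01000001 -> CAAC -> GTTG
print(dna_to_bytes(bytes_to_dna(b"cloud")))  # b'cloud'
```

Each byte becomes four bases under this rule, so the length of the DNA sequence is four times the number of bytes stored.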
The main aim of this chapter is to present the basics of DNA computing, so that
researchers and professionals can apply the concept in related fields. The
fundamentals of DNA are discussed at the molecular level, including the
Polymerase Chain Reaction (PCR), followed by the history of DNA computing.
Furthermore, this chapter discusses the advantages and disadvantages of DNA
computing. Many applications of DNA computing are also discussed, which can be
highly beneficial to researchers working in this area. The key contributions of
this chapter are as follows:
(1) The fundamentals of human DNA and the concept of the polymerase chain
reaction are discussed.
(2) The basics of DNA computing and Adleman’s experiment to solve the HPP are
presented in detail.
(3) Finally, many applications of DNA computing in several fields are explained.
The rest of the chapter is organized as follows. In Section 2, the fundamentals
of DNA computing, such as the background of DNA, PCR, and the history,
advantages, and disadvantages of DNA computing, are discussed in detail.
Applications of DNA computing are discussed in Section 3. Finally, Section 4
concludes the chapter along with some future work.

2. Fundamentals of DNA computing


In this section, the background and structure of human DNA, the polymerase
chain reaction, and the advantages, disadvantages, and operations of DNA
computing are discussed in detail.

2.1 Background of human DNA


Biologically, deoxyribonucleic acid is a long molecule that stores the
instructions for all the workings of the human body, and it is passed from
parents to their children. It contains information that enables cells to make
proteins, and proteins are responsible for many essential functions of the human
body, such as digestion, cell building, and muscle movement [20]. A cell can be
defined as the building block of the human body. Nearly all the cells of the
human body hold the same DNA, and the human body contains trillions of cells,
which provide its main structure. Human cells have many parts, listed below,
which are responsible for different functions:
(1) Cytoplasm: It is made up of cytosol, a jelly-like fluid, together with the
other structures that surround the nucleus.
(2) Cytoskeleton: It is a network of long fibers that makes up the cell’s
structural framework. The cytoskeleton is responsible for the shape of the cell,
cell division, the movement of cells, and more.
(3) Endoplasmic Reticulum (ER): The ER is responsible for transporting
molecules to their destinations, either inside or outside the cell.
(4) Golgi Apparatus: The Golgi apparatus packages the molecules processed by
the endoplasmic reticulum so that they can be transported out of the cell.
(5) Lysosomes and Peroxisomes: These organelles are the recycling center of the
cell. They digest foreign bacteria that invade the cell, rid the cell of toxic
substances, and recycle worn-out cell components.
(6) Mitochondria: Mitochondria are complex organelles that convert energy into
a form that the cell can use. They have their own genetic material, which is
separate from the DNA in the nucleus.
(7) Plasma Membrane: It is the outer lining of the cell. The plasma membrane
separates the cell from its environment and allows materials to enter and leave
the cell.
(8) Ribosomes: Ribosomes are the organelles that process the cell’s genetic
instructions to create proteins. They can float freely in the cytoplasm.
Genes are the basic physical and functional units of heredity, and they are
made up of DNA. In humans, genes vary in size from a few hundred DNA bases to
more than 2 million DNA bases. Many genes act as instructions for making
molecules called proteins, but some genes do not code for proteins. Every person
has two copies of each gene, one inherited from the mother and the other from
the father. Most genes are the same in all people. However, a small number of
genes, less than 1% of the total, differ slightly between people. Alleles are
forms of the same gene with minor differences in their sequence of DNA bases,
and these minor differences are responsible for the unique physical features of
every person.

2.1.1 Structure of human DNA


The DNA molecule plays an important role in DNA computing. Biochemistry deals
with small and large molecules, monomers, and polymers. DNA is a polymer made up
of monomers known as deoxyribonucleotides. DNA is a vital molecule in a cell,
and its structure supports its two primary functions: (1) coding for protein
synthesis, and (2) self-replication, which ensures that the same copy is passed
to offspring cells. As mentioned earlier, a deoxyribonucleotide is made up of
three components: (1) a sugar, (2) a nitrogenous base, and (3) a phosphate
group. The sugar of DNA is deoxyribose, which explains the prefix deoxyribo. To
simplify the terminology, the term nucleotide is used instead of
deoxyribonucleotide. Deoxyribose contains five carbon atoms, which are numbered
for ease of reference. As the base also contains carbons, the sugar’s carbons
are numbered from 1′ to 5′ to avoid confusion. The 1′ carbon is attached to the
base, while the 5′ carbon is attached to the phosphate group. A Hydroxyl Group
(OH) is bound to the 3′ carbon in the sugar structure [11].
The only difference among nucleotides is their bases, which are divided into
two categories: (1) purines, and (2) pyrimidines. In DNA nucleotides, there are
two purines, adenine and guanine, and two pyrimidines, cytosine and thymine. As
nucleotides are distinguished only by their bases, they are referred to as A, G,
C, and T according to the type of base. Fig. 2 depicts the composition of a
nucleotide, where B is one of the possible DNA bases, P is the phosphate group,
and the remainder is the sugar (carbons 1′ to 5′). Fig. 3 shows the standard and
simplified chemical structure of a nucleotide.
Fig. 2 A simple representation of a nucleotide.

Fig. 3 A nucleotide’s chemical structure with thymine.

Ribonucleic Acid (RNA) is another important polymer in human cells. Its
structure is very similar to that of DNA. It is made up of monomers called
ribonucleotides. A ribonucleotide with an adenine base and a triple phosphate
group is referred to as an Adenosine Triphosphate (ATP) molecule, which is the
primary source of energy in human cells. A ribonucleotide differs from a
nucleotide in two ways:
(i) It contains a ribose sugar, which is distinguished from deoxyribose by the
presence of a hydroxyl group, rather than a Hydrogen (H), on the 2′ carbon.
(ii) In RNA, the Uracil (U) base replaces the thymine base. Thus, the four
bases in RNA are A, U, C, and G.
Nucleotides can be bound to each other in two ways:
(i) As shown in Fig. 4, the 3′-OH of one nucleotide is bound to the
5′-phosphate group of another nucleotide. This forms a phosphodiester bond,
which is a strong covalent bond. It gives the molecule a direction, referred to
as the 5′-3′ or 3′-5′ direction. These directions are critical for understanding
the functionality and processing of DNA.
(ii) A weak hydrogen bond is formed when the base of one nucleotide interacts
with the base of another nucleotide. The base-pairing constraint is that A is
always paired with T, and C is always paired with G. No other base pairing is
allowed.

Fig. 4 Phosphodiester bond.

This pairing principle is known as the Watson-Crick complementary pairing rule.
In 1953, James D. Watson and Francis H. C. Crick introduced the famous double
helix structure of DNA [21], which is crucial for understanding the function and
structure of DNA. The thin wiggly lines between the bases in Fig. 5 reflect the
fact that the hydrogen bond is much weaker than the phosphodiester bond. Most
importantly, an A-T pair is held together by two hydrogen bonds, whereas a C-G
pair is held together by three. Therefore, the pairing between C and G is
stronger than the pairing between A and T, and more energy is required to
separate a C-G pair. To reflect this difference, Fig. 5 uses two wiggly lines
for the pairing between A and T and three wiggly lines for the pairing between C
and G.
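The pairing rule and the bond counts above can be expressed in a few lines. This is a minimal sketch; the names `complement` and `hydrogen_bonds` are illustrative, and the bond count is used only as a rough indicator of how strongly two strands are held together.

```python
# Watson-Crick complementary pairing rule.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
# Hydrogen bonds per base pair: two for A-T, three for C-G.
HYDROGEN_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}

def complement(strand: str) -> str:
    """Base-by-base Watson-Crick complement of a strand."""
    return "".join(COMPLEMENT[base] for base in strand)

def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds when the strand is fully paired with its complement."""
    return sum(HYDROGEN_BONDS[base] for base in strand)

print(complement("ACG"))       # TGC
print(hydrogen_bonds("ACG"))   # 2 + 3 + 3 = 8
print(hydrogen_bonds("ATAT"))  # 2 + 2 + 2 + 2 = 8
```

The three-base strand ACG forms as many hydrogen bonds as the four-base strand ATAT, which illustrates why sequences rich in C and G need more energy to separate.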
Single-stranded DNA is formed by phosphodiester bonds. As shown in Fig. 6, the
nucleotide with the free 5′ phosphate is drawn at the left end, and the
nucleotide with the free 3′ hydroxyl is drawn at the right end. As nucleotides
are identified by their bases and named A, G, C, and T, a single strand can be
represented as a sequence of letters. For example, the single strand in Fig. 6
is written as 5′ ACG.
In practice, the hydrogen bonds of a single base pair are too weak to hold two
nucleotides together, so longer stretches are required to bind two strands. A
stable bond is created by the cumulative effect of the hydrogen bonds between
complementary bases along the DNA.

Fig. 5 Hydrogen bond.

Fig. 6 Single-stranded DNA.

2.1.2 Structure of double helix


A double-stranded DNA molecule can be formed from single-stranded DNA molecules
by using the Watson-Crick complementary rule. Fig. 7 depicts how two
single-stranded DNA molecules are joined by hydrogen bonds. In a double-stranded
molecule, the two single strands run in opposite directions, so that the 5′ end
of one strand lies opposite the 3′ end of the other. It is standard practice to
draw the upper strand of a double-stranded molecule from left to right in the 5′
to 3′ direction, and the lower strand from left to right in the 3′ to 5′
direction [11]. In Fig. 7, the upper strand is 5′ ACG and the lower strand is
3′ TGC.

Fig. 7 Forming double strands.
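The drawing convention for the two antiparallel strands can be sketched as follows. The function names are illustrative; `lower_strand` gives the lower strand as drawn (left to right, 3′ to 5′), while `reverse_complement` rewrites the same strand in the 5′ to 3′ direction in which sequences are usually listed.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def lower_strand(upper_5_to_3: str) -> str:
    """Lower strand read left to right in the 3' to 5' direction."""
    return "".join(COMPLEMENT[base] for base in upper_5_to_3)

def reverse_complement(strand_5_to_3: str) -> str:
    """The same lower strand, rewritten in the 5' to 3' direction."""
    return lower_strand(strand_5_to_3)[::-1]

print(lower_strand("ACG"))        # TGC  (3' TGC, as in Fig. 7)
print(reverse_complement("ACG"))  # CGT  (5' CGT, the same molecule)
```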
Representing a double-stranded DNA molecule as two linear strands bound to each
other by the Watson-Crick complementary pair rule is a significant
simplification, because the two strands of a DNA molecule are wrapped around
each other to form a double helix, as depicted in Fig. 8.
In vivo, the situation is more complicated, as a massive DNA molecule must fit
into a small cell. This packing is very complex, and in more complex cells
(eukaryotes) it is performed hierarchically in many stages. When considering
processes in human cells, the shape of a DNA molecule is extremely important. It
must also be noted that human DNA molecules are linear, whereas bacterial DNA is
often circular. A circular molecule can be constructed simply by forming a
phosphodiester bond between the last and the first nucleotide. The ability to
process DNA is very important in genetic engineering, as well as in DNA
computing [21].

Fig. 8 Structure of double helix.

2.1.3 Polymerase chain reaction (PCR)


PCR is a widely used technique for making millions to billions of copies of a
particular DNA sample. It amplifies a small amount of DNA into a large amount.
The copies can be partial or complete. Kary Mullis, an American biochemist and
recipient of the 1993 Nobel Prize, invented it in 1983 [22]. The method is based
on the natural process that a cell uses to replicate a new strand of DNA. Only a
few biological components are required for PCR. Most PCR methods rely on thermal
cycling, i.e., repeatedly heating and cooling the reactants to enable
temperature-dependent reactions, including DNA melting and enzyme-driven DNA
replication. There are two key reagents in PCR: (1) primers, and (2) DNA
polymerase. Primers are short single strands of DNA, typically about 20
nucleotides long, whose sequence is complementary to the target DNA region. In
PCR, the two strands of the DNA double helix are first separated physically at a
high temperature, which is known as nucleic acid denaturation. Then, the
temperature is lowered, and the primers bind to the complementary DNA sequences.
The two separated DNA strands thus become templates for DNA polymerase, which
enzymatically assembles a new DNA strand from free nucleotides. As PCR
progresses, the newly generated DNA is itself used as a template for
replication. Therefore, a chain reaction is set in motion in which the original
DNA template is exponentially amplified.
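The way two primers delimit the amplified region can be illustrated with a small in-silico sketch. It assumes exact matching, a template given as the sense strand in the 5′ to 3′ direction, and toy 6-base primers instead of the roughly 20 nucleotides mentioned above. The function names and the sequences are hypothetical and serve only as an illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq: str) -> str:
    """Complementary strand, written in the 5' to 3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def amplicon(template: str, forward_primer: str, reverse_primer: str):
    """Return the region of the sense strand bounded by the two primers.

    The forward primer matches the sense strand directly (it anneals to the
    anti-sense strand). The reverse primer anneals to the sense strand, so
    its binding site is its reverse complement. Returns None if either
    primer has no exact binding site.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]

template = "TTGACCATGGCATTACGGATCCAA"  # made-up sense strand, 5' to 3'
print(amplicon(template, "ACCATG", "ATCCGT"))  # ACCATGGCATTACGGAT
```

Only the stretch between and including the two primer sites is returned, which mirrors the fact that the copies produced by PCR can be partial copies of the template.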
Most PCR applications use a heat-stable DNA polymerase, such as Taq
polymerase [23]. Taq polymerase is suitable for PCR because of its heat
stability. Otherwise, DNA polymerase would have to be added in every cycle,
which is a costly and tedious process. A simple PCR setup needs the following
components and reagents:
(1) A DNA template containing the target region to be amplified.
(2) A DNA polymerase, an enzyme that polymerizes new strands of DNA. Taq
polymerase is commonly used.
(3) Two DNA primers, which are complementary to the 3′ ends of the sense and
anti-sense strands of the DNA target.
(4) Deoxynucleoside Triphosphates (dNTPs), i.e., nucleotides carrying
triphosphate groups, which DNA polymerase uses as building blocks to create a
new DNA strand.
(5) A buffer solution, which provides an environment for optimal operation and
stability of the DNA polymerase.
(6) Bivalent cations, classically Magnesium (Mg) or Manganese (Mn) ions.
As depicted in Fig. 9, PCR consists of six main steps, namely initialization,
denaturation, annealing, extension, final elongation, and final hold. The
denaturation, annealing, and extension steps are repeated 20 to 40 times as the
temperature is changed; these repetitions are known as thermal cycles. Each
cycle has 2 to 3 discrete temperature steps [24].

Fig. 9 Process of PCR.

(1) Initialization: In the initialization step, the DNA polymerase is activated
by heat. The reaction chamber is heated to a temperature of 94 °C to 96 °C, or
to 98 °C if an extremely thermostable polymerase is used. This step lasts for 1
to 10 min.
(2) Denaturation: This is the first regular cycling event, in which the
reaction chamber is heated to 94 °C to 98 °C for 20 to 30 s. As the hydrogen
bonds between the complementary bases are broken, the double-stranded DNA
template melts, and two single-stranded DNA molecules are generated.
(3) Annealing: In this step, the reaction temperature is lowered to 50 °C to
65 °C for 20 to 40 s, which allows the primers to anneal to each of the
single-stranded DNA templates. In most cases, two distinct primers are used in
the reaction, one for each of the two single-stranded complements containing the
target region. Since the annealing temperature has a significant impact on
efficiency and specificity, determining the correct temperature for this step is
critical. The temperature must be low enough for the primer to hybridize to the
strand, but high enough that the primer binds only to the perfectly
complementary portion of the single strand. If the temperature is too low, the
primer may bind imperfectly. If the temperature is too high, the primer may not
bind at all. When the primer closely matches the template sequence, stable
hydrogen bonds are formed between the complementary bases. In this step, the
polymerase binds to the primer-template hybrid and DNA formation starts.
(4) Extension: In this step, the temperature depends on the DNA polymerase
used. The optimal temperature for Taq polymerase is approximately 75 °C to
80 °C. However, 72 °C is widely used for this enzyme. The DNA polymerase
produces a new strand complementary to the template DNA strand by adding free
dNTPs from the reaction mixture that are complementary to the template, in the
5′ to 3′ direction. The exact time needed for elongation is determined by the
size of the target DNA region to be amplified and by the DNA polymerase used. At
the optimal temperature, most DNA polymerases polymerize about a thousand DNA
bases per minute. Under ideal conditions, the number of target DNA sequences
doubles in each elongation phase. The original DNA strands, along with all the
newly formed DNA strands, become the template strands for the next elongation
cycle, which results in an exponential amplification of the target DNA sequence.
The denaturation, annealing, and extension steps together form a single cycle.
Numerous cycles are needed to amplify the target DNA sequence millions of times.
The total number of DNA copies generated after a given number of cycles is
calculated by the formula 2^n, where n denotes the number of cycles. Therefore,
a reaction of 30 cycles produces 2^30, or 1,073,741,824, copies of the original
target DNA sequence.
(5) Final Elongation: The final elongation step is optional. When used, it is
executed for 5 to 15 min after the last PCR cycle at a temperature of 70 °C to
74 °C. This temperature range is optimal for most of the polymerases used in PCR
and ensures that any leftover single-stranded DNA is completely elongated.
(6) Final Hold: In this step, the reaction chamber is cooled to 4 °C to 15 °C
for an unspecified period of time.
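The 2^n amplification formula from the extension step can be checked with a short calculation. The `efficiency` parameter is an assumption added here to show how less-than-ideal doubling would be modeled; with the default value of 1.0 the function reduces to the 2^n formula given in the text. The helper `cycles_needed` is likewise illustrative.

```python
import math

def pcr_copies(cycles: int, initial_copies: int = 1, efficiency: float = 1.0) -> float:
    """Copies after a number of thermal cycles.

    efficiency = 1.0 means perfect doubling in every cycle (the 2**n case);
    lower values model incomplete amplification (an assumption, not from the text).
    """
    return initial_copies * (1 + efficiency) ** cycles

def cycles_needed(target_copies: float, initial_copies: int = 1) -> int:
    """Smallest number of ideal cycles that reaches at least target_copies."""
    return math.ceil(math.log2(target_copies / initial_copies))

print(int(pcr_copies(30)))       # 1073741824
print(cycles_needed(1_000_000))  # 20
```

Under ideal doubling, 20 cycles already exceed one million copies, which is consistent with the 20 to 40 cycles used in practice.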

2.1.4 Applications of polymerase chain reaction


PCR has many applications in interdisciplinary research areas because it makes
it possible to retrieve DNA sequences. Some of them are discussed in this
subsection.
(1) Human Health and Genome Project: PCR is an essential technique for
improving human health and quality of life. In biological science, it is mainly
used to identify infectious organisms and to diagnose genetic disorders. It can
also be used to analyze viral DNA to diagnose deadly diseases like AIDS, where
it is more accurate than the standard ELISA test. Furthermore, PCR can be used
to diagnose diseases such as Lyme disease by detecting bacterial DNA. In
addition, PCR is used to detect sexually transmitted diseases. The research on
all human genes is generally referred to as the human genome project. The
identification of unique genes, mutations in these genes, and rates of mutation
is greatly aided by PCR.
(2) Genetic Fingerprinting: In forensics, genetic fingerprinting is used to
match a person’s DNA with a blood sample, a parent’s DNA, etc. In most cases,
only a very small amount of DNA is available. So, the main role of PCR is to
amplify the small amount of DNA from blood, hair, semen, and other tissues. Gel
electrophoresis is then used to examine the amplified DNA fragments.
(3) Detection of Hereditary Disease: As hereditary diseases are directly
connected to genes, they are difficult to identify. However, PCR is one of the
best tools for analyzing this complex process. PCR can amplify the amount of
DNA, and an appropriate marker can then be used to identify the disease-causing
gene.
(4) Cloning: Gene cloning has many applications in industry and research
laboratories, and PCR can be utilized for it. If a gene needs to be cloned, PCR
is used to amplify the gene. The gene is then inserted into a vector, which
transports it. Thus, PCR is essential to the study of gene cloning [25].
(5) Ancient DNA Analysis: PCR is also used in ancient DNA analysis. Here, DNA
extracted from Egyptian mummies is amplified by using PCR to identify a person
and for further studies. The ancient gene can also be compared with the
present-day gene for evolutionary research.
(6) Gel Electrophoresis: This technique separates DNA fragments by size by
pulling them through a gel matrix with an electric current. A DNA ladder, i.e.,
a size marker for DNA sequences, is usually included to determine the size of
the fragments produced by PCR. By analyzing the size of DNA sequences, many
medicines have been developed for critical diseases.

2.2 DNA computing


DNA computing is a branch of natural computing that uses the molecular
properties of DNA, instead of the binary bits 0 and 1, to perform logical and
arithmetic operations. This enables massively parallel computation, making it
possible to solve complex mathematical problems in a fraction of the time.
Therefore, with a large amount of self-replicating DNA, computation in DNA
computing can be much more effective than with a conventional computer. A
computation can be considered as the execution of an algorithm, i.e., a
well-defined set of instructions that accepts inputs, processes them, and
generates outputs as the result. Developing an algorithm for DNA computing
requires considerable experience in molecular biology and computer science.
Instead of the binary digits (0 and 1) of an electronic computer, a DNA
computing-based computer stores data by using the four genetic letters, namely
A, T, G, and C. The ability to synthesize short DNA sequences artificially
allows these sequences to be used as inputs to algorithms. DNA computing-based
computation is reported to use about one billion times less energy than a
traditional machine, and to store data in about one trillionth of the volume
required by traditional compact discs. Furthermore, DNA computing is extremely
parallel, as chemical reactions involving billions or trillions of DNA molecules
can take place at the same time.
In 1994, L. M. Adleman first used DNA computing to solve the HPP,
a renowned NP-complete problem [3]. No deterministic polynomial-time
algorithm is known for any NP-complete problem. A directed graph
G(V, E), with V and E as the sets of vertices and edges, respectively, and
with designated vertices Vi, Vo ∈ V, is said to have a Hamiltonian path if
and only if there exists a path from Vi to Vo that traverses every vertex
exactly once [4]. Adleman solved the HPP for a graph G having 7 vertices
with Vi = V0 and Vo = V6, using 50 pmol of each oligonucleotide for the DNA
molecule encoding. The non-deterministic algorithm using DNA computing
suggested by Adleman to solve the HPP is as follows.
Step 1: The first step encodes the vertices and edges of the graph into
DNA bases and generates random paths through the graph. For every
vertex u ∈ V, a random strand consisting of 20 nucleotides is synthesized:

\[ S_u = \underbrace{\mathrm{AATGCCTAGT}}_{a_u}\,\underbrace{\mathrm{TCCTAATGGC}}_{b_u} \]

\[ S_v = \underbrace{\mathrm{TAGGACTAGG}}_{a_v}\,\underbrace{\mathrm{CTTCAAGTAT}}_{b_v} \]

\[ S_w = \underbrace{\mathrm{CCGTATGATC}}_{a_w}\,\underbrace{\mathrm{CGTACGGCTT}}_{b_w} \]

Here, $S_u$, $S_v$ and $S_w$ represent the strands for vertices u, v and w,
respectively. For every edge u → v ∈ E, an opportune strand is synthesized
by concatenating the 3′ end of $S_u$ and the 5′ end of $S_v$, i.e., $b_u$
and $a_v$, respectively. In a similar way, the edge v → w can be synthesized
by concatenating the 3′ end of $S_v$ and the 5′ end of $S_w$, i.e., $b_v$
and $a_w$, respectively, as given below:

\[ S_{u \to v} = \underbrace{\mathrm{TCCTAATGGC}}_{b_u}\,\underbrace{\mathrm{TAGGACTAGG}}_{a_v} \]

\[ S_{v \to w} = \underbrace{\mathrm{CTTCAAGTAT}}_{b_v}\,\underbrace{\mathrm{CCGTATGATC}}_{a_w} \]

The complement of the strand of each vertex v ∈ V is generated by
exchanging every base with its Watson-Crick partner, and it is used to join
the edge strands. It is calculated as:

\[ \bar{S}_v = \mathrm{ATCCTGATCCGAAGTTCATA} \]

where $\bar{S}_v$ is the complementary strand of $S_v$. The ligation
reaction takes place on the encoded vertices and edges to generate DNA
strands that give random paths through the graph.
Step 2: The DNA sequences generated in step 1 are amplified by PCR
with S0 and S6 as primers, i.e., short sequences with complementary bases.
This step retains the routes that run from the specified starting node to
the specified final node of the graph.
Step 3: In this step, the length of the results of step 2 is measured by
using an agarose gel to select the routes with exactly 7 nodes, which
implies 20 × 7 = 140 base pairs. This step selects the routes that visit
the required number of nodes.
Step 4: The fourth step is used to verify that each node of the graph is
traversed. For each vertex v, the strands that contain an edge strand such
as $S_{u \to v}$ are selected, as they belong to routes entering the
vertex v.
Step 5: Finally, the Hamiltonian path is obtained in this step. If any
strand remains after steps 2 to 4, it encodes a Hamiltonian path.
Otherwise, the graph has no such path.
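
The strand construction of Step 1 can be sketched in a few lines of Python.
This is only an illustration, assuming the three vertex strands listed in
Step 1; the names VERTEX, edge_strand, and complement are hypothetical
helpers and are not part of Adleman's laboratory protocol. Each edge strand
is built by the stated rule, i.e., the second half (b) of the source strand
followed by the first half (a) of the destination strand.

```python
# Watson-Crick pairing used to build the complementary strand.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Vertex strands from Step 1; each 20-mer is split into two 10-mer halves (a, b).
VERTEX = {
    "u": "AATGCCTAGTTCCTAATGGC",
    "v": "TAGGACTAGGCTTCAAGTAT",
    "w": "CCGTATGATCCGTACGGCTT",
}


def edge_strand(src, dst):
    """Second half (b) of the source strand followed by the first half (a) of the destination."""
    return VERTEX[src][10:] + VERTEX[dst][:10]


def complement(strand):
    """Base-by-base complement, written in the same left-to-right order as the text."""
    return "".join(COMPLEMENT[base] for base in strand)


print(edge_strand("u", "v"))    # TCCTAATGGCTAGGACTAGG
print(edge_strand("v", "w"))    # CTTCAAGTATCCGTATGATC
print(complement(VERTEX["v"]))  # ATCCTGATCCGAAGTTCATA
```

In this sketch, the complement of a vertex strand overlaps the end of an
incoming edge strand and the start of an outgoing edge strand, which is what
allows ligation to join consecutive edges into a path.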
The aforementioned steps use DNA computing to get the Hamiltonian
path. Adleman also observed that the number of laboratory steps increases
linearly with the size of the graph. Thus, Adleman demonstrated the high
parallelism of DNA computing by solving an NP-complete problem, i.e., HPP,
in a polynomial number of steps, whereas the known algorithms on a
traditional system need exponential time [3].
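
The generate-and-filter idea behind these steps can also be simulated in
software. The following Python sketch is an assumption-laden illustration
and not Adleman's procedure itself: random walks stand in for ligation, and
set filters stand in for PCR, gel electrophoresis, and per-vertex
selection. The graph, the function names, the stopping probability of 0.2,
and the sample count are all hypothetical choices.

```python
import random


def random_walk(vertices, edges, max_len, rng):
    """Step 1 analogue: grow one path by 'ligating' randomly chosen outgoing edges."""
    path = [rng.choice(vertices)]
    while len(path) < max_len:
        successors = [b for (a, b) in edges if a == path[-1]]
        # Stop at a dead end, or at random so that paths of many lengths occur.
        if not successors or rng.random() < 0.2:
            break
        path.append(rng.choice(successors))
    return tuple(path)


def hamiltonian_paths(vertices, edges, v_in, v_out, samples=20000, seed=0):
    rng = random.Random(seed)
    n = len(vertices)
    # Step 1: a large pool of random paths (the "test tube").
    pool = {random_walk(vertices, edges, 2 * n, rng) for _ in range(samples)}
    # Step 2: keep paths that start at v_in and end at v_out.
    pool = {p for p in pool if p[0] == v_in and p[-1] == v_out}
    # Step 3: keep paths with exactly n vertices (selection by length).
    pool = {p for p in pool if len(p) == n}
    # Step 4: keep paths that contain every vertex (one selection per vertex).
    for v in vertices:
        pool = {p for p in pool if v in p}
    # Step 5: anything left encodes a Hamiltonian path.
    return sorted(pool)


# Hypothetical 5-vertex directed graph (not Adleman's 7-vertex instance).
VERTICES = [0, 1, 2, 3, 4]
EDGES = [(0, 1), (0, 2), (1, 2), (1, 3), (1, 4), (2, 1), (2, 3), (3, 4)]
print(hamiltonian_paths(VERTICES, EDGES, v_in=0, v_out=4))
```

For this small graph, the only Hamiltonian paths from 0 to 4 are
0 → 1 → 2 → 3 → 4 and 0 → 2 → 1 → 3 → 4. Because the pool is sampled at
random, a path is found only with high probability, which mirrors the
probabilistic nature of the laboratory procedure.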
Although the HPP is an NP-complete problem, Adleman’s experiment does not
ensure that DNA computing can be used to solve any other NP-complete
problem. Adleman’s DNA computing-based design is also limited in scale, as
the number of DNA computing-based components increases when the number of
nodes is increased. So, biochemists and computer scientists have started to
explore this new field. Lipton has extended Adleman’s model in such a way
that biological computers can change the way of computation [26]. At first,
Lipton generalized Adleman’s solution and applied it to the satisfiability
problem, i.e., a basic NP-complete problem. The main contribution of
Lipton’s solution is the way of encoding a binary string into DNA strands.
However, there are some major concerns in DNA computing:
(1) Its computational model is mostly based on molecular techniques to
solve a particular issue. The variety of problems leads to discrepancies
in computing systems, and there is also no coding model or standard in
DNA computing.
(2) DNA computing is prone to errors. The DNA sequences are mainly
randomly generated, and the chance of an error occurring increases at
every experimental stage.
(3) There are no mathematical models to prove the concept of DNA
computing, which is very critical in many sensitive computational problems.

2.2.1 History of DNA computing


The concept of DNA gained popularity when Watson and Crick discovered
the double-helix structure of DNA in 1953. Since then, scientists and
researchers have gained a better understanding of DNA replication and the
genetic regulation of cellular activities. During that time, researchers
were much focused on the computing capability of living systems. The primary
objective was to implement the concept of living systems in computing
devices, for example, wireless automatic systems, genetic algorithms, neural
networks, etc.
Around 1960, Richard Feynman first put forward the idea of a molecular
computer [27]. The concept of DNA was first used for computing in
1994 by L. M. Adleman, who demonstrated that DNA can be used
for executing computations in a massively parallel manner. By using the four
DNA bases, Adleman encoded a classic “hard” problem, namely the HPP, which
is closely related to the Traveling Salesman Problem, into DNA strands and
used the properties of molecular biology to find the answer. Subsequently,
DNA computing was also applied to various other problems like the 3-SAT
problem, the 0/1 Knapsack Problem, and many more. After that, many
Turing machine models were proposed on the basis of Adleman’s original
experiments.
In 1995, Baum [28] introduced the idea of DNA computing-based
storage. He showed that a tiny volume of DNA strands can be used to
store a large amount of data due to the ultra-high density of DNA. In
1997, Ouyang et al. [29] proposed an experimental solution to
the “maximal clique” problem based on molecular biology. In 1997,
researchers from the University of Rochester used DNA computing to create
logic gates.
Liu et al. [30] developed a DNA computing-based model in 2000,
where a multi-based encoding strategy is used in a surface-based approach
to DNA computing [31]. In 2002, researchers from the Weizmann Institute
of Science demonstrated a programmable computing system based on
enzymes and DNA. This machine was later shown to be capable of diagnosing
cancer in a cell. Reif et al. [32] first introduced the idea of using
DNA computing in robotics. They utilized molecular biology to generate
energy for the walker. Following this first demonstration, an extensive
variety of DNA computing-based walkers have been demonstrated by many
researchers.
In 2013, the first biological transistor was created based on DNA com-
puting [33]. Nowadays, DNA computing is used in many fields, including
cryptography, steganography, authentication, and many more [34–35]. As
four DNA bases are used in DNA computing, it supports much randomness
in key generation and data encryption phases in cryptography.

2.2.2 Advantages of DNA computing


DNA computing has many advantages over traditional computing. As DNA
strands can hold a huge amount of data, it supports solving decomposable
problems in less time. In this subsection, some advantages of DNA computing
are discussed in detail [36].
(1) Performance: The performance of DNA computing can be greatly
improved by running millions of operations simultaneously.
Adleman’s experiment was executed at 10^14 operations/s by using
DNA computing, which is comparable to 100 teraflops, i.e., 100 trillion
floating-point operations/s. At that time, the world’s fastest
supercomputer had a capability of 35.8 teraflops. DNA computing is
therefore one of the best solutions for problems that involve a large
number of calculations, such as clustering optimization, scheduling
problems, etc.
(2) Parallel Processing: The ability to execute a great number of tasks in a
parallel manner is a major benefit of DNA computing over classical
computing. This enormous degree of parallelism benefits a wide range
of applications like cryptography, steganography, etc. The massively
parallel computing capability of DNA computers can enhance the speed of
the operation. A group of 10^18 DNA strands can operate 10,000 times
faster than an advanced supercomputer [36].
(3) Storage Capability: Today, knowledge is not only in the form of
words or documents, but also in the form of photographs, video, and
other formats, and all of these require a large amount of storage space.
DNA computing can be one of the best solutions for addressing today’s
data storage issue. As compared to digital data storage, DNA can store a
lot of information in a limited amount of space. Standard storage media,
including videotapes, require 10^12 cubic nanometers of space for storing
one bit of information. However, DNA computing requires 1 cubic
nanometer for one bit. 1 cm^3 of DNA can hold more data than
a trillion compact discs. This is due to the data density of the DNA
molecule, which is 18 Mbits per inch. A few grams of DNA can store all the
data of the world.
(4) Minimal Power Consumption: The computers based on DNA
computing are small and light. As DNA computers are mainly based
on basic biological operations, and the formation of chemical bonds does
not require any electricity, they need little power. Here, power is only
required to prevent DNA denaturation.

2.2.3 Disadvantages of DNA computing


Along with its advantages, DNA computing also has some disadvantages [36].
As a DNA strand is long, errors can occur in DNA computing when
nucleotides are paired with each other. Here, all the
processes require human intervention. There are some other major
disadvantages of DNA computing as mentioned below:
(1) Requires High Memory: In DNA computing, a large amount of
memory is required to generate a set of solutions even for relatively
simple problems. Even though DNA can store a trillion times more
data, solving a significantly large problem requires a large number of
DNA strands because of the technique by which the data is handled.
(2) Accuracy: The synthesis of DNA is prone to errors like mismatching
pairs, and it is mainly dependent on the accuracy or the correctness of
the involved enzymes. The likelihood of error increases exponentially
with the number of operations, so the number of operations must be
limited before an erroneous outcome becomes more likely than the correct
one. Due to this, DNA computing cannot be applied in sensitive
applications.
(3) Resource-Intensive: Each phase of the parallel operations requires
hours or days, as many steps depend on mechanical or human intervention.
As DNA strands are made for a particular problem, new DNA strands must be
created for each and every new problem, which is critical and
time-consuming. Because of the vast parallelism of DNA computing,
algorithms can be performed in polynomial time. However, they are limited
to small instances of a problem, as they need the entire solution space to
be generated.
(4) Non-Replaceable: Traditional computers cannot be replaced by
DNA computers at this time due to their high cost. In addition,
DNA computers are not flexible and are not easily programmable.
Here, users cannot just sit with a familiar keyboard and start typing for
programming [37]. They need to understand all the basic details of
molecular biology, DNA computing, hardware, and computer programming
languages.

2.3 Operations of DNA computing


As discussed earlier, DNA computing utilizes four DNA bases, i.e., A, C, G,
and T for computation. Here, the Watson-Crick complementary pair-rule is
also used in which A is paired with T and C is paired with G. However,
anyone can design any other complementary pair-rule [15]. In this subsec-
tion, many operations of DNA computing are discussed in detail.
(1) DNA Synthesis: It is one of the most fundamental biological
operations of DNA computing. The solid-state DNA synthesis method
is based on the binding of the first nucleotide to a solid support followed
by the step-by-step addition of subsequent nucleotides in a reactant
solution in the 3′ to 5′ direction. One drawback of this operation is that
a laboratory method can produce only 20–25 strands [38].
(2) DNA Replication: It is accomplished by the polymerase chain
reaction, which includes the enzyme DNA polymerase. The PCR replication
process involves a template, which is a single-stranded guiding DNA,
and a primer, which is an oligonucleotide annealed to the template. The
primer is needed to start the synthesis reaction of the polymerase enzyme.
By sequentially adding nucleotides at the 3′ end of the primer, the primer
is extended until the required strand is acquired, which begins
with the primer and complements the template. Here, the primer is
extended only in the 5′ to 3′ direction.
(3) Sort by Length: A technique known as gel electrophoresis is used to
sort DNA sequences by their length [38]. Here, negatively charged
DNA molecules are positioned in “wells”, i.e., on one side of the
polyacrylamide gel. An electric current is then passed through the gel,
with the negative pole on the side of the wells and the positive pole on
the opposite side. The DNA molecules are attracted to the positive pole,
and the larger molecules move through the gel more slowly. After an
interval, molecules are separated into discrete bands based on their size.
(4) Separating DNA Sequence: This operation supports researchers to
extract the desired DNA sequence or DNA strand from a solution that
contains it. This operation is executed by generating
a DNA sequence that is the complement of the desired DNA sequence.
Then, the newly generated DNA strand is attached to a magnetic
substance, which is used for extracting the strands
after the annealing process.
(5) ASCII Rule: Although this operation is not directly associated with
DNA computing, it is considered one of the important operations.
The main purpose of this operation is to confuse the attackers and
malicious users who attempt unauthorized access to any confidential data.
In this operation, each 8-bit binary value of the plaintext is converted
into a corresponding ASCII value ranging from 0 to 255. Then, this decimal
value is again converted into its corresponding binary value for further
processes of DNA computing. As the plaintext of data is converted
into another binary form before some other operations of DNA computing,
it supports massive randomness. Table 1 shows an example of
the ASCII rule.

Table 1 ASCII rule.

Binary value    ASCII    Binary value
00000000        12       00001100
00000001        220      11011100
...             ...      ...
11111111        10       00001010

Table 2 2-Bit binary encoding rule.

00     01     10     11
A      T      G      C
A      T      C      G
...    ...    ...    ...
G      T      C      A

(6) DNA Encoding: In DNA encoding, the binary value of any data or
plaintext is converted into DNA bases by using DNA computing. It is
one of the popular operations of DNA computing. For example, a
DNA encoding rule can be defined as 00-A, 01-T, 10-G, and 11-C.
This implies if there are binary values as 00, 01, 10, and 11 in the plain-
text, they can be converted into A, T, G, and C, respectively. It does
not matter in which manner or sequence the binary values are repre-
sented in the plaintext. Anyone can assign any rule to convert the binary
values into DNA bases. Thus, a DNA encoding rule can be applied to
the binary values of any plaintext. Table 2 shows an example of the
DNA encoding rule.
(7) DNA Complementary Rule: The complementary pair-rule assigns a
complementary DNA base to each nucleotide base. To make an algorithm more
complex, a researcher can create his or her own complementary rule
by using the DNA bases. As there are four DNA bases, i.e., A, C, G,
and T, the complementary pair-rule can be assigned as A-T and
C-G, which means that A is replaced with T and vice versa, and C is
replaced with G and vice versa.
As there are 4 DNA bases, there can be 4!, i.e., 24 encoding rules to
transform binary values of any data into DNA bases. After converting
the binary values into DNA bases, the DNA bases can be converted into
other DNA bases by using the complementary pair-rule [15].
For example, if the binary value of any data is 00011011, it can be
converted into DNA bases as ATGC by using the DNA encoding rule in which
A, T, G, and C are 00, 01, 10, and 11, respectively. Then, ATGC
can be converted into TACG by using the complementary pair rule.
This rule is popular in DNA cryptography.

Table 3 DNA XOR rule.

XOR    A    T    G    C
A      A    G    C    T
T      C    T    G    A
G      G    C    T    A
C      T    G    C    A

(8) DNA XOR Rule: Because of the recent advancement of DNA computing,
many researchers are nowadays using the DNA XOR rule in
related fields. In DNA computing, the DNA XOR operation is analogous
to the binary XOR operation. Table 3 shows a DNA computing-based
XOR rule.
For example, if there is a DNA sequence ATGCCGTA and a key or DNA
strand CTAGTAGC, the sequence can be converted into another DNA sequence
TTGCGGGT by using the DNA XOR rule of Table 3, where each row is indexed
by a base of the sequence and each column by the corresponding base of the
key. Then, the original DNA sequence can be retrieved by using the newly
generated DNA sequence, i.e., TTGCGGGT, and the original key CTAGTAGC. To
retrieve the original DNA sequence, the same DNA XOR rule must be used
again.
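
The encoding, complementary, and XOR rules described in this subsection can
be sketched together in Python. This is a minimal illustration, assuming the
encoding rule 00-A, 01-T, 10-G, 11-C, the pair-rule A-T and C-G, and Table 3
transcribed exactly as printed, with rows indexed by the bases of the
sequence and columns by the bases of the key. The names ENCODE, PAIR,
XOR_TABLE, encode, complement, and dna_xor are hypothetical.

```python
from itertools import permutations

# DNA encoding rule from the text: 00-A, 01-T, 10-G, 11-C.
ENCODE = {"00": "A", "01": "T", "10": "G", "11": "C"}
# Complementary pair-rule A-T and C-G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}
# Table 3 transcribed as printed: XOR_TABLE[row][column].
XOR_TABLE = {
    "A": {"A": "A", "T": "G", "G": "C", "C": "T"},
    "T": {"A": "C", "T": "T", "G": "G", "C": "A"},
    "G": {"A": "G", "T": "C", "G": "T", "C": "A"},
    "C": {"A": "T", "T": "G", "G": "C", "C": "A"},
}


def encode(bits):
    """Convert a bit string of even length into DNA bases, two bits per base."""
    return "".join(ENCODE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def complement(strand):
    """Replace every base with its partner under the complementary pair-rule."""
    return "".join(PAIR[base] for base in strand)


def dna_xor(strand, key):
    """Look up each (sequence base, key base) pair in Table 3: row = sequence, column = key."""
    return "".join(XOR_TABLE[s][k] for s, k in zip(strand, key))


print(encode("00011011"))               # ATGC
print(complement("ATGC"))               # TACG
cipher = dna_xor("ATGCCGTA", "CTAGTAGC")
print(cipher)                           # TTGCGGGT
print(dna_xor(cipher, "CTAGTAGC"))      # ATGCCGTA
print(len(list(permutations("ATGC"))))  # 24 possible encoding rules
```

The round trip is checked here only for the worked example of the text. The
table as printed does not appear to be invertible for every pair of bases
(for instance, both A and C combined with the key base T give G), so a
practical scheme would need a rule in which every key column is a
one-to-one mapping.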

3. Applications of DNA computing


Nowadays, DNA computing is used in many sectors like DNA chips,
cryptography, data storage, computation, and many more. In this section,
many applications of DNA computing are discussed in detail.
(1) DNA Chip: Recent technology has changed drastically due
to the use of DNA chips and microarrays [39]. DNA chip technology
uses microarrays of molecules immobilized on a solid surface for
biochemical analysis. Microarrays are used by scientists and researchers
for expression analysis, genotyping of the genome, DNA resequencing,
and polymorphism detection. DNA computing-based chips are like silicon
chips that are mainly used for data storage in the form of DNA
sequences [40]. These chips are mainly made up of a huge number
of embedded spots on the solid surface, in which each spot keeps probes.
Here, a probe refers to a small DNA sequence. In each spot,
when a DNA sequence binds to these probes, data or files are
electronically calculated on the basis of the ratio of probes combined with
the DNA sequence. DNA computing is attractive for storing data because
a large number of data items or files can be stored within a condensed
volume. There are about 10^21 DNA bases in 1 g of DNA, and approx. 700
TB of data can be stored in 1 g of DNA [41]. Therefore, a very large number
of files can be stored in a few grams of DNA. The speeds of reading
and writing in DNA computing are high, which also motivates many
researchers or other professionals to use DNA computing.
Nowadays, manufacturers put much emphasis on developing
small biochips with high data handling capacity. These types of
biochips can be highly beneficial in many areas, including cryptography.
A DNA chip-based cryptographic technique consists of the following steps:
Step 1: At first, a group of specific probes is considered as the
encryption key, and a collection of corresponding probes containing
complementary DNA sequences is considered as the decryption key.
Both the keys are generated by a key generator. The encryption key
and decryption key are securely sent to the sender and receiver,
respectively.
Step 2: In the second step, the plaintext of a file or data is
transformed into its binary form. Then, this binary form is embedded into
a DNA chip in terms of a DNA sequence. It must be noted that no one is
able to decrypt the ciphertext without the decryption key.
Step 3: The receiver decrypts the ciphertext, i.e., cipher DNA, using
the complementary DNA sequence or decryption key. Then, the
receiver can use appropriate software to retrieve the original plaintext.
(2) Genetic Programming: A Genetic Algorithm (GA) can be defined as
a soft computing technique based on natural evolution. GA replicates
the natural selection process in which the fittest individuals are
selected to reproduce offspring of the subsequent generation. It is
mainly used to search for optimal values in each successive generation.
The major advantage of DNA-GA approaches is the combination of the
high storage capacity and massive parallelism of DNA with the search
capability of GAs. In DNA computing, GA is one of the best solutions
to break the limit of the brute-force method. As 1 g of DNA has 10^21
DNA bases, the results of each generation of GA can be easily encoded
in DNA bases by using the binary form. A large population can carry a
large range of genetic diversity. Thus, in a few generations, it can
generate high-fitness chromosomes, and the size of the search space is
effectively reduced. In addition, in vitro operations on DNA inherently
include errors. These errors are tolerated better by GAs than by
deterministic algorithms. Here, errors can even be regarded as
contributing factors in GA.
In 1997, Deaton et al. [42] proposed a GA based on DNA
computing for efficient encoding. Yoshikawa et al. [43] have combined
pseudo-bacterial GA with the DNA encoding method. In 1999, Chen
et al. [44] implemented DNA computing-based GA in a laboratory
to solve some problems, such as the royal road, Max 1s, and cold
war problems. Wood and Chen [45] have proposed a DNA strand
design that is well-matched for the royal road problem by using a
GA. They have used in vitro evolution starting with a random
population of DNA strands. In 2004, Yuan et al. [46]
designed a DNA computing-based GA to solve the maximal clique
problem, which is able to produce an accurate solution within
a few rounds. The simulation of Yuan et al. indicates that the time
required for their scheme is linear in the number of vertices of the
particular network.
(3) Cryptography: Information security plays a major role in several areas
like military relations, confidential business, financial institutions, and
so on. In conventional systems, customers or users are restricted to their
own domain. With today’s advanced technologies, users are much interested
in storing and accessing data outside of their domain. As a result,
data security has become extremely important due to the involvement
of numerous attackers and hackers, who are constantly attempting to
hack users’ personal and sensitive data or files for their own gain or to
generate revenue. They often replace the original data with fake data.
Thus, users’ data may again face data security problems.
Several cryptographic techniques have already been proposed by researchers
for data transfer. Identity-Based Encryption (IBE) was first introduced by
Shamir [47]. Here, the data sender defines an identity that should be
matched by the recipient to decrypt the data. After some years, a novel
model, namely fuzzy identity-based encryption, was suggested in
which a user’s identity is represented with several descriptive attributes
[48]. Sahai and Waters [49] have proposed Attribute-Based Encryption
(ABE) for providing access to complex data. ABE is divided into two
types: ciphertext policy-based ABE and key policy-based ABE [50].
Most of the conventional approaches are based on complex mathematical
equations, in which researchers mainly focus on increasing the
complexity through the modification of the equations [51]. To improve
data security, some conventional approaches encrypt data, and then,
insert the encrypted data into various multimedia data or files used as
the cover media, such as image, video, audio, etc. [52]. Hence, malicious
users or hackers are not able to find the required data to breach the
cryptosystems.
DNA cryptography is currently one of the fastest-growing technologies
based on the concept of DNA computing. In DNA cryptography,
DNA computing is used to encrypt the data to achieve a strong security
mechanism that prevents unauthorized users, attackers, and malicious users
from reading the original data content. Here, DNA bases, namely A,
T, G, and C are used for data encryption. Many DNA computing-based
cryptographic algorithms have been proposed, including asymmetric
and symmetric key cryptosystems, triple stage DNA cryptography,
DNA-based chaotic computation, and many more. For a strong DNA-based
encryption scheme, there are a few conditions as mentioned below:
In [53], a novel access control model is introduced in which a decimal
encoding rule, complementary pair-rule, ASCII values, and DNA
encoding rule are used for data encryption. In the coupled map
lattice-based scheme [54], the authors have used a coupled map lattice and
DNA strands for improving data security. Along with DNA cryptography,
this scheme uses the SHA-256 algorithm. In 2019, Wang
et al. [55] proposed a recombinant DNA technique for data
security. A bio-inspired cryptosystem based on DNA computing was proposed
by Reddy et al. [56], in which the central dogma of molecular biology
is used to encrypt and decrypt any data. This scheme consists of three
main phases: (1) key generation, (2) encryption, and (3) decryption.
However, this scheme does not support much randomness and takes
much time for data encryption.
(4) Polymerase Chain Reaction: As discussed earlier, PCR is very
effective in producing a large number of copies from a small amount
of DNA. Here, primers are required to design an amplification process.
Nowadays, PCR is also used for cryptography by making the
encryption key or secret key a compound key. This means that the
encryption key contains both the PCR primer pairs and the public key,
while the decryption key contains both the complementary primer pairs and
the private key [57]. To communicate any data, two primers are shared
between the sender and receiver in this technique through a safe channel
before starting the data encryption processes.
In the process of data encryption, algorithms like RSA, Advanced
Encryption Standard (AES), etc., may be used as the preprocessing stage.
Then, a coding rule converts the ciphertext into the corresponding DNA
strand. Thus, a new ciphertext is generated. As shown in Fig. 10, DNA
cipher refers to the ciphertext of the data in the form of a DNA sequence
and plaintext refers to the binary form of the data. The DNA cipher
is flanked by the secret primers and then mixed with
other unknown DNA strands. PCR technique is used to generate these
unknown DNA strands. The sender then sends the mixed DNA to the
receiver. The receiver obtains the DNA cipher by performing PCR
with the aid of the secret primers, and then, reverses the entire data
encryption operation. No one can retrieve the DNA cipher without
prior knowledge of the two primers.
The above-mentioned PCR approach has several implementations. Cui
et al. [58] have suggested a data encryption approach based on PCR
amplification and DNA coding. Tanaka et al. [59] have introduced a public
key scheme based on a DNA computing-based one-way function. Yamamoto
et al. [60] have proposed a novel encryption scheme that uses both modern
algorithms and molecular techniques to provide two-level security by using
the PCR-based large-scale DNA space. If one security level is breached, the
system is kept secure by the other level. However, one of the major
issues of this PCR-based security is the exchange of the secret key between
the sender and recipient.

[Figure 10: flow between Plaintext, Ciphertext, and DNA Cipher.
1. Ciphertext is generated by using a pre-processing algorithm.
2. Ciphertext is converted into DNA sequence.
3. Retrieve ciphertext by using DNA based decryption key.
4. Retrieve plaintext using private key of the pre-processing algorithm.]
Fig. 10 PCR based cryptography.
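
The flow of PCR-based cryptography can be imitated in software as a rough
sketch. Everything below is an assumption made for illustration: toy_cipher
is a repeating-key XOR that merely stands in for the pre-processing
algorithm (the text names RSA or AES) and is not secure, the primers FWD and
REV are arbitrary 20-base strings, and string matching on both ends stands
in for PCR amplification with the secret primers. The coding rule is the
2-bit rule A=00, T=01, G=10, C=11.

```python
import random

BASES = "ATGC"  # index i encodes the 2-bit value i: A=00, T=01, G=10, C=11


def bytes_to_dna(data):
    """Coding rule: four bases per byte, most significant bit pair first."""
    return "".join(BASES[(byte >> shift) & 3] for byte in data for shift in (6, 4, 2, 0))


def dna_to_bytes(strand):
    out = bytearray()
    for i in range(0, len(strand), 4):
        value = 0
        for base in strand[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


def toy_cipher(data, key):
    """Placeholder for the pre-processing cipher; repeating-key XOR, NOT secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def send(plaintext, key, fwd, rev, decoys=1000, seed=1):
    """Flank the DNA cipher with the secret primers and hide it among random strands."""
    rng = random.Random(seed)
    cipher_dna = fwd + bytes_to_dna(toy_cipher(plaintext, key)) + rev
    pool = ["".join(rng.choice(BASES) for _ in cipher_dna) for _ in range(decoys)]
    pool.append(cipher_dna)
    rng.shuffle(pool)
    return pool


def receive(pool, key, fwd, rev):
    """PCR analogue: only a strand flanked by both primers is recovered."""
    hits = [s for s in pool if s.startswith(fwd) and s.endswith(rev)]
    payload = hits[0][len(fwd):len(hits[0]) - len(rev)]
    return toy_cipher(dna_to_bytes(payload), key)


# Hypothetical secret primers and key shared over a safe channel.
FWD = "AATGCCTAGTTCCTAATGGC"
REV = "CCGTATGATCCGTACGGCTT"
mixture = send(b"DNA", b"k3y", FWD, REV)
print(receive(mixture, b"k3y", FWD, REV))  # b'DNA'
```

In this sketch, an observer who holds the mixture but not the primers has to
consider every strand as a candidate, which reflects the role of the two
primers as part of the key.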

(5) DNA-based Steganography: In DNA-based steganography, one or
more DNA sequences are used as inputs in terms of the plaintext of the
data. Randomly generated DNA-based secret key strands are attached
to the DNA input sequence. The resulting plaintext DNA sequence is
then hidden by mixing it with a large number of randomly generated
DNA strands. In the data decryption phase, the secret key is used to
decode the DNA strands by using a collection of known DNA separation
methods. Such plaintext messages or DNA sequences can be
separated by using the hybridization process with the help of the
complements of the secret key strands. As the DNA-based steganography
approach minimizes the total cost of the encryption phase, it is
efficient. However, it is not secure against attacks that are based on
statistical analysis.
Risca [61] has used standard biological protocols to propose a
DNA-based steganographic technique. This approach encrypts the
information by encoding it as a DNA sequence flanked by two separate
private primers. Gehani et al. [62] have suggested an improved
technique by distinguishing the distractor DNA strands from the
probability distribution of the plaintext of the file or data or message.
Roy et al. [63] have proposed a DNA computing-based steganographic
scheme in which confidential data are embedded into a document, i.e.,
a Word document. Here, at first, the plaintext is embedded into
a DNA strand, and then, this DNA strand is again encrypted by using
DNA computing. In 2015, Gupta and Singh [64] proposed
a DNA computing-based data hiding technique by using two main
processes: (1) substitution process, and (2) central dogma. They have
converted data into a DNA sequence. Then, it is transformed into a
ribonucleic acid sequence followed by a transformation into a protein
sequence. Tuncer and Avci [65] have proposed a novel probabilistic secret
sharing scheme by utilizing the DNA XOR operation. In this scheme, data
is embedded into the green, red, and blue channels of a random cover
image. However, this scheme is vulnerable to man-in-the-middle
attacks. Wang et al. [66] proposed a data hiding scheme in 2017
by using a randomly generated DNA XOR rule. This scheme consumes
much time, and it is mainly based on two algorithms: (1) embedding
watermarking, and (2) extracting watermarking.
(6) DNA Fingerprint: DNA computing-based fingerprinting involves
detecting variations of DNA bases in a particular region of the
DNA sequence, as a short sequence of DNA bases is repeated several
times in the actual sequences. During density gradient centrifugation,
these repeated DNA sequences are isolated from bulk genomic DNA as
distinct peaks. It is a lab procedure, which is mainly used to solve
criminal cases. Here, the DNA sample of the crime scene is matched
with a suspect’s DNA sample. If both the DNA samples are the same,
then in most of the cases, the suspect is considered guilty of the crime.
Otherwise, an investigation is conducted in some other aspects. DNA
fingerprinting is also used for paternity determination [67].
(7) Boolean Circuit: The Boolean circuit is considered an important
Turing-equivalent model of parallel computation. Here, an n-input
bounded fan-in Boolean circuit can be visualized as a directed
acyclic graph, S. There are mainly two types of nodes in the graph,
namely input nodes with in-degree zero and gate nodes with in-degree
at most two. Each input node is associated with a Boolean variable
from the input set, and each gate node is associated with a Boolean
function. The collection of all such functions is known as the
circuit basis.
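The graph view of a bounded fan-in circuit can be made concrete with a small sketch. The data layout (a dictionary that maps each gate node to its Boolean function and its predecessor nodes) and the example XOR circuit are assumptions for illustration only; they are not the DNA encoding used in [5] or [68].

```python
import operator
from typing import Callable, Dict, List, Tuple

# A gate node is (Boolean function, list of predecessor node names).
Gate = Tuple[Callable[..., bool], List[str]]


def evaluate(gates: Dict[str, Gate], inputs: Dict[str, bool], output: str) -> bool:
    """Evaluate the circuit by visiting the directed acyclic graph from the output."""
    values = dict(inputs)  # input nodes have in-degree zero

    def value(node: str) -> bool:
        if node not in values:
            func, preds = gates[node]
            if not 1 <= len(preds) <= 2:
                raise ValueError("gate nodes must have in-degree one or two")
            values[node] = func(*(value(p) for p in preds))
        return values[node]

    return value(output)


def depth(gates: Dict[str, Gate], node: str) -> int:
    """Length of the longest path from any input node to the given node."""
    if node not in gates:
        return 0
    return 1 + max(depth(gates, p) for p in gates[node][1])


# Example circuit over the basis {AND, OR, NOT}: x1 XOR x2.
xor_circuit: Dict[str, Gate] = {
    "or1": (operator.or_, ["x1", "x2"]),
    "and1": (operator.and_, ["x1", "x2"]),
    "not1": (operator.not_, ["and1"]),
    "out": (operator.and_, ["or1", "not1"]),
}

result = evaluate(xor_circuit, {"x1": True, "x2": False}, "out")
```

In this example, the circuit has four gate nodes (its size) and a depth of three, which are the two quantities that the simulation costs discussed in this item refer to.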
Researchers have developed DNA computing-based Boolean circuits.
The first DNA computing-based Boolean circuit was proposed by
Ogihara and Ray [5]. They have implemented a real-time simulation
of the depth of the Boolean circuit. In [68], it has been estimated
that the time complexity of simulating the Boolean circuit must be
proportional to the circuit’s size. A DNA computing-based Boolean
circuit is based on the polymerase chain reaction. Here, each logic
gate consists of only one DNA strand, which significantly reduces
leakage reactions and signal restoration steps, so the performance
of the circuit is improved. A large logic circuit can be built from
the gates by simple cascading. In particular, a compact logic circuit
that calculates the square root of a four-bit input has been
demonstrated [69]. Here, R1, R2, R3, and R4 are the input bits, and
the total number of output bits is 8. The first four output bits
represent the ones position of the square root, while the next four
bits represent the first decimal position.
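The input and output format of such a square root circuit can be sketched as a truth table. The sketch below assumes that the first decimal digit is truncated (not rounded) and that each digit is written as a four-bit binary number; the actual encoding used in [69] may differ.

```python
import math


def sqrt_outputs(r: int) -> str:
    """Return the 8 output bits for a four-bit input r (0 to 15).

    Bits 1-4: ones position of sqrt(r); bits 5-8: first decimal position.
    Truncation and the 4-bit digit encoding are assumptions for illustration.
    """
    if not 0 <= r <= 15:
        raise ValueError("input must fit in four bits")
    tenths = math.isqrt(100 * r)          # floor(10 * sqrt(r)), exact integer arithmetic
    ones, first_decimal = divmod(tenths, 10)
    return f"{ones:04b}{first_decimal:04b}"


# Truth table for all 16 inputs, keyed by the input bits R1 R2 R3 R4.
truth_table = {f"{r:04b}": sqrt_outputs(r) for r in range(16)}
```

For instance, the input 1111 (15) has a square root of about 3.87, so under these assumptions the output bits are 0011 followed by 1000.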
(8) Clustering: Clustering deals with the construction of meaningful
relationships in a high-dimensional complex dataset. DNA computing
can be used to develop clustering techniques. Such techniques are
suitable when dealing with huge datasets that contain heterogeneous
characteristics. Nowadays, clustering is used in many application
areas, such as facility allocation [70], signal processing [71],
astronomy [72], medical data analysis [73], and many more. The main
challenge of clustering is the combinatorial explosion of alternatives,
which must be carefully considered. The combinatorial effort of
clustering is enormous in essence. To construct an optimal solution
for the clustering objective, all combinations of data points must be
computed and processed for all possible numbers of clusters. Given
such a large number of calculations, a silicon-based computer may not
be very efficient when dealing with huge problems of high
dimensionality.
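The size of this combinatorial explosion can be quantified with a short sketch. The number of ways to split n data points into exactly k non-empty clusters is the Stirling number of the second kind, and summing over all k gives the Bell number. This counting argument is standard combinatorics added here for illustration; it is not taken from the cited clustering papers.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Number of ways to partition n points into exactly k non-empty clusters."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # The n-th point either joins one of the k existing clusters
    # or forms a new cluster on its own.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def bell(n: int) -> int:
    """Number of clusterings of n points over all possible numbers of clusters."""
    return sum(stirling2(n, k) for k in range(n + 1))


candidates_for_10_points = bell(10)  # 115975
```

Even ten data points already admit more than one hundred thousand candidate clusterings, and the count grows faster than exponentially with n, which is the motivation for exploring candidates in parallel.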
A DNA computing-based clustering technique helps to reduce the
time complexity by using the massively parallel computation of
DNA computing. Here, all possible solutions can be explored at once
in parallel. A silicon-based computer solves a problem by assigning
tasks to numerous processes, and this allocation results in
sequential or linear processing. On the other hand, a DNA
computing-based technique can solve such problems by using a single
parallel process. Interestingly, this novel technology is also
energy efficient. Each DNA computing-based clustering algorithm is
based on a certain underlying rationale and supports a positive
understanding of the data. In addition, such an algorithm also
derives from its fundamental optimization technique, computational
enhancements, and validation tools. Many researchers have also
proposed clustering techniques by using fuzzy clustering [74] and
principal curve techniques [75].
(9) DNA Computer: A DNA computer uses DNA bases, namely A, T, G,
and C, as the memory units. Here, recombinant DNA techniques are
used for executing fundamental operations. Test tubes are used in
DNA computers for computations. Sometimes, a glass surface coated
with 24 K gold is also used for computations. In a DNA computer, both
input and output are DNA strands in which information is encoded,
and a program is executed as a series of biochemical operations,
such as synthesizing, modifying, extracting, and cloning the strands
of DNA. The main difference between DNA computers and traditional
computers is the storage capacity. Owing to the high storage density
of DNA, DNA computers support a large memory by using the four DNA
bases, namely G, A, C, and T. On the other hand, electronic computers
support comparatively less storage capacity by using 0 and 1.
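The idea of a program as a series of operations on DNA strands can be sketched in software. In the toy model below, a test tube is a multiset of strands, and the function names (synthesize, amplify, extract, detect) are hypothetical stand-ins for the biochemical operations named above; this is only a simulation under simplifying assumptions (no errors, exact substring matching), not a description of a laboratory protocol.

```python
from collections import Counter
from typing import Iterable, Tuple

Tube = Counter  # maps a strand (string over A, C, G, T) to its number of copies


def synthesize(strands: Iterable[str]) -> Tube:
    """Create a tube that holds the given strands."""
    return Counter(strands)


def amplify(tube: Tube, copies: int = 2) -> Tube:
    """Cloning: multiply the number of copies of every strand."""
    return Counter({strand: n * copies for strand, n in tube.items()})


def extract(tube: Tube, pattern: str) -> Tuple[Tube, Tube]:
    """Split the tube into strands that contain the pattern and the remainder."""
    kept = Counter({s: n for s, n in tube.items() if pattern in s})
    return kept, tube - kept


def detect(tube: Tube) -> bool:
    """Report whether the tube contains at least one strand."""
    return sum(tube.values()) > 0


tube = synthesize(["ACGT", "TTGA", "TTGA"])
with_cg, without_cg = extract(tube, "CG")
answer_found = detect(amplify(with_cg, copies=4))
```

Here, the input and the output are both collections of strands, which mirrors the statement that a DNA computer reads and writes information encoded in DNA.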
(10) Scheduling: Nowadays, DNA computing is used in algorithms for the
job scheduling problem due to its unique properties, such as vast parallelism,
tiny parts, and organic edges. The job scheduling problem can be easily
managed by a standard computer or by a human. In [76], Ibrahim et al.
have proposed a DNA computing-based algorithm to solve the job
scheduling problem. They have added some operations to the evolu-
tionary operations to achieve better performance. Tian et al. [77] have
proposed another DNA computing-based algorithm in which an
encoding scheme for the job shop scheduling problem is developed
along with some novel DNA computing-based operations. Here,
all possible solutions are generated after an initial solution is
constructed. Then, DNA computing-based operations are utilized for
finding the optimal schedule. The complexity of the DNA computing-
based algorithm of Tian et al. is O(n²). In addition, the length of the
final strand of the optimal schedule is always within the appropriate range.
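The generate-then-filter idea behind such algorithms can be sketched on a much simpler problem. The example below is not the algorithm of Tian et al. [77] and does not model a full job shop; it assumes a hypothetical instance in which independent jobs with given durations are assigned to identical machines, enumerates every candidate assignment (the software analogue of generating all strands), and keeps the one with the smallest makespan.

```python
from itertools import product
from typing import List, Sequence, Tuple


def best_assignment(durations: Sequence[int], machines: int) -> Tuple[int, Tuple[int, ...]]:
    """Enumerate all machines**n assignments and return (makespan, assignment)."""
    best_makespan = None
    best_choice: Tuple[int, ...] = ()
    for assignment in product(range(machines), repeat=len(durations)):
        loads: List[int] = [0] * machines
        for job, machine in enumerate(assignment):
            loads[machine] += durations[job]
        makespan = max(loads)  # finishing time of the busiest machine
        if best_makespan is None or makespan < best_makespan:
            best_makespan, best_choice = makespan, assignment
    return best_makespan, best_choice


# Hypothetical instance: four jobs, two machines.
makespan, assignment = best_assignment([3, 5, 2, 4], machines=2)
```

On a silicon-based computer, this loop visits the candidates one after another, so the running time grows exponentially with the number of jobs. A DNA computing-based approach aims to represent all candidates simultaneously and to filter them with biochemical operations.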

4. Conclusions and future works


In this digital era, almost all transactions are executed over the internet.
As there are numerous attackers and hackers, users’ confidential data face
security issues. DNA computing can be used to improve data security.
DNA computing is a form of molecular computing in which DNA bases
(A, T, G, and C) are used for encoding and processing. However, DNA
computing is still in its early phase. In this chapter, the fundamentals of
human DNA, such as structure, polymerase chain reaction, history, etc.,
have been presented. The basics of DNA computing and its importance,
including its advantages, disadvantages, and operations, have been
discussed. Moreover, many applications of DNA computing in different
fields, such as DNA chips, genetic programming, data storage, cryptography,
and many more, have been presented in detail.
As DNA computing is an emerging field, it can be used in many advanced
technologies, such as cloud computing, big data, quantum computing, and
many more, to improve efficiency and performance. In a cloud computing
environment, data owners store their data on the cloud server. DNA
computing-based cryptography can be used to develop a novel model for
the cloud computing environment to improve data security. In the future,
DNA computing can also be used to store big data by using DNA bases.
Moreover, there is a huge scope to develop DNA computers for fast
processing and to use the concept of DNA computing in quantum comput-
ing to achieve better performance.

References
[1] L.M. Adleman, Computing with DNA, Sci. Am. 279 (2) (1998) 54–61.
[2] J.D. Watson, T.A. Baker, S.P. Bell, A. Gann, M. Levine, R. Losick, Molecular Biology
of the Gene (International Ed.), Pearson Education, 2004.
[3] L.M. Adleman, Molecular computation of solutions to combinatorial problems, Science
266 (5187) (1994) 1021–1024.
[4] D.I. Lewin, DNA computing, Comput. Sci. Eng. 4 (3) (2002) 5–8.
[5] M. Ogihara, A. Ray, Simulating Boolean circuits on a DNA computer, Algorithmica
25 (2) (1999) 239–250.
[6] S. Tagore, S. Bhattacharya, M. Islam, M.L. Islam, DNA computation: application and
perspectives, J. Proteom. Bioinform. 3 (7) (2010).
[7] D. Boneh, C. Dunworth, R.J. Lipton, J. Sgall, Making DNA computers error resistant,
DNA Based Comput. II 44 (1996) 163–170.
[8] S. Namasudra, G.C. Deka, Introduction of DNA computing in cryptography, in:
S. Namasudra, G.C. Deka (Eds.), Advances of DNA computing in cryptography,
Taylor & Francis, 2018, pp. 17–34.
[9] R. Deaton, J. Chen, J.-W. Kim, M.H. Garzon, D.H. Wood, Test tube selection of
large independent sets of DNA oligonucleotides, in: Nanotechnology: Science and
Computation, Springer, 2006, pp. 147–161.
[10] G.G. Owenson, M. Amos, D.A. Hodgson, A. Gibbons, DNA-based logic, Soft
Comput. 5 (2) (2001) 102–105.
[11] G. Paun, G. Rozenberg, A. Salomaa, DNA Computing: New Computing Paradigms,
Springer, 2005.
[12] C. Bancroft, T. Bowler, B. Bloom, C.T. Clelland, Long-term storage of information in
DNA, Science 293 (5536) (2001) 1763.
[13] S. Namasudra, G.C. Deka, R. Bali, Applications and future trends of DNA computing,
in: S. Namasudra, G.C. Deka (Eds.), Advances of DNA Computing in Cryptography,
Taylor & Francis, 2018, pp. 181–192.
[14] P. Pavithran, S. Mathew, S. Namasudra, G. Srivastava, A novel cryptosystem based on
DNA cryptography, hyperchaotic systems and a randomly generated Moore machine
for cyber physical systems, Comput. Commun. 188 (2022) 1–12.
[15] S. Namasudra, Fast and secure data accessing by using DNA computing for the cloud
environment, IEEE Trans. Serv. Comput. (2020), https://doi.org/10.1109/TSC.2020.3046471.
[16] H. Dongming, D.-F. Zhou, Y. Sheng, Y. Shaoliang, Z. Luozhi, Z. Xin, Image encryp-
tion using exclusive-OR with DNA complementary rules and double random phase
encoding, Phys. Lett. A 383 (9) (2019) 915–922.
[17] L. Kari, R. Kitto, G. Thierrin, Codes, involutions, and DNA encodings, in: Formal and
Natural Computing, Springer, 2002, pp. 376–393.
[18] S. Shah, D. Limbachiya, M.K. Gupta, DNACloud: a potential tool for storing
big data on DNA, arXiv preprint arXiv:1310.6992, 2013.
[19] A.C. Patel, C.G. Joshi, Deoxyribonucleic acid as a tool for digital information storage:
an overview, Indian J. Vet. Sci. Biotechnol. 15 (1) (2019) 1–8.
[20] D.L. Nelson, A.L. Lehninger, M.M. Cox, Lehninger Principles of Biochemistry,
Macmillan, 2008.
[21] J.D. Watson, F.H.C. Crick, Molecular structure of nucleic acids: a structure for
deoxyribose nucleic acid, Nature 171 (1953) 737–738, https://doi.org/10.1038/171737a0.
[22] L. Garibyan, N. Avashia, Research techniques made simple: polymerase chain reaction
(PCR), J. Invest. Dermatol. 133 (3) (2013), https://doi.org/10.1038/jid.2013.1.
[23] W.B. Coleman, G.J. Tsongalis, Laboratory approaches in molecular pathology—the
polymerase chain reaction, in: Diagnostic Molecular Pathology, Academic Press,
2017, pp. 15–23.
[24] L. Gonick, M. Wheelis, The Cartoon Guide to Genetics, Harper Perennial, 1991.
[25] K. Drlica, Understanding DNA and gene cloning: a guide for the curious, in:
Understanding DNA and gene cloning: a guide for the curious, ed. 2, John Wiley &
Sons, 1992.
[26] R.J. Lipton, Using DNA to solve NP-complete problems, Science 268 (4) (1995)
542–545.
[27] R.P. Feynman, D. Gilbert, Miniaturization, Reinhold, New York, 1961, pp. 282–296.
[28] E.B. Baum, Building an associative memory vastly larger than the brain, Science 268
(5210) (1995) 583–585.
[29] Q. Ouyang, P.D. Kaplan, S. Liu, A. Libchaber, DNA solution of the maximal clique
problem, Science 278 (5337) (1997) 446–449.
[30] Q. Liu, L. Wang, A.G. Frutos, A.E. Condon, R.M. Corn, L.M. Smith, DNA comput-
ing on surfaces, Nature 403 (6766) (2000) 175–179.
[31] L.M. Smith, R.M. Corn, A.E. Condon, M.G. Lagally, A.G. Frutos, Q. Liu, A.J.
Thiel, A surface-based approach to DNA computation, J. Comput. Biol. 5 (2)
(1998) 255–267.
[32] J.H. Reif, T.H. LaBean, M. Pirrung, V.S. Rana, B. Guo, C. Kingsford, G.S. Wickham,
Experimental construction of very large scale DNA databases with associative search
capability, in: International Workshop on DNA-Based Computers, Springer, Berlin,
Heidelberg, 2001, pp. 231–247.
[33] M. Sarkar, P. Ghosal, S.P. Mohanty, Exploring the feasibility of a DNA computer:
design of an ALU using sticker-based DNA model, IEEE Trans. Nanobioscience
16 (6) (2017) 383–399.
[34] S. Namasudra, R. Chakraborty, A. Majumder, N.R. Moparthi, Securing multimedia
by using DNA based encryption in the cloud computing environment, ACM Trans.
Multimedia Comput. Commun. Appl. 16 (3s) (2020) 1–19,
https://doi.org/10.1145/3392665.
[35] S. Namasudra, P. Roy, Time saving protocol for data accessing in cloud computing,
IET Commun. 11 (10) (2017) 1558–1565.
[36] https://cs.stanford.edu/people/eroberts/courses/soco/projects/2003-04/dna-computing/evaluation.htm,
2003 [Accessed on 12 May, 2021].
[37] J. Watada, R.B.A. Bakar, DNA computing and its applications, in: 2008 Eighth
international conference on intelligent systems design and applications, vol. 2, 2008,
pp. 288–294.
[38] L. Kari, S. Seki, P. Sosík, DNA Computing: Foundations and Implications, Handbook
of Natural Computing, Springer, 2012, pp. 1073–1127.
[39] G. Ventimiglia, S. Petralia, Recent advances in DNA microarray technology: an over-
view on production strategies and detection methods, BioNano Sci. 3 (4) (2013)
428–450.
[40] E. Czeizler, E. Czeizler, A short survey on Watson-Crick automata, Bull. EATCS
88 (3) (2006) 104–119.
[41] G.Z. Cui, Y. Liu, X. Zhang, New direction of data storage: DNA molecular storage
technology, Comput. Eng. Appl. 42 (26) (2006) 29–32.
[42] R. Deaton, R.C. Murphy, J.A. Rose, M. Garzon, D.R. Franceschetti, S.E. Stevens, A
DNA based implementation of an evolutionary search for good encodings for DNA
computation, in: Proceedings of 1997 IEEE International Conference on
Evolutionary Computation (ICEC’97), 1997, pp. 267–271.
[43] T. Yoshikawa, T. Furuhashi, Y. Uchikawa, The effects of combination of DNA coding
method with pseudo-bacterial GA, in: Proceedings of 1997 IEEE International
Conference on Evolutionary Computation (ICEC’97), 1997, pp. 285–290.
[44] J. Chen, E. Antipov, B. Lemieux, W. Cedeño, D.H. Wood, DNA computing
implementing genetic algorithms, in: Evolution as Computation, 1999, pp. 39–49.
[45] D.H. Wood, J. Chen, Physical separation of DNA according to royal road fitness,
in: Proceedings of the 1999 Congress on evolutionary computation-CEC99 (cat.
No. 99TH8406), vol. 2, 1999, pp. 1011–1016.
[46] Y. Li, C. Fang, Q. Ouyang, Genetic algorithm in DNA computing: a solution to the
maximal clique problem, Chin. Sci. Bull. 49 (9) (2004) 967–971.
[47] A. Shamir, Identity-based cryptosystems and signature schemes, in: Workshop on the
theory and application of cryptographic techniques, Springer, Berlin, Heidelberg,
1984, pp. 47–53.
[48] R. Yanil, G. Dawu, W. Shuozhong, Z. Xinpeng, New fuzzy identity-based encryption
in the standard model, Informatica 21 (3) (2010) 393–407.
[49] A. Sahai, B. Waters, Fuzzy identity-based encryption, in: Annual international confer-
ence on the theory and applications of cryptographic techniques, Springer, Berlin,
Heidelberg, 2005, pp. 457–473.
[50] J. Bethencourt, A. Sahai, B. Waters, Ciphertext-policy attribute-based encryption, in:
2007 IEEE symposium on security and privacy (SP’07), IEEE, 2007, pp. 321–334.
[51] C.C. Lin, N.-L. Hsueh, A lossless data hiding scheme based on three-pixel block
differences, Pattern Recognit. 41 (4) (2008) 1415–1425.
[52] S. Voloshynovskiy, T. Pun, J. Fridrich, F.P. González, N. Memon, Security of data
hiding technologies, Signal Process. 83 (10) (2003) 2065–2067.
[53] S. Namasudra, P. Roy, P. Vijayakumar, S. Audithan, B. Balamurugan, Time efficient
secure DNA based access control model for cloud computing environment, Future
Gener. Comput. Syst. 73 (2017) 90–105.
[54] W. Xingyuan, Y. Hou, S. Wang, R. Li, A new image encryption algorithm based on
CML and DNA sequence, IEEE Access 6 (2018) 62272–62285.
[55] Y. Wang, Q. Han, G. Cui, J. Sun, Hiding message based on DNA sequence and recom-
binant DNA technique, IEEE Trans. Nanotechnol. 18 (2019) 299–307.
[56] M.I. Reddy, A.P.S. Kumar, K.S. Reddy, A secured cryptographic system based on
DNA and a hybrid key generation approach, Biosystems 197 (2020).
[57] M. Roy, S. Chakraborty, K. Mali, S. Mitra, I. Mondal, R. Dawn, D. Das,
S. Chatterjee, A dual layer image encryption using polymerase chain reaction amplifi-
cation and DNA encryption, in: 2019 International Conference on Opto-Electronics and
Applied Optics (Optronix), IEEE, 2019, pp. 1–4.
[58] G. Cui, L. Qin, Y. Wang, X. Zhang, An encryption scheme using DNA technology, in:
2008 3rd International Conference on Bio-Inspired Computing: Theories and
Applications, IEEE, 2008, pp. 37–42.
[59] K. Tanaka, A. Okamoto, I. Saito, Public-key system using DNA as a one-way function
for key distribution, Biosystems 81 (1) (2005) 25–29.
[60] M. Yamamoto, S. Kashiwamura, A. Ohuchi, M. Furukawa, Large-scale DNA memory
based on the nested PCR, Nat. Comput. 7 (3) (2008) 335–346.
[61] V.I. Risca, DNA-based steganography, Cryptologia 25 (1) (2001) 37–49.
[62] A. Gehani, T. LaBean, J. Reif, DNA-based cryptography, in: Aspects of Molecular
Computing, Springer, 2003, pp. 167–188.
[63] S. Roy, S. Sadhukhan, S. Sadhu, S.K. Bandyopadhyay, A novel approach towards
development of hybrid image steganography using DNA sequences, Indian J. Sci.
Technol. 8 (22) (2015) 1–7.
[64] R. Gupta, R.K. Singh, An improved substitution method for data encryption using
DNA sequence and CDMB, in: International Symposium on Security in Computing
and Communication, Springer, Cham, 2015, pp. 197–206.
[65] T. Tuncer, E. Avci, A reversible data hiding algorithm based on probabilistic
DNA-XOR secret sharing scheme for color images, Displays 41 (2016) 1–8.
[66] B. Wang, Y. Xie, S. Zhou, C. Zhou, X. Zheng, Reversible data hiding based on DNA
computing, Comput. Intell. Neurosci. 2017 (2017), https://doi.org/10.1155/2017/7276084.
[67] P. Helminen, M.-L. Lokki, C. Ehnholm, A. Jeffreys, L. Peltonen, Application
of DNA fingerprints to paternity determinations, Lancet 331 (8585) (1988)
574–576.
[68] M. Amos, P.E. Dunne, DNA simulation of Boolean circuits, in: Proceeding of 3rd
Annual Genetic Programming Conference, 1997, pp. 679–683.
[69] Z. Ekmekc, B. Ulu, E. Ekmekci, A new molecular logic circuit with 4 bit input, Sens.
Actuators B 231 (2016) 655–658.
[70] R.B.A. Bakar, J. Watada, W. Pedrycz, A DNA computing approach to data clustering
based on mutual distance order, in: Proceedings 9th Czech-Japan Seminar, 2006,
pp. 139–145.
[71] K.L. Oehler, R.M. Gray, Combining image compression and classification using vector
quantization, IEEE Trans. Pattern Anal. Mach. Intell. 17 (5) (1995) 461–473.
[72] K.L. Wagstaff, V.G. Laidler, Making the most of missing values: object clustering with
partial data in astronomy, in: P.L. Shopbell, et al. (Eds.), Astronomical data analysis soft-
ware and systems XIV, vol. 347, Astronomical Society of the Pacific Conference Series,
2005, pp. 172–176.
[73] J. Lin, D. Karakos, D. Demner-Fushman, S. Khudanpur, Generative content models
for structural analysis of medical abstracts, in: Proceedings of the BioNLP workshop
on Linking Natural Language Processing and Biology at HLT-NAACL, 2006,
pp. 65–72.
[74] W. Pedrycz, Knowledge-Based Clustering: From Data to Information Granules, John
Wiley & Sons, 2005.
[75] L. Cleju, P. Fränti, X. Wu, Clustering based on principal curve, in: Scandinavian
Conference on Image Analysis, Springer, Berlin, Heidelberg, 2005, pp. 872–881.
[76] G.J. Ibrahim, T.A. Rashid, A.T. Sadiq, Evolutionary DNA computing algorithm for
job scheduling problem, IETE J. Res. 64 (4) (2018) 514–527.
[77] X. Tian, X. Liu, H. Zhang, M. Sun, Y. Zhao, A DNA algorithm for the job shop sched-
uling problem based on the Adleman-Lipton model, PLoS One 15 (12) (2020).

About the authors


Mr. Tarun Kumar is a research scholar in
the Department of Computer Science and
Engineering at the National Institute of
Technology Patna, Bihar, India. He is also
an Assistant Professor in the School of
Computing Science and Engineering, Gal-
gotias University, Greater Noida, India. He
has 16 years of experience in academics.
His research interests are Cloud Computing
and DNA Computing. He has published
several papers in peer-reviewed journals and
international conferences. He has also organized
and attended several workshops.

Suyel Namasudra is an assistant professor
in the Department of Computer Science
and Engineering at the National Institute
of Technology Agartala, Tripura, India.
Before joining the National Institute of
Technology Agartala, Dr. Namasudra was
an assistant professor in the Department of
Computer Science and Engineering at the
National Institute of Technology Patna,
Bihar, India, and a post-doctorate fellow
at the International University of La Rioja
(UNIR), Spain. He received his Ph.D.
degree in Computer Science and Engineering from the National Institute
of Technology Silchar, Assam, India. His research interests include
DNA computing, blockchain technology, cloud computing, and IoT.
Dr. Namasudra has edited 4 books and has 5 patents and 60 publications in
conference proceedings, book chapters, and refereed journals like IEEE
TII, IEEE T-ITS, IEEE TSC, IEEE TCSS, ACM TOMM, ACM
TALLIP, FGCS, CAEE, and many more. He has served as a Lead Guest
Editor/Guest Editor in many reputed journals like ACM TOMM (ACM,
IF: 3.144), CAEE (Elsevier, IF: 3.818), CAIS (Springer, IF: 4.927), CMC
(Tech Science Press, IF: 3.772), Sensors (MDPI, IF: 3.576), and many more.
Dr. Namasudra has participated in many international conferences as an
Organizer and Session Chair. He is a member of IEEE, ACM, and IEI.
Dr. Namasudra has been featured in the list of the top 2% scientists in the
world in 2021 and 2022, and his h-index is 25.
