CHAPTER 1
INTRODUCTION
In the digital age, the need for secure communication and data protection has led to the
development of numerous steganographic techniques. Steganography, the art of hiding
information within other non-suspicious data, has traditionally been applied to digital media
such as images, audio, and video files. However, with the rapid advancements in
biotechnology and the expanding capabilities of DNA synthesis and sequencing, a novel and
intriguing method has emerged: DNA-based steganography.
DNA (deoxyribonucleic acid) is the fundamental molecule that carries the genetic
instructions for all known living organisms and many viruses. Its unique structure, composed
of four nucleotide bases—adenine (A), cytosine (C), guanine (G), and thymine (T)—forms
the basis for its ability to store vast amounts of information in a highly compact and efficient
manner. Each sequence of nucleotides encodes specific genetic information, which can be
read and interpreted by biological systems. This intrinsic information density and complexity
make DNA an ideal medium for steganographic purposes.
Error Rates
1. Secure Communication
2. Data Storage
1.3 Modules:
Encryption Module:
In this module, the user enters a message and encrypts it into a secured DNA
sequence.
Decryption Module:
In this module, the user enters the secured DNA sequence and decrypts it back to the
original message.
1.4 Objectives
The primary aim of this project is to analyze the “Pima Indian Diabetes Dataset” and
“Cleveland Heart Disease Dataset” and use Support Vector Machine, Naïve Bayes, and K-
Nearest Neighbors for prediction. The secondary aim is to develop a web
application that allows users to predict heart disease and diabetes utilizing the prediction
engine.
CHAPTER 2
LITERATURE SURVEY
As access to different types of data through the Internet becomes more popular, more
common, and faster, and because the Internet is an insecure channel, sending data
securely becomes more challenging. Accordingly, sensitive information or secret messages
must be protected so that only the authorized receiver can retrieve them. There are several
approaches to information security, such as cryptography and steganography. Cryptography
is the study of converting information from its comprehensible form into an
incomprehensible form. Steganography is the art or science of hiding information such as
text, audio, or video in another cover medium, such as text, audio, or video, so that the
existence of the data is not obvious[1]. Digital watermarking is related to copyright
protection and can be thought of as an application of steganography. It marks the
information about the carrier as a watermark[2]. Recently, biotechnology has been applied
in many areas of human life. Using DNA as a carrier instead of other cover media (text,
audio, video, etc.) to hide secret messages has become a more secure choice of cover.
Other cover media may be distorted in ways that humans can detect, and their embedding
capacity is low, so they cannot hide a large amount of data. The capacity of DNA is very
high, so DNA steganography is used to overcome the challenge of embedding capacity[3].
DNA is structured as a double helix, as shown in Fig. 1, that is made up of four nucleotide
bases: adenine (A), thymine (T), cytosine (C), and guanine (G). Hence, DNA is viewed as a
sequence of bases, for example: AAGTCGATCGATCATCGAT…
CHAPTER 3
SYSTEM REQUIREMENTS
OS: Windows 10
RAM: Minimum of 4 GB
CHAPTER 4
SYSTEM DESIGN
4.1 BLOCK DIAGRAM
CHAPTER 5
IMPLEMENTATIONS
Implementing DNA-based steganography involves several key steps, from encoding data into
DNA sequences to synthesizing and retrieving the DNA. Below is a detailed guide to the
implementation process.
Step-by-Step Implementation
1. Data Encoding
2. Mapping Binary to DNA: Map the binary data to DNA nucleotides (A, T, C, G).
o Common Mapping Schemes:
00 → A
01 → T
10 → C
11 → G
o Example: 01001000 → TACA
4. Concatenation and Padding: Combine the encoded segments and add necessary
padding for sequence uniformity.
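The binary-to-DNA mapping above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the function names `text_to_bits` and `bits_to_dna` are hypothetical, the text is assumed to be UTF-8 encoded, and the padding rule (append a single 0 bit to an odd-length input) is an assumption, since the report does not specify one.

```python
# Mapping taken from the scheme listed above (00→A, 01→T, 10→C, 11→G).
BITS_TO_NT = {"00": "A", "01": "T", "10": "C", "11": "G"}


def text_to_bits(text: str) -> str:
    """Return the 8-bit binary form of each byte of the UTF-8 encoded text."""
    return "".join(format(byte, "08b") for byte in text.encode("utf-8"))


def bits_to_dna(bits: str) -> str:
    """Map each pair of bits to one nucleotide."""
    if len(bits) % 2:
        bits += "0"  # assumed padding rule: append one 0 bit to odd-length input
    return "".join(BITS_TO_NT[bits[i:i + 2]] for i in range(0, len(bits), 2))


print(bits_to_dna("01001000"))          # TACA, matching the example above
print(bits_to_dna(text_to_bits("Hi")))  # TACATCCT
```

Each byte becomes four nucleotides under this scheme, so an n-byte message yields a strand of 4n bases before any padding or indexing is added.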
2. DNA Synthesis
1. Sequence Design: Design the DNA sequence with the encoded data.
o Ensure sequences are biologically viable and do not form unintended secondary
structures.
1. Vector Preparation: Insert the synthetic DNA into a vector (e.g., plasmid) for
stability and ease of handling.
o Vectors are circular DNA molecules that can replicate within host cells.
2. Transformation: Introduce the vector into a host organism (e.g., bacteria) via
transformation methods (e.g., heat shock, electroporation).
o The host organism replicates the DNA, effectively storing the encoded data.
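The sequence-design requirement under DNA Synthesis (sequences should be biologically viable) is often approximated in software by screening candidate strands before synthesis. The sketch below checks two commonly used proxies, GC content and the longest run of a repeated base. The function names and the thresholds (40 to 60 percent GC, runs of at most 3) are illustrative assumptions, not values from this report, and the check does not detect secondary structures.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of one repeated base."""
    longest = run = 0
    previous = None
    for base in seq:
        run = run + 1 if base == previous else 1
        longest = max(longest, run)
        previous = base
    return longest


def passes_design_checks(seq: str, gc_min: float = 0.4,
                         gc_max: float = 0.6, max_run: int = 3) -> bool:
    """Return True if the strand meets the assumed GC and run-length limits."""
    within_gc = gc_min <= gc_content(seq) <= gc_max
    return within_gc and longest_homopolymer(seq) <= max_run


print(passes_design_checks("ATGCATGC"))  # True
print(passes_design_checks("AAAAAAGC"))  # False
```

A strand that fails the screen could be re-encoded with a different mapping or padding before it is sent for synthesis.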
4. Data Retrieval
2. DNA Sequencing: Sequence the extracted DNA to retrieve the encoded data.
o Methods: Next-Generation Sequencing (NGS), Sanger sequencing.
3. Data Decoding
4. Sequence Analysis: Analyze the sequenced data to identify the encoded segments.
o Use bioinformatics tools to align and identify the correct sequences.
7. Binary to Original Data: Convert the binary data back to its original form.
o Example: 01001000 → ASCII → "H"
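The decoding steps above reverse the mapping used in Data Encoding. The following sketch assumes the same scheme (A→00, T→01, C→10, G→11) and an ASCII payload. The names `dna_to_bits` and `bits_to_text` are hypothetical, and dropping trailing bits that do not fill a whole byte is an assumed way of handling padding.

```python
# Inverse of the encoding scheme (00→A, 01→T, 10→C, 11→G).
NT_TO_BITS = {"A": "00", "T": "01", "C": "10", "G": "11"}


def dna_to_bits(dna: str) -> str:
    """Translate each nucleotide back to its two-bit code."""
    return "".join(NT_TO_BITS[nt] for nt in dna.upper())


def bits_to_text(bits: str) -> str:
    """Group bits into bytes and decode them as ASCII.

    Trailing bits that do not fill a whole byte are ignored
    (assumed padding handling).
    """
    usable = len(bits) - len(bits) % 8
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("ascii")


print(dna_to_bits("TACA"))                # 01001000
print(bits_to_text(dna_to_bits("TACA")))  # H
```

A nucleotide outside A, T, C, G raises `KeyError`, and a byte above 127 raises `UnicodeDecodeError`, which gives a simple signal that a sequencing error or a wrong mapping was involved.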
CHAPTER 6
RESULTS
Increased Data Density: By encoding data within the four nucleotide bases (A, T, C,
G) of DNA, significantly higher data densities are achieved compared to traditional
storage media. For instance, a single gram of DNA can theoretically store up to 215
petabytes (215 million gigabytes) of data.
Optimized Encoding Algorithms: Advanced algorithms and novel encoding schemes,
such as Huffman coding and modular arithmetic, have been developed to maximize
storage capacity and efficiency, ensuring minimal redundancy and high data density.
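Huffman coding, named above, shortens the bit string before it is mapped to nucleotides by giving frequent symbols shorter codes. The sketch below is a generic textbook construction using Python's `heapq`. It is not the project's implementation, the function name `huffman_codes` is hypothetical, and how the resulting bits would be combined with the DNA mapping is left open.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict:
    """Build a prefix-free code table {symbol: bit string} for the text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreaker, partial code table).
    heap = [(weight, i, {sym: ""}) for i, (sym, weight) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, table1 = heapq.heappop(heap)
        w2, _, table2 = heapq.heappop(heap)
        # Prefix 0 to the lighter subtree's codes and 1 to the heavier one's.
        merged = {sym: "0" + code for sym, code in table1.items()}
        merged.update({sym: "1" + code for sym, code in table2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


message = "aaaabbc"
codes = huffman_codes(message)
encoded = "".join(codes[ch] for ch in message)
print(len(encoded), "bits instead of", 8 * len(message))  # 10 bits instead of 56
```

At two bits per nucleotide, the 10-bit result would need 5 bases instead of the 28 required for the uncompressed 56 bits, although the code table must also be shared with the receiver.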
CHAPTER 7
CONCLUSION
enhance its practicality and reliability, potentially revolutionizing data security and
storage solutions across various sectors.
CHAPTER 8
FUTURE ENHANCEMENT
3. Security Improvements
REFERENCES
1. https://microbialcellfactories.biomedcentral.com/articles/10.1186/s12934-020-01387-0
2. https://www.researchgate.net/publication/282317899_DNA_Based_Steganography_Survey_and_Analysis_for_Parameters_Optimization
3. https://thesai.org/Downloads/Volume11No1/Paper_79-A_Survey_on_Cloud_Data_Security.pdf
4. https://link.springer.com/article/10.1007/s11042-018-6951-z
5. https://builtin.com/cybersecurity/steganography
APPENDIX
Source code:
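import random
import tkinter as tk
from tkinter import ttk

# Encryption functions
dna_key = {
    "A": "00",
    "C": "01",
    "G": "10",
    "T": "11",
}
# Reverse lookup (2-bit string -> nucleotide) used when building DNA strings.
bin_key = {bits: nt for nt, bits in dna_key.items()}


def randomDna(len_dna):
    # Assumption: the key strand is drawn uniformly at random from A, C, G, T.
    input_dna = "".join(random.choice("ACGT") for _ in range(len_dna))
    return input_dna


def msg_to_bin(message):
    # Each character becomes its 8-bit code (assumes code points below 256).
    message_ascii_bin_list = []
    for char in message:
        message_ascii = ord(char)
        message_ascii_bin_temp = format(message_ascii, "08b")
        message_ascii_bin_list.append(message_ascii_bin_temp)
    message_ascii_bin_string = "".join(message_ascii_bin_list)
    return message_ascii_bin_string


def dna_to_bin(dna):
    # Translate each nucleotide to its 2-bit code using dna_key.
    dna_ascii_bin_list = []
    for nt in dna:
        dna_ascii_bin_list.append(dna_key[nt])
    dna_ascii_bin_string = "".join(dna_ascii_bin_list)
    return dna_ascii_bin_string


def xor_bits(bits_a, bits_b):
    # Assumed helper: bitwise XOR of two equal-length bit strings.
    return "".join("1" if a != b else "0" for a, b in zip(bits_a, bits_b))


def encode_message(message):
    message_bin = msg_to_bin(message)
    len_message_bin = len(message_bin)
    # Assumption: one key nucleotide is generated per two message bits.
    len_random_dna = len_message_bin // 2
    random_dna = randomDna(len_random_dna)
    dna_bin = dna_to_bin(random_dna)
    # Assumption: the message bits are XORed with the key bits.
    encoded_bin = xor_bits(message_bin, dna_bin)
    encoded_dna_list = []
    for i in range(0, len(encoded_bin), 2):
        nt = encoded_bin[i:i + 2]
        encoded_dna_list.append(bin_key[nt])
    encoded_dna = "".join(encoded_dna_list)
    # Assumption: the output file name is a placeholder.
    with open("encoded_dna.txt", "w") as file:
        file.write(encoded_dna)
    return encoded_dna, dna_bin


def decode_message(encoded_dna, dna_bin):
    # Assumption: the function name and signature are reconstructed.
    encoded_dna_bin_list = []
    for nt in encoded_dna:
        encoded_dna_bin_list.append(dna_key[nt])
    encoded_dna_bin = "".join(encoded_dna_bin_list)
    # Assumption: XOR with the same key bits reverses the encoding.
    decoded_bin = xor_bits(encoded_dna_bin, dna_bin)
    decoded_message_list = []
    for i in range(0, len(decoded_bin), 8):
        byte = decoded_bin[i:i + 8]
        decoded_message_list.append(chr(int(byte, 2)))
    decoded_message = "".join(decoded_message_list).rstrip('\x00')
    return decoded_message


# Tkinter GUI
def show_encoded_message():
    global encoded_dna, correct_encoded_dna, dna_bin
    user_input = entry.get()
    if not user_input:
        return
    encoded_dna, dna_bin = encode_message(user_input)
    correct_encoded_dna = encoded_dna
    # Assumption: results are displayed in a label widget.
    label_result.config(text=f"Encoded DNA: {encoded_dna}")


def show_decoded_message():
    user_encoded_dna = entry_encoded_dna.get()
    if user_encoded_dna:
        if user_encoded_dna == correct_encoded_dna:
            try:
                decoded = decode_message(user_encoded_dna, dna_bin)
                label_result.config(text=f"Decoded message: {decoded}")
            except Exception as e:
                label_result.config(text=f"Decoding failed: {e}")
        else:
            label_result.config(text="Incorrect DNA sequence.")
    else:
        label_result.config(text="Please enter an encoded DNA sequence.")


root = tk.Tk()
root.title("DNA Encoder/Decoder")
window_width = 400
window_height = 400
screen_width = root.winfo_screenwidth()
screen_height = root.winfo_screenheight()
# Assumption: the window is centered on the screen.
position_top = (screen_height - window_height) // 2
position_right = (screen_width - window_width) // 2
root.geometry(f'{window_width}x{window_height}+{position_right}+{position_top}')
root.configure(bg='#BFEFFF')
# Apply a style
style = ttk.Style(root)
style.theme_use('clam')

# Assumption: widget types and captions are reconstructed from the variable names.
label = ttk.Label(root, text="Enter message:")
label.pack(pady=10)
entry = ttk.Entry(root, width=40)
entry.pack(pady=10)
button_encode = ttk.Button(root, text="Encode", command=show_encoded_message)
button_encode.pack(pady=10)
label_encoded_dna = ttk.Label(root, text="Enter encoded DNA:")
label_encoded_dna.pack(pady=10)
entry_encoded_dna = ttk.Entry(root, width=40)
entry_encoded_dna.pack(pady=10)
button_decode = ttk.Button(root, text="Decode", command=show_decoded_message)
button_decode.pack(pady=10)
label_result = ttk.Label(root, text="", wraplength=360)
label_result.pack(pady=10)

encoded_dna = ""
correct_encoded_dna = ""
dna_bin = ""
root.mainloop()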