Thisisit

DNA-based steganography is a novel method of hiding information within DNA sequences, leveraging its high information density and stability for secure communication and data storage. The document outlines existing systems, proposes enhancements, and details the implementation process, including encoding, synthesis, and retrieval of data. Future advancements aim to improve encoding algorithms, increase storage capacity, and enhance security through cryptographic techniques.

Uploaded by

Adithi B

DNA Based Steganography

CHAPTER 1

INTRODUCTION
In the digital age, the need for secure communication and data protection has led to the
development of numerous steganographic techniques. Steganography, the art of hiding
information within other non-suspicious data, has traditionally been applied to digital media
such as images, audio, and video files. However, with the rapid advancements in
biotechnology and the expanding capabilities of DNA synthesis and sequencing, a novel and
intriguing method has emerged: DNA-based steganography.
DNA (deoxyribonucleic acid) is the fundamental molecule that carries the genetic
instructions for all known living organisms and many viruses. Its unique structure, composed
of four nucleotide bases—adenine (A), cytosine (C), guanine (G), and thymine (T)—forms
the basis for its ability to store vast amounts of information in a highly compact and efficient
manner. Each sequence of nucleotides encodes specific genetic information, which can be
read and interpreted by biological systems. This intrinsic information density and complexity
make DNA an ideal medium for steganographic purposes.

1.1 Existing System

Existing systems in DNA-based steganography involve several methodologies and
technologies that enable the encoding, synthesis, storage, and decoding of hidden messages
within DNA sequences. These systems leverage both biological and computational
techniques to achieve secure and efficient data hiding. Their main limitations are listed
below.

Drawbacks of Existing System

• Cost and Accessibility
• Error Rates
• Ethical and Security Issues

Dept. of ECE, AMCEC 2023-24 1


1.2 Proposed System

The purpose of a DNA-based steganography system is to provide a secure and efficient
method for embedding hidden messages within DNA sequences. This approach
leverages the unique properties of DNA, such as its high information density, stability,
and biocompatibility, to achieve various goals related to data security, storage, and
communication. The primary purposes and objectives of a DNA-based
steganography system are:

1. Secure Communication

2. Data Storage

3. Data Authentication and Verification

4. Biological Computing and Data Integration

5. Enhancing Data Security

6. Research and Development

1.3 Modules:

Encryption Module:

In this module, the user enters a message and encrypts it into a secured DNA
sequence.

Decryption Module:

In this module, the user enters the secured DNA sequence and decrypts it back to the
original message.

1.4 Objectives

The primary aim of this project is to encode a text message into a secured DNA sequence
and to decode that sequence back into the original message. The secondary aim is to
develop a desktop application with a Tkinter user interface that allows users to carry out
the encryption and decryption.

1.5 Scope of the Project

The scope of DNA-based steganography encompasses the development, implementation,
and application of a novel method for secure data storage and transmission using DNA
sequences. This technique leverages DNA's inherent high information density and
stability to encode hidden messages within biological or synthetic DNA, offering a
unique solution for data security. The project involves creating robust algorithms for
encoding binary data into DNA sequences and decoding it back, ensuring accuracy and
efficiency. It includes synthesizing DNA strands with embedded information, integrating
error-correction and encryption mechanisms to maintain data integrity and
confidentiality. The scope extends to exploring various storage and transmission methods,
such as synthetic DNA chips and genomic embedding in living organisms. Additionally,
the project aims to demonstrate practical applications like secure communication
channels, digital watermarking, and long-term archival of sensitive information. Key
deliverables include technical documentation, prototype DNA sequences, software tools
for encoding and decoding, and proof-of-concept applications. The project will be
conducted within specific boundaries, including technical constraints, ethical
considerations, and budget limitations. Future directions may involve scaling the system
for larger datasets, reducing costs, and integrating with emerging technologies. By
addressing these aspects, the project seeks to advance the field of data security and offer
innovative solutions for storing and transmitting information covertly and securely.

CHAPTER 2

LITERATURE SURVEY
As access to different types of data through the Internet becomes more popular, more
common, and faster, and because the Internet is an insecure channel, sending data securely
becomes more challenging. Accordingly, sensitive information or secret messages
must be protected so that only the authorized receiver can retrieve them. There are several
approaches to information security, such as cryptography and steganography. Cryptography
is the study of converting information from its comprehensible form into an
incomprehensible form. Steganography is the art or science of hiding information such as
text, audio, or video in another cover medium such as text, audio, or video, so that the
existence of the data is not obvious [1]. Digital watermarking is related to copyright
protection and can be thought of as an application of steganography. It marks the
information about the carrier as a watermark [2]. Recently, biotechnology has been applied
in many areas of human life. Using DNA as a carrier instead of other cover media (text,
audio, video, etc.) to hide secret messages has been proposed as a more secure alternative.
Other cover media may be distorted in ways that humans can detect, and their embedding
capacity is low, so they cannot hide a large amount of data. Because the capacity of DNA is
very high, DNA steganography is used to overcome the challenge of embedding capacity [3].
DNA is structured as a double helix, as shown in Fig. 1, and is made up of four nucleotide
bases: adenine (A), thymine (T), cytosine (C), and guanine (G). Hence, DNA can be viewed
as a sequence of bases such as AAGTCGATCGATCATCGAT, etc.

CHAPTER 3

SYSTEM REQUIREMENTS

3.1 Hardware Requirements:

OS      Windows 10
RAM     Minimum of 4 GB

3.2 Software Requirements:

Basic Text-Editor     PyCharm
Tkinter Framework     UI Development

CHAPTER 4

SYSTEM DESIGN
4.1 BLOCK DIAGRAM

1. Encryption (Fig 4.1)
2. Decryption (Fig 4.2)

Fig 4.1 Block diagram of encryption

Fig 4.2 Block diagram of decryption

Fig 4.3 Block diagram

CHAPTER 5

IMPLEMENTATIONS

Implementing DNA-based steganography involves several key steps, from encoding data into
DNA sequences to synthesizing and retrieving the DNA. Below is a detailed guide to the
implementation process.

Step-by-Step Implementation
1. Data Encoding

1. Binary Conversion: Convert the data to be hidden into a binary format.
o Example: "Hello" → ASCII values → Binary: 01001000 01100101 01101100
01101100 01101111

2. Mapping Binary to DNA: Map the binary data to DNA nucleotides (A, T, C, G).
o A common mapping scheme:
• 00 → A
• 01 → T
• 10 → C
• 11 → G
o Example: 01001000 → TACA

3. Error Correction Encoding: Incorporate error-correction codes (e.g., Reed-Solomon)
to ensure data integrity.
o Add redundant bits for error correction.

4. Concatenation and Padding: Combine the encoded segments and add necessary
padding for sequence uniformity.
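
The first two encoding steps above can be sketched in a few lines of Python. This is only an illustration of the mapping listed in this section (00 → A, 01 → T, 10 → C, 11 → G), not the project's final code; the function names text_to_bits and bits_to_dna are hypothetical, and the input is assumed to be plain ASCII text.

```python
# Mapping taken from the scheme listed above: 00->A, 01->T, 10->C, 11->G.
BITS_TO_BASE = {"00": "A", "01": "T", "10": "C", "11": "G"}


def text_to_bits(message: str) -> str:
    """Return the 8-bit binary form of every character (ASCII input assumed)."""
    return "".join(f"{byte:08b}" for byte in message.encode("ascii"))


def bits_to_dna(bits: str) -> str:
    """Map each pair of bits to one nucleotide; the bit count must be even."""
    if len(bits) % 2:
        raise ValueError("bit string length must be even")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


if __name__ == "__main__":
    bits = text_to_bits("H")
    print(bits)               # 01001000
    print(bits_to_dna(bits))  # TACA
```

Because every byte is exactly eight bits, each character always becomes four nucleotides under this mapping.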

2. DNA Synthesis

1. Sequence Design: Design the DNA sequence with the encoded data.
o Ensure sequences are biologically viable and do not form unintended secondary
structures.

2. Synthesis: Use DNA synthesis techniques to create physical DNA strands.
o Methods: Phosphoramidite synthesis, enzymatic synthesis.
o Providers: Companies such as Twist Bioscience and IDT offer custom DNA synthesis
services.
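
Before a sequence is sent for synthesis, simple screening checks can flag designs that are likely to cause trouble. The sketch below checks two commonly used indicators, GC content and the longest run of a single repeated base. The thresholds (40 to 60 percent GC, runs of at most 3) are illustrative assumptions, not values taken from this project, and the function names are hypothetical. Predicting secondary structures needs dedicated tools and is not covered here.

```python
def gc_content(dna: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not dna:
        return 0.0
    return (dna.count("G") + dna.count("C")) / len(dna)


def longest_homopolymer(dna: str) -> int:
    """Length of the longest run of one repeated base."""
    longest = run = 0
    previous = ""
    for base in dna:
        run = run + 1 if base == previous else 1
        longest = max(longest, run)
        previous = base
    return longest


def passes_basic_checks(dna: str, gc_min: float = 0.4, gc_max: float = 0.6,
                        max_run: int = 3) -> bool:
    """Apply the two screening rules with the assumed thresholds."""
    return gc_min <= gc_content(dna) <= gc_max and longest_homopolymer(dna) <= max_run


if __name__ == "__main__":
    candidate = "TACATCTTTCGATCGATCGG"  # "Hello" under the 00/01/10/11 mapping
    print(gc_content(candidate), longest_homopolymer(candidate))
    print(passes_basic_checks(candidate))
```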

3. Embedding in Host DNA

1. Vector Preparation: Insert the synthetic DNA into a vector (e.g., plasmid) for
stability and ease of handling.
o Vectors are circular DNA molecules that can replicate within host cells.

2. Transformation: Introduce the vector into a host organism (e.g., bacteria) via
transformation methods (e.g., heat shock, electroporation).
o The host organism replicates the DNA, effectively storing the encoded data.

4. Data Retrieval

1. DNA Extraction: Extract the DNA from the host organism.
o Use standard DNA extraction protocols (e.g., phenol-chloroform extraction,
commercial DNA extraction kits).

2. DNA Sequencing: Sequence the extracted DNA to retrieve the encoded data.
o Methods: Next-Generation Sequencing (NGS), Sanger sequencing.

5. Data Decoding

1. Sequence Analysis: Analyze the sequenced data to identify the encoded segments.
o Use bioinformatics tools to align and identify the correct sequences.

2. Error Correction: Apply error-correction algorithms to fix any sequencing errors.

3. DNA to Binary: Convert the DNA sequences back to binary using the predefined
mapping scheme.
o Example: TACA → 01001000

4. Binary to Original Data: Convert the binary data back to its original form.
o Example: 01001000 → ASCII → "H"

CHAPTER 6

RESULTS

DNA-based steganography leverages the complex structure of DNA to embed secret
information within biological sequences. The results of advancements in this field can be
categorized into several key areas, reflecting the multifaceted benefits and applications of this
technology.

1. Efficiency and Storage Capacity

• Increased Data Density: By encoding data within the four nucleotide bases (A, T, C,
G) of DNA, significantly higher data densities are achieved compared to traditional
storage media. For instance, a single gram of DNA can theoretically store up to 215
petabytes (215 million gigabytes) of data.
• Optimized Encoding Algorithms: Advanced algorithms and novel encoding schemes,
such as Huffman coding and modular arithmetic, have been developed to maximize
storage capacity and efficiency, ensuring minimal redundancy and high data density.
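
As a rough guide to sequence length, four bases can carry at most two bits per nucleotide, so the number of nucleotides needed for a payload follows from simple arithmetic. The helper below is a back-of-the-envelope sketch; the function name nucleotides_needed and the redundancy parameter (extra fraction added for error correction) are assumptions made for illustration.

```python
import math


def nucleotides_needed(num_bytes: int, bits_per_nt: float = 2.0,
                       redundancy: float = 0.0) -> int:
    """Estimate how many nucleotides are needed to store num_bytes.

    bits_per_nt is the coding density (2.0 is the upper limit for four bases);
    redundancy is the assumed fraction of extra bits added for error correction.
    """
    payload_bits = num_bytes * 8
    return math.ceil(payload_bits * (1 + redundancy) / bits_per_nt)


if __name__ == "__main__":
    print(nucleotides_needed(5))                      # "Hello": 40 bits -> 20 nt
    print(nucleotides_needed(1000, redundancy=0.25))  # 1 kB with 25% overhead
```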

2. Robustness and Error Correction

• Enhanced Error Correction: Robust error-correction mechanisms, such as Reed-Solomon
codes and parity checks, have been integrated to mitigate errors arising
from DNA synthesis and sequencing processes. These methods ensure data integrity
over long periods.
• Stability and Longevity: DNA’s inherent stability, especially when stored under
optimal conditions (e.g., in a cool, dry, and dark environment), ensures the longevity
of stored data, potentially lasting thousands of years.
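
To show the idea of redundancy without a full Reed-Solomon implementation, the sketch below uses a much simpler stand-in: the whole sequence is stored three times and each position is recovered by majority vote. This is not Reed-Solomon and only handles substitution errors (not insertions or deletions); the function names and the choice of three copies are assumptions for illustration.

```python
from collections import Counter


def encode_repetition(dna: str, copies: int = 3) -> str:
    """Store the sequence several times back to back."""
    return dna * copies


def decode_repetition(received: str, copies: int = 3) -> str:
    """Recover each position by majority vote across the copies."""
    if len(received) % copies:
        raise ValueError("received length must be a multiple of the copy count")
    size = len(received) // copies
    blocks = [received[i * size:(i + 1) * size] for i in range(copies)]
    # zip(*blocks) yields, for each position, the base seen in every copy.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*blocks))


if __name__ == "__main__":
    stored = encode_repetition("TACA")
    damaged = stored[:5] + "G" + stored[6:]  # one substitution in the second copy
    print(decode_repetition(damaged))
```

The cost is a threefold increase in sequence length, which is why practical systems prefer more efficient codes.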

3. Security and Stealth

• Cryptographic Integration: Combining DNA steganography with cryptographic
techniques (e.g., symmetric and asymmetric encryption) enhances data security,
making it much harder for unauthorized parties to detect or decode the hidden
information.
• Stealth Techniques: Methods such as random insertion, substitution, and the use of
biological markers have been developed to further obscure the presence of encoded
data, with the aim of making it hard to detect even with sophisticated sequencing
technologies.
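
One simple way to combine encryption with DNA encoding is to XOR the message with a random key of the same length before mapping it to nucleotides. The sketch below illustrates that idea under stated assumptions: the key is as long as the message, is shared secretly with the receiver, and is never reused. The function names are hypothetical, and the mapping is the 00 → A, 01 → T, 10 → C, 11 → G scheme from the implementation chapter.

```python
import secrets

BITS_TO_BASE = {"00": "A", "01": "T", "10": "C", "11": "G"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encrypt_to_dna(message: bytes, key: bytes) -> str:
    """XOR the message with the key, then map the result to nucleotides."""
    if len(key) != len(message):
        raise ValueError("key must be as long as the message")
    cipher = bytes(m ^ k for m, k in zip(message, key))
    bits = "".join(f"{byte:08b}" for byte in cipher)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decrypt_from_dna(dna: str, key: bytes) -> bytes:
    """Map nucleotides back to bytes, then XOR with the same key."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    cipher = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    if len(key) != len(cipher):
        raise ValueError("key must be as long as the encoded message")
    return bytes(c ^ k for c, k in zip(cipher, key))


if __name__ == "__main__":
    secret = b"Hello"
    key = secrets.token_bytes(len(secret))  # random key, shared out of band
    dna = encrypt_to_dna(secret, key)
    print(dna)
    print(decrypt_from_dna(dna, key))
```

Without the key, the resulting sequence carries no readable trace of the message, but the key itself then has to be protected and exchanged safely.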

4. Automation and Scalability

• Automated Synthesis and Sequencing: The development of automated systems for
DNA synthesis and sequencing has streamlined the process, reducing costs and
increasing throughput. This automation is crucial for scaling the technology for
widespread use.
• Scalability: Scalable solutions, including high-throughput DNA synthesis platforms
and parallel sequencing techniques, enable the handling of large volumes of data,
making DNA steganography practical for real-world applications.

5. Bioinformatics and Computational Tools

• Specialized Software: Advanced bioinformatics tools and software, together with
techniques such as CRISPR-based editing and machine learning algorithms, facilitate
the design, analysis, and interpretation of DNA-encoded data. These tools optimize
encoding strategies and enhance error correction.
• Machine Learning Applications: Machine learning models are employed to predict
optimal encoding patterns, identify potential errors, and improve overall efficiency
and accuracy in data encoding and retrieval processes.

Fig 6.1 Landing page

Fig 6.2 Encryption of message

Fig 6.3 Decryption of message

CHAPTER 7

CONCLUSION

DNA-based steganography represents a groundbreaking intersection of biotechnology and
information security, offering a novel and highly sophisticated method for data concealment.
As research and development continue to advance in this field, several key enhancements are
anticipated:

1. Improved Encoding and Error-Correction: Developing more advanced algorithms
and robust error-correction mechanisms will ensure efficient and reliable data storage
and retrieval.
2. Increased Storage Capacity: Optimizing codon usage and exploring novel encoding
schemes will maximize the potential of DNA’s storage capacity.
3. Enhanced Security: Integrating cryptographic techniques and developing stealth
methods will make DNA-encoded data more secure and less detectable.
4. Automation and Scalability: Automation and scalable solutions will facilitate
practical applications, making DNA steganography more accessible and efficient.
5. Advanced Bioinformatics: Specialized software tools and machine learning will aid
in optimizing and managing DNA-encoded data.
6. Synthetic Biology Integration: Leveraging synthetic biology and living organisms
for dynamic and custom storage solutions opens new possibilities for data
concealment.
7. Environmental Stability: Innovations to protect DNA from environmental factors
will ensure long-term data preservation.
8. Interdisciplinary Collaboration: Promoting interdisciplinary research and education
will drive innovation and expertise in the field.
9. Ethical and Legal Frameworks: Developing guidelines and regulations will ensure
the responsible use of DNA-based steganography.
10. Commercial Applications: Exploring commercial viability and industry partnerships
will expand the practical use of DNA-based data concealment.

In conclusion, DNA-based steganography holds immense promise for secure,
efficient, and durable data storage. Continued advancements in this field will
enhance its practicality and reliability, potentially revolutionizing data security and
storage solutions across various sectors.

CHAPTER 8

FUTURE ENHANCEMENT

1. Improved Encoding Algorithms

• Advanced Algorithms: Developing more sophisticated algorithms for encoding and
decoding data into DNA sequences can increase the efficiency and security of the
process.
• Error-Correction Mechanisms: Integrating robust error-correction techniques to
handle mutations or sequencing errors, ensuring data integrity over time.

2. Enhanced Storage Capacity

• Optimized Codon Usage: Using optimized codon sequences to maximize storage
capacity and reduce redundancy.
• Novel Encoding Schemes: Exploring new methods of encoding that take advantage
of the full potential of DNA's four-base (A, T, C, G) structure.
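
One encoding idea discussed in the DNA data storage literature is a rotating code, in which each symbol selects one of the three bases that differ from the previous base, so the same base never appears twice in a row. The sketch below is a simplified illustration of that idea, not a scheme used in this project: each byte is written as six base-3 digits, the starting base "A" is an arbitrary assumption, and the function names are hypothetical.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # 3**6 = 729, enough to cover the 256 byte values


def byte_to_trits(value: int) -> list:
    """Write one byte as six base-3 digits, most significant first."""
    trits = []
    for _ in range(TRITS_PER_BYTE):
        value, digit = divmod(value, 3)
        trits.append(digit)
    return trits[::-1]


def encode_rotating(data: bytes, start: str = "A") -> str:
    """Each trit picks one of the three bases that differ from the previous base."""
    previous = start
    out = []
    for byte in data:
        for trit in byte_to_trits(byte):
            choices = [base for base in BASES if base != previous]
            previous = choices[trit]
            out.append(previous)
    return "".join(out)


def decode_rotating(dna: str, start: str = "A") -> bytes:
    """Invert encode_rotating; raises ValueError on a repeated base or bad length."""
    if len(dna) % TRITS_PER_BYTE:
        raise ValueError("sequence length must be a multiple of 6")
    previous = start
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != previous]
        trits.append(choices.index(base))
        previous = base
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    encoded = encode_rotating(b"Hi")
    print(encoded)
    print(decode_rotating(encoded))
```

The trade-off is density: this sketch uses six nucleotides per byte instead of four, in exchange for avoiding runs of a repeated base.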

3. Security Improvements

• Cryptographic Integration: Combining DNA-based steganography with advanced
cryptographic techniques to enhance security and make it resistant to attacks.
• Stealth Techniques: Developing methods to make the presence of encoded data less
detectable, even by advanced sequencing technologies.
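
A basic stealth technique is to scatter the payload bases through a longer cover sequence at positions that only the sender and receiver can reproduce. The sketch below derives the positions from a shared seed using Python's random module. It is an illustration only: random is not cryptographically secure, the receiver is assumed to know both the seed and the payload length, and the function names are hypothetical.

```python
import random


def embed(cover: str, payload: str, seed: int) -> str:
    """Insert payload bases into the cover at positions derived from the seed."""
    total = len(cover) + len(payload)
    positions = set(random.Random(seed).sample(range(total), len(payload)))
    cover_iter, payload_iter = iter(cover), iter(payload)
    # Walk the output positions in order, taking from payload or cover.
    return "".join(
        next(payload_iter) if i in positions else next(cover_iter)
        for i in range(total)
    )


def extract(stego: str, payload_len: int, seed: int) -> str:
    """Regenerate the same positions from the seed and read the payload back."""
    positions = sorted(random.Random(seed).sample(range(len(stego)), payload_len))
    return "".join(stego[i] for i in positions)


if __name__ == "__main__":
    cover = "ACGTACGTACGTACGT"
    stego = embed(cover, "TACA", seed=42)
    print(stego)
    print(extract(stego, 4, seed=42))
```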

REFERENCES
1. https://microbialcellfactories.biomedcentral.com/articles/10.1186/s12934-020-01387-0

2. https://www.researchgate.net/publication/282317899_DNA_Based_Steganography_Survey_and_Analysis_for_Parameters_Optimization

3. https://thesai.org/Downloads/Volume11No1/Paper_79-A_Survey_on_Cloud_Data_Security.pdf

4. https://link.springer.com/article/10.1007/s11042-018-6951-z

5. https://builtin.com/cybersecurity/steganography

APPENDIX

Source code:

import tkinter as tk
from tkinter import ttk, messagebox
import random

# Encryption functions

# Two-bit code for each nucleotide
dna_key = {
    "A": "00",
    "C": "01",
    "G": "10",
    "T": "11",
}

# Inverse lookup: two-bit code back to nucleotide
reverse_dna_key = {v: k for k, v in dna_key.items()}

# Generate random DNA
def randomDna(len_dna):
    nucleotides = ["A", "T", "G", "C"]
    input_dna = "".join([random.choice(nucleotides) for _ in range(len_dna)])
    return input_dna

# Convert message to binary
def msg_to_bin(message):
    message_ascii_bin_list = []
    for char in message:
        message_ascii = ord(char)
        # Convert to 8-bit binary representation (assumes code points below 256)
        message_ascii_bin_temp = bin(message_ascii)[2:].zfill(8)
        message_ascii_bin_list.append(message_ascii_bin_temp)
    message_ascii_bin_string = "".join(message_ascii_bin_list)
    return message_ascii_bin_string

# Convert DNA to binary
def dna_to_bin(dna):
    dna_ascii_bin_list = []
    for nt in dna:
        dna_ascii_bin_list.append(dna_key[nt])
    dna_ascii_bin_string = "".join(dna_ascii_bin_list)
    return dna_ascii_bin_string

# XOR two binary strings (the result is as long as the shorter input)
def xor_bin(bin_str1, bin_str2):
    return ''.join(str(int(b1) ^ int(b2)) for b1, b2 in zip(bin_str1, bin_str2))

def encode_message(message):
    message_bin = msg_to_bin(message)
    len_message_bin = len(message_bin)

    # Ensure we have enough DNA to cover the message length
    len_random_dna = len_message_bin // 2 * 3
    random_dna = randomDna(len_random_dna)
    dna_bin = dna_to_bin(random_dna)

    # The key is the part of the random DNA bits that lines up with the message
    key_bin = dna_bin[:len_message_bin]

    # Perform XOR operation
    encoded_bin = xor_bin(message_bin, key_bin)

    # Convert encoded binary to DNA, two bits per nucleotide
    encoded_dna_list = []
    for i in range(0, len(encoded_bin), 2):
        encoded_dna_list.append(reverse_dna_key[encoded_bin[i:i + 2]])
    encoded_dna = "".join(encoded_dna_list)

    with open("Encoded_DNA.txt", "w") as file:
        file.write(encoded_dna)

    return encoded_dna, key_bin

def decode_message(encoded_dna, dna_bin):
    # Convert the encoded DNA back to binary
    encoded_dna_bin_list = []
    for nt in encoded_dna:
        encoded_dna_bin_list.append(dna_key[nt])
    encoded_dna_bin = "".join(encoded_dna_bin_list)

    # Perform XOR operation
    decoded_bin = xor_bin(encoded_dna_bin[:len(dna_bin)], dna_bin)

    # Convert each 8-bit group back to a character
    decoded_message_list = []
    for i in range(0, len(decoded_bin), 8):
        byte = decoded_bin[i:i + 8]
        decoded_message_list.append(chr(int(byte, 2)))
    decoded_message = "".join(decoded_message_list).rstrip('\x00')
    return decoded_message

# Tkinter GUI

def show_encoded_message():
    user_input = entry.get()
    if not user_input:
        messagebox.showerror("Input Error", "Please enter a message to encode.")
        return
    global encoded_dna, correct_encoded_dna, dna_bin
    encoded_dna, dna_bin = encode_message(user_input)
    correct_encoded_dna = encoded_dna  # Store the correct encoded DNA sequence
    messagebox.showinfo("Encoded DNA", f"Encoded DNA: {encoded_dna}")

def show_decoded_message():
    user_encoded_dna = entry_encoded_dna.get()
    if user_encoded_dna:
        if user_encoded_dna == correct_encoded_dna:
            try:
                decoded_message = decode_message(user_encoded_dna, dna_bin)
                messagebox.showinfo("Decoded Message", f"Decoded Message: {decoded_message}")
            except Exception as e:
                messagebox.showerror("Decode Error", f"An error occurred: {e}")
        else:
            messagebox.showerror("Access Denied", "The entered encoded DNA sequence is incorrect.")
    else:
        messagebox.showerror("Decode Error", "Please enter an encoded DNA sequence to decode.")

# Create the main window
root = tk.Tk()
root.title("DNA Encoder/Decoder")

# Set window size and center it
window_width = 400
window_height = 400
screen_width = root.winfo_screenwidth()
screen_height = root.winfo_screenheight()
position_top = int(screen_height / 2 - window_height / 2)
position_right = int(screen_width / 2 - window_width / 2)
root.geometry(f'{window_width}x{window_height}+{position_right}+{position_top}')
root.configure(bg='#BFEFFF')

# Apply a style
style = ttk.Style(root)
style.theme_use('clam')
style.configure('TButton', font=('Helvetica', 12), padding=10,
                background="#7fb3d5", foreground="#ffffff")
style.configure('TLabel', font=('Helvetica', 12), background="#f0f0f0")
style.configure('TEntry', font=('Helvetica', 12), background="#f0f0f0")

# Create a frame for the content
frame = ttk.Frame(root, padding="20", relief="raised", borderwidth=2)

frame.place(relx=0.5, rely=0.5, anchor='center')

# Create a label for entering a message
label = ttk.Label(frame, text="Enter a message:")
label.pack(pady=10)

# Create an entry widget for user input
entry = ttk.Entry(frame, width=30)
entry.pack(pady=10)

# Create a button to display the encoded input
button_encode = ttk.Button(frame, text="Show Encoded DNA",
                           command=show_encoded_message)
button_encode.pack(pady=10)

# Create a label for entering an encoded DNA sequence
label_encoded_dna = ttk.Label(frame, text="Enter an encoded DNA sequence:")
label_encoded_dna.pack(pady=10)

# Create an entry widget for encoded DNA input
entry_encoded_dna = ttk.Entry(frame, width=30)
entry_encoded_dna.pack(pady=10)

# Create a button to display the decoded input
button_decode = ttk.Button(frame, text="Show Decoded Message",
                           command=show_decoded_message)
button_decode.pack(pady=10)

# Initialize the global encoded_dna, correct_encoded_dna, and dna_bin variables
encoded_dna = ""
correct_encoded_dna = ""
dna_bin = ""

# Run the main loop
root.mainloop()
