DNA Sequencing
(Assignment)
Submitted to: Dr. Hafiz Abdullah Shakir
Submitted by: Nabia Fiaz (MS Roll no. 02)
Fiza Adil (MS Roll no. 04)
Muqadas Sardar (MS Roll no. 05)
Subject: Advanced Analytical techniques
Session: 2022-2024
Table of contents:
DNA sequencing
Purpose of DNA sequencing
General steps in all DNA sequencing techniques
1. Reactions
Base specific reactions
Enzymatic extensions
2. Separation
Slab gel sequencing
CE sequencing
3. Detection
Radioactive
Fluorescence
DNA sequencing methods
Basic DNA sequencing
o Maxam-Gilbert sequencing
o Sanger sequencing
Next generation DNA sequencing
o Hybridization
o Pyrosequencing
o Nanopore sequencing
Maxam-Gilbert sequencing:
o Procedure
o Advantages
o Disadvantages
Sanger sequencing
o History
o Principle
o Material
o Methodology
o Advantages
o Disadvantages
o Automation of sequencing
Next generation sequencing:
1. Hybridization:
o Definition and explanation
o Principle
o Working
o Applications
2. Pyrosequencing
o Definition
o Steps of pyrosequencing
o 454 Approach steps
3. Nanopore sequencing:
o Introduction
o Theory of nanopore sequencing
o Types of nanopore sequencing
Biological membrane systems
Solid state sensor technology
o Applications
Disadvantages of next generation sequencing
References
DNA Sequencing:
DNA sequencing is the technique that determines the exact order of the four nucleotide
bases (adenine, thymine, cytosine, and guanine) that make up a DNA molecule (Mardis,
2017).
The DNA double helix contains four nitrogenous bases that bond to each other as base
pairs: adenine (A) pairs with thymine (T), and cytosine (C) pairs with guanine (G). The
human genome contains about 3 billion of these base pairs, which carry all the information
needed to maintain the human body. Because the pairing is specific, it underlies the processes
of transcription and translation and also provides the basis of DNA sequencing methods
(Mardis, 2017).
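The pairing rule above can be expressed as a simple lookup. The following is a minimal sketch in Python; the function names are chosen here for illustration and do not come from the cited sources.

```python
# Watson-Crick pairing as described above: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base partner of a strand."""
    return "".join(PAIR[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand written in its own 5'->3' direction."""
    return complement(strand)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Knowing one strand therefore fixes the other, which is why sequencing a single strand is sufficient.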
Purpose:
Data obtained from DNA sequences has become essential for research purposes and in
many fields of science such as:
Medical diagnosis
Biotechnology
Forensic biology
Virology
Biological systematics
Scientists can use sequence data to find the genes and regulatory elements present in a
DNA molecule. DNA sequencing is also applied in medicine, for example in the diagnosis and
treatment of diseases. It can also improve food quality and safety, and it supports
sustainable agriculture and plant, animal, and public health through animal breeding and
protection from disease outbreaks (Mardis, 2017).
The speed attained with modern DNA sequencing technology has been instrumental in
sequencing the complete genomes of numerous species, including the human genome and the
genomes of many other animal, plant, and microbial species. DNA sequencing techniques play
key roles in many applied fields, including archaeology, anthropology, genetics,
biotechnology, molecular biology, and forensic science, and they continue to drive new
discoveries (França, Carrilho, & Kist, 2002).
[Figure: Fields that apply DNA sequencing: biotechnology, genetics, forensic science, medical
diagnosis, and molecular biology.]
2. Separation:
The separation of DNA fragments is done by electrophoresis, a process used to separate
molecules such as nucleic acids and proteins on the basis of charge and size using a
specific gel.
a) SLAB-GEL Sequencing:
The original DNA sequencing techniques used a standard slab polyacrylamide gel
electrophoresis (PAGE) instrument to separate and visualize the end products of the DNA
sequencing reactions. PAGE is only the separation step, not the whole experiment.
Electrophoresis normally separates molecules that differ in charge, but all DNA molecules
carry the same negative charge, so DNA fragments are separated on the basis of their size
(length).
The polyacrylamide gel forms a large number of “pores” of different sizes. Large DNA
molecules become tangled in these pores, whereas smaller fragments move quickly through them
towards the bottom of the gel; the larger fragments remain near the top. The fragments are
run in four different lanes, and radioactively labeled DNA fragments are used. These gels are
mostly composed of 6% acrylamide in 1× TBE (tris-borate-EDTA) buffer.
b) CE Sequencing
Capillary electrophoresis (CE) sequencing is the more advanced technique. It was
developed after slab-gel sequencing.
The CE system offers increased speed, ease of use, and better accuracy. The sequence of all
DNA fragments is obtained in a single lane. This system also uses fluorescent dyes
incorporated into the DNA fragments instead of radioactively labelled fragments.
CE separations offer several advantages over slab-gel-based sequencing systems. First,
capillary systems have dynamic coatings and can be used several times. Second, there is no
need to prepare and pour a gel, which is time consuming and difficult to do without
introducing bubbles. Third, the flexible capillaries are easily interfaced with a microtiter
plate. Finally, multi-capillary systems significantly increase the throughput of a sequencing
system (Nunnally & Brain, 2005).
3. Detection:
a) Radioactive:
Previously, detection was achieved with radioactive labels such as 32P or 35S. Radioactive
tags were highly effective for detecting DNA sequencing reaction products. A labeled reagent,
such as a primer or nucleotide, does not differ in length, size, or shape from the unlabeled
one, so DNA polymerases show no preference between them and no reduction in fidelity.
However, radioactive elements are hazardous to handle, and radioactive gels must be exposed
to X-ray film. This step takes at least 24 to 36 h to develop the bands and to collect
500 bases of DNA sequence (Maxam & Gilbert, 1977).
b) Fluorescence:
The fluorescence sequencing system was first developed in Hood’s laboratory in the
mid-1980s. The system used four different dyes, each with a different emission maximum:
Fluorescein isothiocyanate
NBD-amino-hexanoic acid
Tetramethyl-rhodamine isothiocyanate
Texas Red
Fluorescence-based systems have displaced radioactive labels from almost all types of DNA
sequencing. The reasons are increased safety, easier disposal, multiplexing ability, and
real-time data acquisition. The most important of these is the multiplexing ability of the
fluorescent dyes.
Only one well with four dyes is used instead of four lanes as in a PAGE gel. In addition,
data are obtained in real time, without X-ray film or off-line data collection (Rosenthal
et al., 1990).
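Because each dye has its own emission maximum, a single lane can be read by asking which detection channel is brightest at each band. The sketch below illustrates that idea only; the channel names, the channel-to-base assignment, and the intensity values are hypothetical and are not taken from the cited sources.

```python
# Hypothetical mapping of detection channel to base; the real dye-to-base
# assignment depends on the chemistry and is not given in the text.
CHANNEL_TO_BASE = {"dye_1": "A", "dye_2": "C", "dye_3": "G", "dye_4": "T"}


def call_bases(bands):
    """Call one base per band from the brightest of the four channels.

    `bands` holds one dict of channel intensities per band, ordered from
    the shortest fragment to the longest.
    """
    calls = []
    for intensities in bands:
        strongest = max(intensities, key=intensities.get)
        calls.append(CHANNEL_TO_BASE[strongest])
    return "".join(calls)


# Made-up intensities for three consecutive bands in one lane.
example_bands = [
    {"dye_1": 0.91, "dye_2": 0.04, "dye_3": 0.02, "dye_4": 0.05},
    {"dye_1": 0.03, "dye_2": 0.02, "dye_3": 0.88, "dye_4": 0.06},
    {"dye_1": 0.05, "dye_2": 0.07, "dye_3": 0.03, "dye_4": 0.93},
]
print(call_bases(example_bands))  # AGT
```

Real base callers also correct for spectral overlap between dyes and for mobility differences, which this sketch ignores.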
DNA Sequencing Methods:
Basically, there are three main types of DNA sequencing:
1. Basic DNA sequencing
2. Advanced DNA sequencing
3. Next Generation DNA sequencing
[Figure: Steps of Maxam-Gilbert sequencing: (1) preparation of radioactively end-labeled ss-DNA
fragment; (2) modification of nitrogenous bases of that particular DNA fragment; (3) cleavage of
DNA fragment at modified bases; (4) electrophoresis and reading of sequence.]
Procedure:
This method uses a double-stranded DNA molecule as the starting material, which then
undergoes a series of modifications. The overall procedure requires the following steps:
1) Preparation of radioactively 5’-end labelled ss-DNA
2) Modification of nitrogenous bases
3) Cleavage at modified base positions
4) Separation of cleaved DNA fragments by electrophoresis plus autoradiography
5) Analysis/reading of DNA sequence
1) Preparation of radioactively 5’-end labeled ss-DNA:
This method requires radioactive labelling of the 5’-end of the DNA with a radioactive
element, i.e. 32P. For this purpose, multiple reactions are performed on the double-stranded
DNA molecule.
First, the ds-DNA is dephosphorylated to remove the naturally incorporated phosphate at
the 5’ end. This reaction is catalyzed by the enzyme alkaline phosphatase.
Second, the radioactive phosphate from (γ-32P)-ATP is added to the 5’ end of both DNA
strands (from which the normal phosphates were removed). This reaction occurs in the
presence of the enzyme polynucleotide kinase taken from E. coli (Verma, Kulshrestha, & Puri,
2017).
The two strands of DNA are then separated by treating them with dimethyl sulphoxide and
heating at 90 °C.
These strands are further subjected to gel electrophoresis for separation and isolation. The
separation is based on heavier and lighter strands, as purines (guanine and adenine) are
heavier than pyrimidines (thymine and cytosine).
The single strand of DNA labelled with γ-32P is divided into separate samples, ready to
be treated with chemical reagents.
Figure 7: Synthesis of single-stranded DNA radioactively labeled at the 5' end with γ-32P
Figure 10: Separation of DNA fragments and reading of sequenced DNA using
polyacrylamide gel electrophoresis
5) Analysis/ Reading of DNA sequence:
The DNA sequence is read directly from the gel. First, the bands are visualized by using X-
ray film autoradiography.
Advantages of Maxam-Gilbert Method:
The advantages of the Maxam-Gilbert method of DNA sequencing are as follows:
It can be used to sequence both single- and double-stranded DNA molecules.
DNA replication is not required, so there is no need for DNA polymerases and dNTPs. There is
also no need for premature termination of DNA synthesis, so problems with polymerase-based
synthesis of DNA do not arise.
Stretches of DNA that cannot be sequenced with the enzymatic method can be sequenced.
Purified DNA can be read directly, without synthesizing a complementary strand.
Homopolymeric DNA runs are sequenced as efficiently as heterogeneous DNA sequences.
It is also used for footprinting, i.e. the study of DNA interactions with proteins.
It can be used to analyze nucleic acid structure and epigenetic modifications of DNA.
Figure 11: Schematic diagram showing Maxam-Gilbert Chemical Degradation Method of DNA
sequencing
Figure 12: A dNTP (with a free 3'-OH) is added to the DNA strand that is being synthesized.
However, strand synthesis halts when a ddNTP is inserted because there is no 3'-OH to form
a phosphodiester bond with the subsequent dNTP (Verma, Kulshrestha, & Puri, 2017).
Material
Several components work together in this method to carry out the sequencing:
ssDNA template to be sequenced.
Primers.
Taq Polymerase for template strand amplification.
Buffer.
Deoxynucleotides (dNTPs).
Fluorescently labeled dideoxynucleotides (ddNTPs) (Verma, Kulshrestha, & Puri, 2017).
Primer designing: Primers can be designed manually or with software (Primer3, Primer-BLAST,
etc.). Primer design requires attention to several specific factors.
1. Primers should be between 15 and 28 bases in length.
2. The recommended base composition is 50-60% (G+C).
3. Primers should terminate at the 3' end in a G or C, or in CG or GC; this prevents end
"breathing" and improves priming efficiency.
4. Melting temperatures between 55 and 80 °C are preferred (80 °C is on the high side,
although many of the gaps were found to be GC-rich).
5. The 3' ends of primers should not be complementary to each other, as this would favor the
synthesis of primer dimers rather than the intended product.
6. Primers without self-complementarity (the ability to form secondary structures such as
hairpins) are preferable (Verma, Kulshrestha, & Puri, 2017).
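The design rules above lend themselves to a quick automated check. The following is a minimal sketch, not a substitute for Primer3 or similar tools. It makes two assumptions that are not in the text: the melting temperature is estimated with the Wallace rule of thumb (2 °C per A/T plus 4 °C per G/C), and primer-dimer risk is approximated by testing whether the last four 3' bases can pair with any stretch of the same primer. Hairpin prediction (rule 6) is not covered.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C (Wallace rule, an assumption here)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    return "".join(PAIR[base] for base in reversed(seq.upper()))


def three_prime_self_pairing(primer: str, tail_len: int = 4) -> bool:
    """True if the 3' tail could anneal to another copy of the same primer."""
    primer = primer.upper()
    return reverse_complement(primer[-tail_len:]) in primer


def check_primer(primer: str) -> dict:
    """Apply rules 1-5 from the list above and report each result."""
    primer = primer.upper()
    return {
        "length_ok": 15 <= len(primer) <= 28,
        "gc_ok": 0.50 <= gc_fraction(primer) <= 0.60,
        "gc_clamp": primer[-1] in "GC",
        "tm_ok": 55 <= wallace_tm(primer) <= 80,
        "no_3prime_dimer": not three_prime_self_pairing(primer),
    }


# Hypothetical 20-base primer used only to exercise the checks.
print(check_primer("AGCTGACCTGAAGCTCATGC"))
```

The example primer has 55% G+C and an estimated Tm of 62 °C, so it passes all five checks.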
Methodology
Freshly extracted DNA can be sequenced directly without any additional processing. In
PCR, the denaturation, annealing, and extension steps are repeated for 25-30 cycles.
The dsDNA is denatured, i.e. separated into two single-stranded DNA (ssDNA) molecules.
To obtain a continuous series of synthesis products that represent every possible chain
termination position, four reactions containing template, polymerase, all four dNTPs (one
radioactively labeled), and primer are set up.
A primer is annealed to the single-stranded template to provide the starting point for
synthesis.
A polymerase is added to a mixture that already contains all four dNTPs but only one of the
four ddNTPs.
Thus one of the four ddNTPs is present in each reaction, and its concentration determines
the probability of its incorporation.
Because the incorporated ddNTP lacks a 3' OH group, a phosphodiester bond cannot be formed
between the C3' of its sugar moiety and the C5' phosphate of the subsequent dNTP, so chain
elongation terminates.
In each of the four reactions, many strands are terminated, and their lengths vary
widely.
Because a single ddNTP species is present in each reaction, fragments of varying lengths
are produced, each ending at a position in the template sequence that corresponds to that
nucleotide.
Single-nucleotide resolution is achieved by running the four reactions separately on a large
denaturing polyacrylamide gel. After the DNA in each of the four reaction mixes is denatured
into single strands, it is separated by electrophoresis in parallel lanes on a
high-resolution polyacrylamide gel.
The sequence of the analyzed template can be read directly from the band pattern across the
four lanes. To read it, find the first band at the bottom of the gel, then continue to the
next longer fragment, and so on (Sanger, Nicklen, & Coulson, 1977; Valencia et al., 2013;
Bajpai, 2014; Verma, Kulshrestha, & Puri, 2017).
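The logic of the four reactions and of reading the gel from the bottom upwards can be illustrated with a short simulation. This is a simplified sketch under stated assumptions: every possible termination position is enumerated instead of being sampled at random, fragment lengths are counted from the end of the primer, and the template and function names are hypothetical.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesized_strand(template_3to5: str) -> str:
    """Strand the polymerase builds (5'->3') on a template written 3'->5'."""
    return "".join(PAIR[base] for base in template_3to5.upper())


def run_four_reactions(template_3to5: str) -> dict:
    """For each ddNTP reaction, list the lengths of the terminated fragments.

    A fragment of length n appears in lane X when the n-th base of the new
    strand is X, because ddXTP can replace dXTP at exactly those positions.
    """
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(synthesized_strand(template_3to5), start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes: dict) -> str:
    """Read bands from the shortest fragment (bottom of the gel) upwards."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = run_four_reactions("TACGGATC")
print(lanes)            # {'A': [1, 7], 'C': [4, 5], 'G': [3, 8], 'T': [2, 6]}
print(read_gel(lanes))  # ATGCCTAG
```

The sequence read from the gel is that of the newly synthesized strand, i.e. the complement of the template.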
Figure 13: The Sanger sequencing method in 6 steps (adapted from Gauthier, 2008).
Advantages
The following are the characteristic pros of this method:
High single-pass accuracy.
Good ability to call repeats.
Long read lengths.
Relatively simple workflows and data analysis.
Easily accessible software (Verma, Kulshrestha, & Puri, 2017).
Disadvantages
The following are the cons of this method:
Low throughput.
High cost of Sanger sample preparation.
Time-consuming.
Not suitable for sequencing large genomes.
High infrastructure cost (Verma, Kulshrestha, & Puri, 2017).
Automation of sequencing
After Sanger sequencing was first introduced, further changes were made to automate the
process. Applied Biosystems (now part of Life Technologies) developed the first
generation of "automated sequencers" in 1986, which automated the gel electrophoresis
stages, the detection of the fluorescent DNA band patterns, and the analysis of bands,
marking the end of the period of manual sequencing. Dye-terminator sequencing is an example
of such progress. The primary benefit of this technology is the increased accuracy and speed
it provides. The ABI model 370, designed by Leroy Hood and Mike Hunkapiller and released by
Applied Biosystems, Inc. (ABI) in 1987, was the first automated DNA sequencing machine and
could produce read lengths of up to 350 bp per lane. In 1995, ABI also released the ABI
PRISM 310 Genetic Analyzer, which was designed to make pouring gels, installing the
instrument, and loading samples easier and more streamlined. The capillary sequencer was
invented by Swerdlow and Gesteland; it uses polyacrylamide gel-filled capillaries rather than
slab gels. Sequencers currently on the market can accommodate 4, 16, 48, 96, or 384
capillaries. Read length and sequencing speed both improved as the number of capillaries
increased (França, Carrilho, & Kist, 2002).
Modern automated DNA sequencing makes use of capillary electrophoresis, which analyzes 8-96
sequencing reactions simultaneously, and the newest generation of fluorescent dyes, which
produce strong and distinct fluorescent emissions. Key advantages of the first automated DNA
sequencing implementations over the original Sanger method included the elimination of
radioactivity, "one-lane" sequencing, "one-tube" reactions, automated base calling, and the
replacement of slab gel technology with multi-capillary electrophoresis with automatic,
electrokinetic-injection lane loading (Valencia et al., 2013).
Figure 14: Processing protocols of Maxam-Gilbert and Sanger sequencing (Verma, Kulshrestha, &
Puri, 2017).
General applications:
It was thus feasible to assemble larger contiguous sequence information from the
overlapping sequences of the probe hybridization spots. Sequencing by hybridization has been
largely relegated to technologies that use specific probes to interrogate sequences, such as
diagnostic applications for identifying disease-related single nucleotide polymorphisms
(SNPs) in specific genes or identifying gross chromosome abnormalities.
Detection of mutations:
The subset of probes complementary to the reference sequence is selected from a stock of
all possible probes of a given length. These probes are then hybridized with the control and
patient samples. Where a mutation is present, there is a mismatch between the probe and the
DNA sequence; hence, the probes do not bind to the test sample. Because the probes overlap, a
low percentage of positive probes is seen at the mutation site, indicating that the test
sample sequence differs from the control DNA.
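The drop in positive probes at a mutation site can be illustrated with a small model. This sketch makes simplifying assumptions: each probe is represented by the target sequence it recognizes (not by its complement), hybridization is treated as exact matching, and the sequences, probe length, and function names are hypothetical.

```python
def kmers(seq: str, k: int) -> list:
    """All overlapping stretches of length k, in order along the sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def position_support(reference: str, sample: str, k: int = 4) -> list:
    """Fraction of reference-derived probes covering each position that bind.

    A probe "binds" when its target sequence occurs anywhere in the sample.
    Positions where few overlapping probes bind are candidate mutation sites.
    """
    sample_spectrum = set(kmers(sample, k))
    hits = [probe in sample_spectrum for probe in kmers(reference, k)]
    support = []
    for pos in range(len(reference)):
        first = max(0, pos - k + 1)           # first probe covering pos
        last = min(pos, len(reference) - k)   # last probe covering pos
        covering = hits[first:last + 1]
        support.append(sum(covering) / len(covering))
    return support


reference = "ATGCGTACCTGA"
patient = "ATGCGTTCCTGA"  # single substitution at index 6 (A -> T)

support = position_support(reference, patient, k=4)
print(support)
print(support.index(min(support)))  # 6
```

All four probes that cover index 6 fail to bind, so the support there falls to zero, while positions far from the substitution keep full support.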
Pyrosequencing:
Definition:
Pyrosequencing is a method of DNA sequencing based on the "sequencing by synthesis"
principle, in which sequencing is performed by detecting each nucleotide as it is
incorporated by a DNA polymerase.
It was the first next generation technique to be introduced commercially, in 2004. This
method sequences short stretches of DNA. In pyrosequencing, whenever a nucleotide is
incorporated by DNA polymerase, a pyrophosphate is released. This pyrophosphate initiates a
series of downstream reactions that ultimately produce light through the firefly enzyme
luciferase. The more nucleotides are incorporated, the more light is produced (up to the
detection limit of the detector).
Steps of pyrosequencing:
1. A sequencing primer is hybridized to a single-stranded, PCR-amplified DNA template,
which is then incubated with the enzymes DNA polymerase, ATP sulfurylase, luciferase,
and apyrase and the substrates adenosine 5’-phosphosulfate and luciferin.
2. One of the four deoxynucleotide triphosphates (dNTPs) is added to initiate the next step.
DNA polymerase incorporates the dNTP into the growing strand if it is complementary to
the base in the template. Each incorporation releases pyrophosphate (PPi) in an amount
equal to the number of nucleotides incorporated.
3. The enzyme ATP sulfurylase converts PPi into ATP in the presence of the substrate
adenosine 5’-phosphosulfate. This ATP drives the luciferase-mediated conversion of
luciferin into oxyluciferin, which produces visible light in proportion to the quantity of
ATP. A charge-coupled device (CCD) camera detects the light produced, and a computer
program analyzes the detected amount of light. Thus, the more nucleotides are
incorporated, the stronger the light signal.
4. Degradation of unincorporated nucleotides is essential for continuing the sequencing
process. For this purpose, the apyrase added in the first step removes the remaining
dNTPs from the solution.
5. When the next nucleotide is added, a new cycle begins.
This sequencing technique has been advanced to whole genome sequencing and is considered one
of the fastest sequencing methods (Bajpai, 2014).
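The cycle described in the steps above can be sketched as a loop over nucleotide additions in which the light signal equals the number of bases incorporated. This is an idealized model under stated assumptions: the dispensation order A, C, G, T is hypothetical, the signal is perfectly proportional to incorporation, and apyrase is assumed to clear all leftover nucleotide before the next addition.

```python
def pyrosequence(strand_to_build: str, order: str = "ACGT") -> list:
    """Return (nucleotide, signal) pairs for successive dNTP additions.

    `strand_to_build` is the strand the polymerase synthesizes, i.e. the
    complement of the template. The signal is the number of consecutive
    bases incorporated from one addition (more than 1 for homopolymer runs).
    """
    strand_to_build = strand_to_build.upper()
    if set(strand_to_build) - set(order):
        raise ValueError("sequence contains bases that are never dispensed")
    signals = []
    pos = 0
    step = 0
    while pos < len(strand_to_build):
        nucleotide = order[step % len(order)]
        incorporated = 0
        while pos < len(strand_to_build) and strand_to_build[pos] == nucleotide:
            incorporated += 1
            pos += 1
        signals.append((nucleotide, incorporated))
        step += 1
    return signals


def call_bases(signals: list) -> str:
    """Rebuild the sequence from the light signals."""
    return "".join(nucleotide * count for nucleotide, count in signals)


signals = pyrosequence("GGATTC")
print(signals)
print(call_bases(signals))  # GGATTC
```

In this example the first G addition yields a signal of 2 because two G bases are incorporated in a row, which is how homopolymer runs appear in the light output.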
Nanopore sequencing:
A nanopore is so named because it is a small hole with a diameter on the order of
nanometers. Certain cellular proteins that span the membrane, called transmembrane proteins,
can act as nanopores. Nanopores can also be made by cutting a larger hole (tens of
nanometers wide) in a piece of silicon and then gradually filling it in using electron beam
methods, so that a small hole (the nanopore) is left.