DNA Sequencing
DNA sequencing is the process of determining the sequence of nucleotides (As, Ts, Cs,
and Gs) in a piece of DNA.
In Sanger sequencing, the target DNA is copied many times, making fragments of
different lengths. Fluorescent “chain terminator” nucleotides mark the ends of the
fragments and allow the sequence to be determined.
What is sequencing?
You may have heard of genomes being sequenced. For instance, the human genome was
completed in 2003, after a many-year, international effort. But what does it mean to sequence a
genome, or even a small fragment of DNA?
DNA sequencing is the process of determining the sequence of nucleotide bases (As, Ts, Cs, and
Gs) in a piece of DNA. Today, with the right equipment and materials, sequencing a short piece
of DNA is relatively straightforward.
Sequencing an entire genome (all of an organism’s DNA) remains a complex task. It requires
breaking the DNA of the genome into many smaller pieces, sequencing the pieces, and
assembling the sequences into a single long "consensus." However, thanks to new methods that
have been developed over the past two decades, genome sequencing is now much faster and less
expensive than it was during the Human Genome Project[^1].
In this article, we’ll take a look at methods used for DNA sequencing. We'll focus on one well-
established method, Sanger sequencing, but we'll also discuss new ("next-generation") methods
that have reduced the cost and accelerated the speed of large-scale sequencing.
Sanger sequencing: The chain termination method
Regions of DNA up to about 900 base pairs in length are routinely sequenced using a method
called Sanger sequencing or the chain termination method. Sanger sequencing was developed
by the British biochemist Fred Sanger and his colleagues in 1977.
In the Human Genome Project, Sanger sequencing was used to determine the sequences of many
relatively small fragments of human DNA. (These fragments weren't necessarily 900 bp or
less, but researchers were able to "walk" along each fragment using multiple rounds of Sanger
sequencing.) The fragments were aligned based on overlapping portions to assemble the
sequences of larger regions of DNA and, eventually, entire chromosomes.
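The idea of aligning fragments by their overlapping ends can be illustrated with a toy example. The sketch below is a minimal greedy merge written for illustration only, not the assembly procedure used in the Human Genome Project; the function names `overlap` and `greedy_assemble`, the `min_len` parameter, and the example reads are all hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no overlaps left; remaining pieces stay separate
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads


print(greedy_assemble(["ATGCGT", "CGTACC", "ACCTGA"]))  # ['ATGCGTACCTGA']
```

Real assemblers must also cope with sequencing errors, repeats, and both DNA strands, which this sketch ignores.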
Although genomes are now typically sequenced using other methods that are faster and less
expensive, Sanger sequencing is still in wide use for the sequencing of individual pieces of
DNA, such as fragments used in DNA cloning or generated through polymerase chain
reaction (PCR).
Sanger sequencing involves making many copies of a target DNA region. Its ingredients are
similar to those needed for DNA replication in an organism, or for polymerase chain reaction
(PCR), which copies DNA in vitro. They include:
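A DNA polymerase enzyme
A primer, which is a short piece of single-stranded DNA that binds to the template DNA
and acts as a "starter" for the polymerase
The four regular DNA nucleotides (dATP, dTTP, dCTP, dGTP)
The template DNA to be sequenced
Dideoxy, or chain-terminating, versions of all four nucleotides (ddATP, ddTTP, ddCTP,
ddGTP), each labeled with a different color of dye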
_Image credit: "Whole-genome sequencing: Figure 1," by OpenStax College, Biology (CC BY
4.0)._
Dideoxy nucleotides are similar to regular, or deoxy, nucleotides, but with one key difference:
they lack a hydroxyl group on the 3’ carbon of the sugar ring. In a regular nucleotide, the 3’
hydroxyl group acts as a “hook," allowing a new nucleotide to be added to an existing chain.
Once a dideoxy nucleotide has been added to the chain, there is no hydroxyl available and no
further nucleotides can be added. The chain ends with the dideoxy nucleotide, which is marked
with a particular color of dye depending on the base (A, T, C or G) that it carries.
The mixture is first heated to denature the template DNA (separate the strands), then cooled so
that the primer can bind to the single-stranded template. Once the primer has bound, the
temperature is raised again, allowing DNA polymerase to synthesize new DNA starting from the
primer. DNA polymerase will continue adding nucleotides to the chain until it happens to add a
dideoxy nucleotide instead of a normal one. At that point, no further nucleotides can be added, so
the strand will end with the dideoxy nucleotide.
This process is repeated in a number of cycles. By the time the cycling is complete, it’s virtually
guaranteed that a dideoxy nucleotide will have been incorporated at every single position of the
target DNA in at least one reaction. That is, the tube will contain fragments of different lengths,
ending at each of the nucleotide positions in the original DNA (see figure below). The ends of
the fragments will be labeled with dyes that indicate their final nucleotide.
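To see why many copies make it "virtually guaranteed" that every position ends up with a terminated fragment, here is a minimal simulation sketch. The number of copies (`n_copies`), the per-position termination probability (`p_ddntp`), the example template, and the function name are assumptions chosen for illustration, not values from a real protocol.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def simulate_chain_termination(template, n_copies=5000, p_ddntp=0.05, seed=0):
    """Simulate copying `template` many times with occasional chain termination.

    `template` is written 3'->5', so the new strand grows left to right.
    Returns a set of (fragment_length, terminal_base) pairs, where the
    terminal base stands in for the dye color on the dideoxy nucleotide.
    """
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_copies):
        for length, template_base in enumerate(template, start=1):
            # With small probability the polymerase adds a dideoxy
            # nucleotide, which ends this copy at the current position.
            if rng.random() < p_ddntp:
                fragments.add((length, COMPLEMENT[template_base]))
                break
    return fragments


template = "TACGGCATTAGC"
fragments = simulate_chain_termination(template)
covered = {length for length, _ in fragments}
# Check whether every position of the template has at least one fragment.
print(covered == set(range(1, len(template) + 1)))
```

With thousands of copies, even the last position is expected to be hit many times, so a missing fragment length is extremely unlikely under these assumed settings.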
After the reaction is done, the fragments are run through a long, thin tube containing a gel matrix
in a process called capillary gel electrophoresis. Short fragments move quickly through the
pores of the gel, while long fragments move more slowly. As each fragment crosses the “finish
line” at the end of the tube, it’s illuminated by a laser, allowing the attached dye to be detected.
The smallest fragment (ending just one nucleotide after the primer) crosses the finish line first,
followed by the next-smallest fragment (ending two nucleotides after the primer), and so forth.
Thus, from the colors of dyes registered one after another on the detector, the sequence of the
original piece of DNA can be built up one nucleotide at a time. The data recorded by the detector
consist of a series of peaks in fluorescence intensity, as shown in the chromatogram above. The
DNA sequence is read from the peaks in the chromatogram.
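The read-out step amounts to sorting fragments by length and translating each dye color into a base. The sketch below illustrates that logic; the dye-to-base assignment in `DYE_TO_BASE`, the function name `read_sequence`, and the example detections are hypothetical and chosen only for this example.

```python
# Hypothetical dye-to-base assignment, chosen only for this example.
DYE_TO_BASE = {"green": "A", "red": "T", "blue": "C", "yellow": "G"}


def read_sequence(detections):
    """Read a sequence from (fragment_length, dye_color) pairs.

    Shorter fragments reach the detector first, so sorting by length
    reproduces the order in which the dyes would be registered.
    """
    ordered = sorted(detections, key=lambda d: d[0])
    return "".join(DYE_TO_BASE[dye] for _, dye in ordered)


detections = [(3, "blue"), (1, "green"), (4, "yellow"), (2, "red")]
print(read_sequence(detections))  # ATCG
```

In a real instrument the ordering comes from the physics of electrophoresis rather than an explicit sort, and the base is called from peak intensities rather than a clean color label.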
Sanger sequencing gives high-quality sequence for relatively long stretches of DNA (up to
about 900 base pairs). It's typically used to sequence individual pieces of DNA, such
as bacterial plasmids or DNA copied in PCR.
However, Sanger sequencing is expensive and inefficient for larger-scale projects, such as the
sequencing of an entire genome or metagenome (the “collective genome” of a microbial
community). For tasks such as these, new, large-scale sequencing techniques are faster and less
expensive.
Next-generation sequencing
The name may sound like Star Trek, but that’s really what it’s called! The most recent
DNA sequencing technologies are collectively referred to as next-generation sequencing.
There are a variety of next-generation sequencing techniques that use different technologies.
However, most share a common set of features that distinguish them from Sanger sequencing:
Highly parallel: many sequencing reactions take place at the same time
Micro scale: reactions are tiny and many can be done at once on a chip
Fast: because reactions are done in parallel, results are ready much faster
Conceptually, next-generation sequencing is kind of like running a very large number of tiny
Sanger sequencing reactions in parallel. Thanks to this parallelization and small scale, large
quantities of DNA can be sequenced much more quickly and cheaply with next-generation
methods than with Sanger sequencing. For example, in 2001, the cost of sequencing a human
genome was almost $100 million. In 2015, it was just $1245[^2]!
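Taking the two cost figures quoted above at face value, the size of the drop can be worked out directly. This is a rough back-of-the-envelope sketch: the 2001 cost is rounded up to exactly $100 million for illustration, and the variable names are hypothetical.

```python
cost_2001 = 100_000_000  # "almost $100 million", rounded up for illustration
cost_2015 = 1_245        # 2015 figure quoted in the text

fold_reduction = cost_2001 / cost_2015
print(f"Roughly {fold_reduction:,.0f}-fold cheaper")  # Roughly 80,321-fold cheaper
```

In other words, the cost fell by a factor of about 80,000 in fourteen years.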
Why does fast and inexpensive sequencing matter? The ability to routinely sequence genomes
opens new possibilities for biology research and biomedical applications. For example, low-cost
sequencing is a step towards personalized medicine – that is, medical treatment tailored to an
individual's needs, based on the gene variants in his or her genome.
iristabak
8 years ago
This might be a bit off topic, but I am a chemistry student and I want to get a tattoo of my
father's DNA. Now I know the human DNA contains too many base pairs to fit as a tattoo.
But if a small segment of the sequence is used, how likely is it that that sequence also
belongs to a different person?
thejeremiahbender
8 years ago
Why can't the dye molecule be attached to a regular nucleotide, and then the entire DNA
chain could be read as a single item?
Yuvraj Chaudhry
7 years ago
You cannot read the entire DNA chain as a single item, even if each base
pair were dyed, since you would be getting all four colours at
once due to the small size of the molecule. You use ddNTPs to stop the
synthesis of a strand and get fragments of all possible lengths, which
move in ascending order of length. This lets you know exactly which
base comes at what point.
Manar Al-Masri
7 years ago
How do we know which primer to use if we don’t already know the sequence of the DNA
fragment?
Rida
2 years ago
"For instance, the human ..."
I don't understand this line. Isn't the genome unique to each and every individual? So what
does this mean? Did they create one entire human genome and then assume that the
changes that occur among humans are due to mutations (point mutations?) in this
genome?
++§ Αλεκσανδαρ
2 years ago
Hey, Rida.
You're correct: we all have unique DNA in our cells. Even identical
twins have different sequences of nucleotides, due to mutations that
accumulate over time (although those differences are only minute).
However, our DNA isn't entirely composed of genes. In fact,
protein-coding genes make up only about 2% of our DNA, and this is
the part in which individual people differ very little.
As you can probably tell, sequencing a genome is only one part of the
work, and "completing the human genome" isn't the end of the work.
We are still looking for differences and variations both among genes
and among non-coding parts of DNA. Also, we still haven't figured out what
all those non-coding parts of DNA do.
Speaking of which, the human genome wasn't exactly completed in 2003.
Most of it was, but some problematic regions took almost two more
decades to complete. The final assembly was published this year,
in January 2022.
Hope this answers your question. Feel free to ask more in case I left
anything out.
Take care,
Alex
snoozy
7 years ago
“Why do the nucleotides us...”
anja.skevin
7 years ago
It's to provide it with energy. Regular DNA replication also uses nucleotides with
three phosphate groups, because removing two of them releases
enough energy to bind the nucleotide to the previous one.
Andrew
4 years ago
This may go a bit further than what the article specified, but here we go. In next-
generation sequencing, how would you prevent misreading a double nucleotide? If you add
multiple dNTPs to a single cluster, and the next sequence is AA, how would you be able
to tell the difference between this and just A?
bozunlu92
3 years ago
The answer may change depending on the method you are using,
but let me give you one example I know.
In the methods where we observe each added dNTP by the light it emits, AA emits
more light than a single A, and AAA emits even more. With more than
four (AAAA), though, it becomes a problem, because we can no longer
tell the differences between emissions apart. You can watch a short
video on YouTube about next-generation sequencing; I would recommend
the one with Ion Torrent, which answers your question specifically.
SULAGNA NANDI
2 years ago
A few questions:
1. How many copies of the same sequence do you start with? Just one or millions? If it's
just one, what if your two new strands are cut too short (due to dideoxy nucleotides being
added too early) during the first replication? Isn't that a problem? Or is it okay because
you can just try again with your original two strands?
2. Why is a capillary tube used instead of the normal agarose slab for gel electrophoresis?
3. How does the laser recognize the base (A, T, C, or G) passing by (what's the chemistry
involved)?
4. How can you know for sure that each base in the sequence will get one fragment that
ends with it? What if there are repeats or some bases without a fragment? Or is the
possibility of this negligible?
5. Let's say you have exactly one fragment corresponding to each base in the sequence.
How can you be certain that the fragments will pass by the laser from 5' to 3' in the
sequence (shortest to longest)? What if, by chance, some fragments go out of order?
Wouldn't you get the incorrect sequence?
It'd be great if someone could explain any or all of these, thank you.