Sequencing
• Goal
Determine the order of nucleotides across a genome
• Problem
Current DNA sequencing methods can read only short stretches of DNA at a time (< 1-2 kbp)
• Solution
Sequence short fragments, then use computers to assemble the small pieces
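The assembly step above can be sketched with a toy greedy overlap merger. This is only an illustration under strong assumptions: reads are error-free, come from one strand, and the genome has no long repeats. The names `overlap` and `greedy_assemble`, the `min_len` threshold and the example string are hypothetical choices for this sketch, not part of the lecture material.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no overlaps left: remaining contigs are separated by gaps
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical toy "genome", cut into overlapping 8-base reads every 2 bases.
genome = "ATGGCGTGCAATGCCTGA"
reads = [genome[i:i + 8] for i in range(0, len(genome) - 7, 2)]
print(greedy_assemble(reads))
```

With these toy reads the merger should recover the original string as a single contig. Real assemblers use overlap or de Bruijn graphs and must cope with sequencing errors and repeats.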
A quick history of sequencing
• 1869 – Discovery of DNA
• 1909 – Chemical characterisation
• 1953 – Structure of DNA solved
• 1977 – Sanger sequencing invented
• 1977 – First genome sequenced – ΦX174 (5 kb)
• 1986 – First automated sequencing machine
• 1990 – Human Genome Project started
• 1992 – First “sequencing factory” at TIGR
Genome Sequencing
[Figure: many short reads (TG..GT, TC..CC, AC..GC, CG..CA, ...) are assembled into the sequenced genome]
Manual Sanger Sequencing
• Read length: 800-1000 nucleotides
Principles
Sanger Sequencing
• Advantages
Long reads (~900 bp)
Suitable for small DNA fragments
• Disadvantages
Low throughput
Expensive
Not suitable for long DNA fragments
Sanger Sequencing
• 1994: H. influenzae, 1.8 Mbp (Fleischmann et al.)
• 2007: Global Ocean Sampling Expedition, ~3,000 organisms, 7 Gbp (Venter et al.)
Assembly: How Much DNA?
Input → Output
• Low coverage: a few pieces to assemble → many contigs, many gaps
• High coverage: many pieces to assemble → a few contigs, a few gaps
Lander and Waterman, 1988
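The relationship between coverage and the number of contigs can be made concrete with the standard Lander-Waterman approximations: coverage c = N·L/G, expected number of contigs ≈ N·e^(−cσ) with σ = 1 − T/L, and uncovered fraction ≈ e^(−c). The sketch below is only a rough model under idealised assumptions (reads placed uniformly at random, no repeats). The function name `lander_waterman` and its parameters are chosen for illustration, and the example numbers reuse the 1.8 Mbp genome and ~900 bp reads mentioned earlier.

```python
import math


def lander_waterman(genome_len: int, read_len: int, n_reads: int,
                    min_overlap: int = 0) -> tuple[float, float, float]:
    """Return (coverage, expected contigs, expected uncovered fraction)."""
    coverage = n_reads * read_len / genome_len
    sigma = 1 - min_overlap / read_len  # fraction of a read usable for overlap detection
    expected_contigs = n_reads * math.exp(-coverage * sigma)
    uncovered_fraction = math.exp(-coverage)
    return coverage, expected_contigs, uncovered_fraction


# Illustrative comparison: 1x versus 8x coverage of a 1.8 Mbp genome with 900 bp reads.
for n_reads in (2_000, 16_000):
    c, contigs, gap = lander_waterman(1_800_000, 900, n_reads)
    print(f"coverage {c:.0f}x: ~{contigs:.1f} contigs, {gap:.4%} of genome uncovered")
```

The approximation can drop below one contig at very high coverage, so treat it as a trend rather than an exact count.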
Next-Generation Sequencing
Video: Next Generation Sequencing (NGS) - An Introduction.mp4
• NGS is a general term for all post-Sanger sequencing technologies that enable massively parallel sequencing at low cost.
Next-Generation Sequencing
• Ion Torrent: detects the H+ ions (protons) released directly when DNA polymerase incorporates each base. Unlike the light-based methods, such as 454 pyrosequencing, it does not measure light.
Pyrosequencing (454 sequencing)
• Step 1: pyrophosphate (PPi, released by dNTP incorporation) + APS (adenosine 5'-phosphosulfate) → ATP + sulfate, catalysed by ATP sulfurylase
• Step 2: ATP + luciferin + O2 (oxygen) → light (plus oxyluciferin, AMP, PPi and CO2), catalysed by firefly luciferase
Peaks in the pyrogram reflect the nucleotide sequence.
Ronaghi M. Genome Res 11:3-11, 2001
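How pyrogram peaks encode the sequence can be illustrated with a small simulation. It assumes an idealised instrument in which the light signal for each dispensed nucleotide is exactly proportional to the number of bases incorporated. The names `simulate_pyrogram` and `call_bases`, the dispensation order "TACG" and the example read are assumptions made for this sketch.

```python
def simulate_pyrogram(read: str, flow_order: str = "TACG",
                      max_flows: int = 100) -> list[tuple[str, int]]:
    """Return (dispensed base, peak height) for each nucleotide flow.

    `read` is the strand being synthesised; a peak height of 0 means the
    dispensed nucleotide was not incorporated.
    """
    flows = []
    pos = 0
    for i in range(max_flows):
        if pos >= len(read):
            break
        base = flow_order[i % len(flow_order)]
        height = 0
        # A run of identical bases is incorporated in one flow, giving a taller peak.
        while pos < len(read) and read[pos] == base:
            height += 1
            pos += 1
        flows.append((base, height))
    return flows


def call_bases(flows: list[tuple[str, int]]) -> str:
    """Rebuild the sequence by repeating each base according to its peak height."""
    return "".join(base * height for base, height in flows)


flows = simulate_pyrogram("GGATTTC")
print(flows)
print(call_bases(flows))
```

In this ideal model the read is recovered exactly. On a real instrument, peak heights are noisy, which is why long runs of the same base are hard to count.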
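Pyrosequencing: more biochemistry
• Problem: free dNTP left over after each step. Solution: apyrase, which breaks down the leftover nucleotides (ATP to AMP + 2 Pi), or washing out the solution
• Problem: dATP is itself a substrate for luciferase. Solution: use dATPαS in place of dATP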
• Application
Human genome sequencing
High accuracy: 99.9% with 200-base fragments and 99% with 400-base fragments
• Disadvantages
Difficult for repeat sequences (runs of the same nucleotide) longer than 6 nucleotides
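To see where this limitation would affect a given sequence, one can scan it for long runs of a single nucleotide. The sketch below assumes that "repeat sequence" here means a homopolymer run and uses the slide's threshold of 6. The function name `long_homopolymers` and the example sequence are hypothetical.

```python
import re


def long_homopolymers(seq: str, max_run: int = 6) -> list[tuple[int, str, int]]:
    """Return (start index, base, length) for every run longer than `max_run`."""
    runs = []
    # "(.)\1*" matches one character followed by any number of copies of it.
    for match in re.finditer(r"(.)\1*", seq):
        length = match.end() - match.start()
        if length > max_run:
            runs.append((match.start(), match.group(1), length))
    return runs


# Example: a run of 8 T's (flagged) and a run of exactly 6 G's (not flagged).
example = "ACG" + "T" * 8 + "A" + "G" * 6 + "C"
print(long_homopolymers(example))
```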
Sequencing by synthesis
Sample preparation
Fluorescence beam
Nanopore – 3rd generation sequencing
Voltage applied