How To Map Billions of Short Reads Onto Genomes

The document discusses the challenges and solutions for mapping billions of short DNA reads produced by next-generation sequencing technologies onto reference genomes. It highlights various software programs available for short-read mapping, such as Bowtie and Maq, and their respective algorithms for efficiently aligning reads. The document also addresses the limitations of current mapping solutions, particularly in handling insertions, deletions, and the complexities of spliced alignments.

primer

How to map billions of short reads onto genomes
Cole Trapnell & Steven L Salzberg
Mapping the vast quantities of short sequence fragments produced by next-generation sequencing platforms is a
challenge. What programs are available and how do they work?
© 2009 Nature America, Inc. All rights reserved.

Cole Trapnell and Steven L. Salzberg are at the Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, USA.
e-mail: [email protected] or [email protected]

Nature Biotechnology, volume 27, number 5, May 2009, pages 455–457

A new generation of DNA sequencers that can rapidly and inexpensively sequence billions of bases is transforming genomic science. These new machines are quickly becoming the technology of choice for whole-genome sequencing and for a variety of sequencing-based assays, including gene expression, DNA-protein interaction, human resequencing and RNA splicing studies [1–3]. For example, the RNA-Seq protocol, in which processed mRNA is converted to cDNA and sequenced, is enabling the identification of previously unknown genes and alternative splice variants; the ChIP-Seq approach, which sequences immunoprecipitated DNA fragments bound to proteins, is revealing networks of interactions between transcription factors and DNA regulatory elements [4]; and the whole-genome sequencing of tumor cells is uncovering previously unidentified cancer-initiating mutations [5].

One of the challenges presented by the new sequencing technology is the so-called ‘read mapping’ problem. Sequencing machines made by Illumina of San Diego, Applied Biosystems (ABI) of Carlsbad, California, and Helicos of Cambridge, Massachusetts, produce short sequences of 25–100 base pairs (bp), called ‘reads’, which are sequence fragments read from a longer DNA molecule present in the sample that is fed into the machine. In contrast to whole-genome assembly, in which these reads are assembled together to reconstruct a previously unknown genome, many of the next-generation sequencing projects begin with a known, or so-called ‘reference’, genome. In this case, to make sense of the reads, their positions within the reference sequence must be determined. This process is known as aligning or ‘mapping’ the read to the reference. In one version of the mapping problem, reads must be aligned without allowing large gaps in the alignment (we describe this in more detail in the “Short-read mappers” section below). A more difficult version of the problem arises primarily in RNA-Seq, in which alignments are allowed to have large gaps corresponding to introns (discussed below in the “Spliced-read mappers” section).

These read mapping problems are certainly not new, and there are many programs that perform both spliced and unspliced alignment for the older Sanger-style capillary reads. Even so, these programs neither scale up to the much greater volumes of data produced by short-read sequencers nor scale down to the short read lengths. Aligning the reads from ChIP-Seq or RNA-Seq experiments can take hundreds or thousands of central processing unit (CPU) hours using conventional software tools such as BLAST or BLAT. Fortunately, new software packages designed to meet the computational challenges of short-read sequencing are quickly appearing. Before choosing one, it is essential to understand why the mapping problems are computationally difficult, which difficulties have been overcome and what challenges and opportunities remain.

Table 1 A selection of short-read analysis software

Program    Website                                        Open source?  Handles ABI color space?  Maximum read length
Bowtie     http://bowtie.cbcb.umd.edu                     Yes           No                        None
BWA        http://maq.sourceforge.net/bwa-man.shtml       Yes           Yes                       None
Maq        http://maq.sourceforge.net                     Yes           Yes                       127
Mosaik     http://bioinformatics.bc.edu/marthlab/Mosaik   No            Yes                       None
Novoalign  http://www.novocraft.com                       No            No                        None
SOAP2      http://soap.genomics.org.cn                    No            No                        60
ZOOM       http://www.bioinfor.com                        No            Yes                       240

Challenges of mapping short reads

The first challenge is a practical one: if the reference genome is very large, and if we have billions of reads, how quickly can we align the reads to the genome? Sequence alignment is a classic problem in bioinformatics, supported by a large body of literature describing different variants for both exact and inexact alignment. As a practical matter, the task of mapping billions of sequences to a mammalian-sized genome calls for extraordinarily efficient algorithms, in which every bit of memory is used optimally or near optimally.

The second challenge is strategic: if a read comes from a repetitive element in the reference, a program must pick which copy of the repeat the read belongs to. Because this may be impossible to decide with confidence, the program may choose to report multiple possible locations or to pick a location heuristically. Sequencing errors or variations between the sequenced chromosomes and the reference genome exacerbate this problem, because the alignment between the read and its true source in the genome may actually have more differences than the alignment between the read and some other copy of the repeat. The spliced mapping problem faces this same challenge but is further complicated by the possible presence of intron-sized gaps.

DNA sequencers from Illumina, ABI, Roche (of Basel, Switzerland), Helicos and other companies produce millions of reads per run. Complete assays may involve many runs, so an investigator may need to map millions or billions of reads to a genome. For example, the recent cancer genome sequencing project by Ley et al. [5] generated nearly 8 billion reads from 132 sequencing runs. A large, expensive computer grid might map the reads from this experiment in a few days using traditional alignment algorithms such as BLAST or BLAT, but such grids are not accessible to everyone. To reduce the computing cost of analysis for sequencing-based assays and to make them available to all investigators, we and others have created a new generation of alignment programs capable of mapping hundreds of millions of short reads on a single desktop computer. Vendors of sequencing machines provide specialized mapping software, such as the ELAND program from Illumina, but in this article we focus on third-party packages, some of which are free and open source. These programs are built on algorithms that exploit features of short DNA sequencing reads to map millions of reads per hour while minimizing both processing time and memory requirements.

Short-read mappers

Such programs as Maq and Bowtie (Table 1) use a computational strategy known as ‘indexing’ to speed up their mapping algorithms. Like the index at the end of a book, an index of a large DNA sequence allows one to rapidly find shorter sequences embedded within it. Maq is based on a straightforward but effective strategy called spaced seed indexing [6] (Fig. 1a). In this strategy, a read is divided into four segments of equal length, called the ‘seeds’. If the entire read aligns perfectly to the reference genome, then clearly all of the seeds will also align perfectly. If there is one mismatch, however, perhaps due to a single-nucleotide polymorphism (SNP), then it must fall within one of the four seeds, but the other three will still match perfectly. Using similar reasoning, two mismatches will fall in at most two seeds, leaving the other two to match perfectly. Thus, by aligning all possible pairs of seeds (six possible pairs) against the reference, it is possible to winnow the list of candidate locations within the reference where the full read may map, allowing at most two mismatches. Maq’s spaced seed index enables it to perform this winnowing operation very efficiently. The resulting set of candidate locations is typically small enough that the rest of the read (that is, the other two seeds that might contain the mismatches) may be individually checked against the reference.
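The pigeonhole argument behind spaced seeds can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Maq's implementation: the function names and the toy reference are hypothetical, the index is a plain dictionary built over the reference (Maq itself indexes reads in batches), and the read length is assumed to be a multiple of four.

```python
from collections import defaultdict
from itertools import combinations


def split_into_seeds(sequence, n_seeds=4):
    """Cut a sequence into n_seeds contiguous pieces of equal length."""
    seed_len = len(sequence) // n_seeds
    return [sequence[i * seed_len:(i + 1) * seed_len] for i in range(n_seeds)]


def build_seed_pair_index(reference, read_len, n_seeds=4):
    """Index every read-length window of the reference by each pair of its seeds.

    Hypothetical toy index: keys are (seed slot i, seed slot j, seed i, seed j),
    values are the window start positions that carry that pair.
    """
    index = defaultdict(list)
    for pos in range(len(reference) - read_len + 1):
        seeds = split_into_seeds(reference[pos:pos + read_len], n_seeds)
        for i, j in combinations(range(n_seeds), 2):  # six pairs when n_seeds == 4
            index[(i, j, seeds[i], seeds[j])].append(pos)
    return index


def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, index, max_mismatches=2, n_seeds=4):
    """Return (position, mismatches) for windows within max_mismatches of the read."""
    seeds = split_into_seeds(read, n_seeds)
    candidates = set()
    # Winnowing step: any window with at most two mismatches must share
    # at least one exactly matching pair of seeds with the read.
    for i, j in combinations(range(n_seeds), 2):
        candidates.update(index.get((i, j, seeds[i], seeds[j]), []))
    hits = []
    # Verification step: check the full read at each surviving candidate.
    for pos in sorted(candidates):
        mismatches = hamming(read, reference[pos:pos + len(read)])
        if mismatches <= max_mismatches:
            hits.append((pos, mismatches))
    return hits


reference = "TTGACC" + "ACTCCCGTACTCTAAT" + "GGCATG"
index = build_seed_pair_index(reference, read_len=16)

print(map_read("ACTCCCGTACTCTAAT", reference, index))  # exact copy of the embedded read
print(map_read("AGTCCCGTAATCTAAT", reference, index))  # mismatches in seeds 1 and 3
print(map_read("AGTCCGGTAATCTAAT", reference, index))  # mismatches in three seeds
```

In this toy setting the first two reads are reported at position 6 (with zero and two mismatches, respectively), whereas the third read has mismatches in three different seeds, so no pair of seeds matches exactly and the read is never considered. That is the price of the filter: it guarantees sensitivity only up to two mismatches.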
Figure 1 Two recent algorithmic approaches for aligning short (20–200-bp) sequencing reads. (a) Algorithms based on spaced-seed indexing, such as Maq, index the reads as follows: each position in the reference is cut into equal-sized pieces, called ‘seeds’ and these seeds are paired and stored in a lookup table. Each read is also cut up according to this scheme, and pairs of seeds are used as keys to look up matching positions in the reference. Because seed indices can be very large, some algorithms (including Maq) index the reads in batches and treat substrings of the reference as queries. (b) Algorithms based on the Burrows-Wheeler transform, such as Bowtie, store a memory-efficient representation of the reference genome. Reads are aligned character by character from right to left against the transformed string. With each new character, the algorithm updates an interval (indicated by blue ‘beams’) in the transformed string. When all characters in the read have been processed, alignments are represented by any positions within the interval. Burrows-Wheeler–based algorithms can run substantially faster than spaced seed approaches, primarily owing to the memory efficiency of the Burrows-Wheeler search. Chr., chromosome. Labels in the figure give the index sizes as tens of gigabytes for the seed index and ~2 gigabytes for the Bowtie index.

Bowtie takes an entirely different approach, borrowing a technique originally developed for compressing large files called the Burrows-Wheeler transform. Using this transform, the index for the entire human genome fits into less than two gigabytes of memory (an amount that is commonly available on today’s desktop and even laptop computers), in contrast to a spaced seed index, which may require over 50 gigabytes, and yet reads can still be aligned efficiently. Bowtie aligns a read one character at a time to the Burrows-Wheeler–transformed genome (Fig. 1b). Each successively aligned new character allows Bowtie to winnow the list of positions to which the read might map. If Bowtie cannot find a location where a read aligns perfectly, the algorithm backtracks to a previous character of the read, makes a substitution and resumes the search. In effect, the Burrows-Wheeler transform enables Bowtie to conquer the mapping problem by first solving a simple subproblem (align one character), then building on that solution to solve a slightly harder problem (align two characters), then continuing on to three characters, and so on, until the entire read has been aligned. Bowtie’s alignment algorithm is substantially more complicated than Maq’s, but Bowtie aligns reads more than 30-fold faster [7].
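The character-by-character search can be illustrated with a toy Burrows-Wheeler index in Python. This is a sketch under stated assumptions rather than Bowtie's code: all function names are hypothetical, the suffix array is built by naive sorting (practical only for tiny strings), the occurrence table is stored uncompressed, and the mismatch search simply tries every substitution within a budget, whereas the text describes a more selective backtracking procedure.

```python
def build_fm_index(reference):
    """Build a toy Burrows-Wheeler index: suffix array, first-column offsets, occurrence counts."""
    text = reference + "$"  # "$" sorts before A, C, G and T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # character preceding each sorted suffix
    alphabet = sorted(set(text))
    # first[c]: number of characters in the text that sort before c
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += text.count(c)
    # occ[c][k]: number of occurrences of c in bwt[:k]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for k, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][k + 1] = occ[c][k] + (ch == c)
    return sa, first, occ


def backward_search(read, sa, first, occ):
    """Exact matching: consume the read right to left, narrowing an interval of suffixes."""
    lo, hi = 0, len(sa)
    for ch in reversed(read):
        if ch not in first:
            return []
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:  # interval is empty: the read does not occur
            return []
    return sorted(sa[lo:hi])


def search_with_mismatches(read, sa, first, occ, max_mismatches=1):
    """Backward search that also tries substitutions, up to max_mismatches per read."""
    hits = {}

    def extend(i, lo, hi, mismatches):
        if lo >= hi:
            return
        if i < 0:  # whole read consumed: every suffix in the interval is a hit
            for pos in sa[lo:hi]:
                hits[pos] = mismatches
            return
        for ch in "ACGT":
            if ch not in first:
                continue
            cost = mismatches + (ch != read[i])
            if cost <= max_mismatches:
                extend(i - 1, first[ch] + occ[ch][lo], first[ch] + occ[ch][hi], cost)

    extend(len(read) - 1, 0, len(sa), 0)
    return sorted(hits.items())


sa, first, occ = build_fm_index("ACTCCCGTACTCTAAT")
print(backward_search("ACTC", sa, first, occ))         # [0, 8]
print(backward_search("TAAT", sa, first, occ))         # [12]
print(search_with_mismatches("ACTG", sa, first, occ))  # [(0, 1), (8, 1)]
```

Each loop iteration in `backward_search` corresponds to aligning one more character of the read, and the interval `[lo, hi)` plays the role of the shrinking list of candidate positions. The memory savings quoted in the text come from storing the transformed string and its counts compactly, which this sketch does not attempt.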


Maq and Bowtie both report alignments with up to two mismatches when run in their default modes. In some alignment scenarios, a user may need to allow more mismatches. These two programs were originally designed for reads between 20 and 40 bp long, and both were optimized for human resequencing projects. Even so, Illumina sequencers can now produce reads longer than 100 bp. Additionally, some sequencing projects (such as bacterial or fungal genome sequencing) produce sequences that have many nucleotide-level differences with respect to the closest fully sequenced genome. Finally, the overall quality of reads produced by the new technologies is sensitive to factors such as library preparation, sequencing protocol and even the temperature of the room housing the sequencing machine. Thus, it is essential to know how to change the various default options for any short-read mapper and to be able to identify when those defaults are no longer appropriate.

Several of the new short-read mappers (Table 1) are open source, are simple to install and have good documentation and active user communities. The installation package for Bowtie includes a prebuilt index for Escherichia coli and a set of sample E. coli reads. To run the program on the sample data, just enter the following on the command line:

    bowtie e_coli reads/e_coli_1000.fq

This command will produce a tabular report showing each matching read’s identifier, the position(s) where it aligns to the reference sequence, and the number and location of mismatches. Maq reports this same information when you run it with the command:

    maq.pl easyrun -d outdir reference.fasta reads.fastq

For a given experiment, the fraction of reads that align to the genome depends on many factors. Assuming the sequenced DNA does not contain many mismatched nucleotides compared to the reference, and assuming the reads have passed rudimentary quality filters, most mapping software will find an alignment for 70–75% of the reads. This might seem surprisingly low, but the sequencing technology is still immature, and it is worth noting that Sanger sequencing had success rates of less than 80% until the late 1990s. Note that many reads will align to multiple positions in the genome. Most read mappers can be directed to report alignments only for reads that map to a unique location in the genome.

After aligning the reads, one might next want to call SNPs or view the alignments against the reference sequence. One package for this task is the SAM tools (http://samtools.sourceforge.net). SAM includes a consensus base caller and viewer that can be used either with Maq or with Bowtie.

Most read mapping software is designed with whole-genome resequencing in mind, but the programs can be configured for other assays. The manuals for Bowtie and Maq are quite detailed, and the array of choices a user can make can be daunting. Moreover, the list of programs capable of short-read mapping is rapidly growing (Table 1), and not every program is ideal or appropriate for every experiment. Fortunately, there are ways to get help. The SeqAnswers message board (http://www.seqanswers.com) is an excellent resource for novice and expert users, frequented by the developers of many short-read mapping programs. One of the most popular SeqAnswers threads contains a catalog of current software for primary analysis and visualization of short-read data.

Spliced-read mappers

The spliced alignment problem, in which cDNA (from processed mRNA) sequences are aligned back to genomic DNA, requires more specialized algorithms. Reads sampled from exon-exon junctions need to be mapped differently from reads that are contained entirely within exons (Fig. 2).

Figure 2 RNA-Seq assays produce short reads sequenced from processed mRNAs. Aligning these reads to the genome with Bowtie or Maq will produce the alignments shown in black but will fail to align the blue reads. A spliced-read mapper such as TopHat or ERANGE will also report the (blue) alignments spanning intron boundaries. The figure shows a processed mRNA made up of exons A, B and C and the mapping of its reads to the genome.

To align cDNA reads from RNA-Seq experiments [1–3], packages such as ERANGE (http://woldlab.caltech.edu/rnaseq) use the positions of exons and introns within known genes as a guide. This allows ERANGE to construct the sequences spanning exon-exon junctions and use them as reference sequences, and then to invoke a standard read mapper such as Maq or Bowtie to align the spliced reads [2]. Because this approach will not discover entirely new splice junctions, some studies have used machine learning methods to predict possible junctions by training statistical models using available reference annotations [8]. In contrast, the TopHat spliced-read mapper (http://tophat.cbcb.umd.edu) does not rely on annotations. Instead, it uses Bowtie (in an initial alignment pass) to identify exons that fully contain some of the reads, and then aligns the remaining reads to junctions between those exons [9]. Another program, G-Mo.R-Se (http://www.genoscope.cns.fr/externe/gmorse), performs a similar spliced alignment while constructing gene models from RNA-Seq data [10].

Limitations and open problems

The current solutions for short-read mapping all have limitations. Mapping programs such as Maq and Bowtie offer very limited support for aligning reads with insertions or deletions (indels). Some read mappers, such as SHRiMP (http://compbio.cs.toronto.edu/shrimp), support ABI’s ‘color space’ sequence representation, but most do not. The spliced alignment programs suffer from these same problems and add a few of their own. Annotation-based methods are of course only as good as the annotations, and many organisms have annotations supported only by homology or computational predictions. Machine learning methods will perform poorly if they are trained on incorrect annotations, and they are prone to overtraining.

Many challenges and questions remain for developers of read mapping software. As all the sequencing machine vendors are trying to produce longer reads, will the short-read mapping programs scale well as the reads get longer? Maq, Bowtie and several other short-read packages support reads longer than 100 bp, but at some point, software designed for longer reads, such as BLAT, may be a better fit for downstream analysis. Furthermore, when mapping reads from an organism that has diverged significantly from its reference genome, how should a program’s parameters be adjusted, and can that adjustment happen automatically? How useful is mapping quality in downstream analysis, and should it be computed while aligning reads, as Maq does, or later? The answers to each of these questions will depend on the type of assay and the scale of the analysis, and as long as the technology continues to change, the programs will have to change rapidly to keep up.

1. Nagalakshmi, U. et al. Science 320, 1344–1349 (2008).
2. Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. & Wold, B. Nat. Methods 5, 621–628 (2008).
3. Wang, E.T. et al. Nature 456, 470–476 (2008).
4. Johnson, D.S., Mortazavi, A., Myers, R.M. & Wold, B. Science 316, 1497–1502 (2007).
5. Ley, T.J. et al. Nature 456, 66–72 (2008).
6. Li, H., Ruan, J. & Durbin, R. Genome Res. 18, 1851–1858 (2008).
7. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S.L. Genome Biol. 10, R25 (2009).
8. Pan, Q., Shai, O., Lee, L.J., Frey, B.J. & Blencowe, B.J. Nat. Genet. 40, 1413–1415 (2008).
9. Trapnell, C., Pachter, L. & Salzberg, S.L. Bioinformatics published online, doi:10.1093/bioinformatics/btp120 (March 16, 2009).
10. Denoeud, F. et al. Genome Biol. 9, R175 (2008).
