A Survey of Whole Genome Alignment Tools and Frameworks Based on Hadoop's MapReduce

This document discusses whole genome alignment tools and frameworks based on Hadoop's MapReduce. It provides an overview of several whole genome alignment software tools and of how alignment is implemented with MapReduce for big data. The tools discussed include ProgressiveMauve, Alpha, GECKO, ANItools, REVEAL, HS-BLASTN, and PaSWAS. They vary in interface, operating system, programming language, and output file format. Most provide progressive genome alignment and are stable desktop applications, with some offering parallelization or a web interface.


IWBIS 2016 c 2016 IEEE

978-1-5090-3477-2/16/$31.00

A Survey of Whole Genome Alignment Tools and Frameworks based on Hadoop's MapReduce

Sumarsih C. Purbarani, Faculty of Computer Science, Universitas Indonesia, Depok, Indonesia
Hadaiq R. Sanabila, Faculty of Computer Science, Universitas Indonesia, Depok, Indonesia
Anom Bowolaksono, Department of Biology, Faculty of Mathematics and Natural Sciences, Universitas Indonesia, Depok, Indonesia
Budi Wiweko, Faculty of Medicine, Universitas Indonesia, Jakarta, Indonesia

Abstract—Next-generation DNA sequencing (NGS) projects, which aim to improve the understanding of various genes, are driving innovative breakthroughs in whole-genome research. Dealing with genomic data requires large-scale data storage and processing. Big data technology could be the most appropriate solution for gaining useful knowledge from such data comprehensively. This study discusses genome tools and frameworks that implement MapReduce, one of Hadoop's components, in sequence alignment computation. The aim of this discussion is to present an overview of whole genome alignment software tools and their implementation in big data.

Keywords—genome sequence alignment; multiple sequence alignment; MapReduce; Hadoop; big data

I. INTRODUCTION

In recent years, DNA sequencing has become more popular, especially in the field of bioinformatics. Next-generation DNA sequencing (NGS) projects, which aim to improve the understanding of various genes, keep growing and attract many researchers in search of more innovative breakthroughs in whole-genome-related issues [1].

Single nucleotide polymorphisms (SNPs) are one of the components that drive the scale of these projects. An SNP is a variation at a single position in a DNA sequence among individuals [2]. A single sequence of SNPs can determine the traits an individual might have. The most significant SNPs serve as biomarkers that help scientists locate abnormal genes, usually in association with diseases. Roughly 10 million SNPs are found in just one human genome. In bioinformatics, where the genomic data of thousands to billions of humans need to be stored, this involves not only terabytes of data but petabytes or even exabytes. One of the world's largest biology-data repositories, the European Bioinformatics Institute (EBI), for instance, stored 20 petabytes of data in 2013, rising to 75 petabytes in 2015 [3].

Managing data at this scale, especially in today's era of data explosion, remains a big challenge. Cloud computing has been proposed to solve the issue. It has been well received because it can store large datasets in the cloud. It might still be feasible if sets of genomes only had to be transferred over the Internet a few times, yet the data keep growing, and the need to store this enormous volume is so urgent that relying only on cloud storage is impractical [3]. Besides size, the security of cloud storage and computing is yet another issue. These problems urge bioinformatics researchers to harness big data technology that can not only securely store massive data in any format but also provide further understanding of the data to gain useful knowledge comprehensively.

A. Big Data Framework

Hadoop, one of the giants among big data frameworks, is currently used in almost all fields of data mining that involve large volumes of data. Hadoop provides storage and processing for very large datasets. It works on batches, which gives it the flexibility to accept all data formats. This makes it well suited to bioinformatics and biomedical data processing, since data in these fields consist of both structured and unstructured data. These include genomic data in the first place, as well as data on diseases, patients' health records, clinical information, molecules, various cells, tissues, and so on, which differ greatly in size, unit, and format. Some of them may be image data, while others are unstructured handwritten data, and so on.

This study discusses genome tools used in sequence alignment. The aim of this discussion is to present an overview of whole genome alignment software tools and their implementation in big data. This article is arranged as follows. The first section briefly introduces fundamental knowledge of big data and genome alignment.

Figure 1. Single Sequence Alignment. Nucleotides in red show matched pairs. Empty boxes represent gaps.
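The distinction Figure 1 draws between matched pairs and gaps can be turned into a single number. The sketch below is a minimal illustration that scores two already-aligned sequences column by column. The function name `score_alignment`, the use of `-` for a gap, and the score values (+1 match, -1 mismatch, -2 gap) are assumptions chosen for this example and are not taken from any of the surveyed tools.

```python
def score_alignment(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Score two already-aligned sequences column by column.

    Gaps are written as '-'. The default scores are illustrative only.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    total = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            total += gap        # one sequence has no partner at this position
        elif a == b:
            total += match      # matched pair
        else:
            total += mismatch   # both nucleotides present but different
    return total


if __name__ == "__main__":
    # Five matched pairs and one gap.
    print(score_alignment("ACG-TA", "ACGTTA"))
```

Real aligners search for the gap placement that maximizes such a score; this sketch only evaluates an alignment that is already given.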


The second section reviews the most recent whole genome alignment software tools. The third section presents some frameworks that utilize Hadoop's MapReduce. The last section draws the conclusion and suggests how to select the most appropriate tools.

II. SINGLE-MULTIPLE SEQUENCE ALIGNMENT

To find how similar or distant two individuals are, their whole genome sequences are aligned. A single alignment takes two whole genome sequences and measures the similarity between them. Multiple Sequence Alignment (MSA), on the other hand, builds on many single alignments: it aligns the results of each two single alignments. Each aligned position is scored as a match, a mismatch, or a gap, since a position in one sequence may have no counterpart in the other sequence (see Figure 1).

III. GENOME ALIGNMENT TOOLS

In this section, the most recent genome alignment tools are introduced. Some of these tools are implemented in a parallel environment while others are not. Table I lists the general specifications of each tool.

TABLE I. MOST RECENT GENOME ALIGNMENT PACKAGES

Attribute | ProgressiveMauve | Alpha | GECKO | ANItools (desktop) | ANItools (web) | REVEAL | HS-BLASTN | PaSWAS
Application | Desktop | Desktop | Desktop | Desktop | Web | Desktop | Desktop | Desktop
Interface | CLI | CLI | CLI | CLI | Web UI | CLI | CLI | CLI
Restrictions to use | None | None | None | Academic users only | Academic users only | None | None | None
Operating system | Unix/Linux, Mac OS, Windows | Unix/Linux, Mac OS | Unix/Linux | Unix/Linux | - | Unix/Linux, Mac OS | Unix/Linux | Unix/Linux
Programming language | Java | Python | C | Perl | Java | C, Python | C++ | Python
Parallelization | NA | NA | OpenStack | NA | NA | NA | NA | CUDA
Computer skills | Advanced | Advanced | Advanced | Advanced | Basic | Advanced | Advanced | Advanced
Stability | Stable | Stable | Stable | Stable | Stable | Stable | Stable | Stable
Alignment type | Progressive alignment construction | Progressive alignment construction | Progressive alignment construction | Progressive alignment construction | Progressive alignment construction | Progressive alignment construction | Progressive alignment construction | Progressive alignment construction
Output file format | XMFA (eXtended Multi-FastA) | .pdf (saving the alignment visualization) | .frags, .frags.INF, .csb.frags, .csb.frags.INF, .csv, .frags.MAT | Phylogenetic tree | Phylogenetic tree on the webpage | .gfa (graphical fragment assembly) | .fa (3 kinds of format: standard pairwise alignment, tabular output with comments, tabular output without comments) | .txt and Sequence Alignment Map (.SAM)
Input file format | .gbk (GenBank Format), .fna, .raw (raw sequence data) | .fasta | .fasta, .fna, and .fa | .fasta, .fna, and .fa | .fasta, .fna, and .fa | .fasta and .fa | .fasta, .fna, and .fa | .fasta, .fna, and .fa
Alignment algorithm | HMM | Partial order alignment graph | High Scoring Segment Pairs | BLAST and suffix tree algorithm | BLAST and suffix tree algorithm | Recursive maximally unique matching | BLAST | Parallel Smith-Waterman

A. ProgressiveMauve

Launched in 2010, this package was quite popular in its time. Darling et al. [4] state that their tool can handle the alignment of sequences that have undergone rearrangement, gene gain, and gene loss. The tool implements an objective scoring system called the sum-of-pairs breakpoint score. It is claimed to accurately detect rearrangement breakpoints when the sequences have unequal gene content. This is done by identifying local anchors shared by a subset of the target sequences. To remove erroneous alignments of unrelated genome sequences, a probabilistic alignment filtering method is also implemented [4].

B. ALignment of PHAges (Alpha)

This tool proposes a multiple sequence alignment for bacteriophages (viruses that attack bacteria), in which similarity in a genomic comparison does not necessarily imply similar or identical functionality, unlike in other genes [5]. Conversely, Alpha can be used to predict that genes have similar biological functions even when detectable similarity is lacking. Alpha implements the standard partial order alignment in its algorithm. It also tries to alleviate the misalignments that progressiveMauve may produce.

These misalignments arise because the combinatorial properties of alignments in progressiveMauve are linearized [5].

C. GEnome Comparison with K-mers Out-of-core (GECKO)

GECKO focuses on tackling the computational inefficiency caused by huge data. It implements an out-of-core algorithm to overcome this issue in pairwise alignment. A comparison with several state-of-the-art tools, including ProgressiveMauve, showed that GECKO is considerably more efficient in memory consumption. GECKO uses cloud computing through OpenStack. Thanks to this ability, GECKO can run not only on clusters but also on simple desktop PCs. Furthermore, GECKO is a modular tool, so users can add useful features such as K-mer frequency calculation, pre-visualization monitoring tools, and so on [6].

D. ANItools

ANItools calculates the average nucleotide identity (ANI) score of a bacterial species and then identifies the bacteria using this score. ANItools uses the pre-existing and widely used BLAST tool to align genomes. This also contributes to the classification of bacterial species. ANItools provides both a desktop version with a CLI and a web app with a friendly GUI for users with more basic computer skills. However, ANItools is available for academic use only and is licensed under the GNU General Public License version 3.0 [7].

E. REcursiVe Exact-matching ALigner (REVEAL)

REVEAL improves de novo sequence alignment by introducing a de novo approach to infer the transcriptome that allows multiple alignment and constructs alignments recursively. As a result, REVEAL can be used to compute high-resolution genomes, to detect structural variations, and, most importantly, to construct MSAs without involving reference genomes [8].

F. HS-BLASTN

High speed nucleotide-nucleotide basic local alignment search tool (HS-BLASTN) offers high-speed alignment of nucleotide sequences [9]. HS-BLASTN retrofits the existing MegaBLAST by adding a novel lookup table based on the FMD-index [10]. This improvement makes the alignment process more efficient and less time-consuming. HS-BLASTN works in a parallel environment using several CPUs. As a result, it performs much better than the original MegaBLAST, with a speedup of 10-20 times.

G. PaSWAS

Parallel Smith-Waterman Algorithm Software (PaSWAS) uses the power of the Smith-Waterman (SW) algorithm to align genome sequences. PaSWAS addresses almost all of the shortcomings of the original SW, such as lower accuracy due to statistical issues and slow computation on large-scale data. By parallelizing the SW algorithm on NVIDIA-based general-purpose GPUs (GPGPUs), PaSWAS performs accurate and fast alignment regardless of the length of the sequences [11]. Currently, however, it can only handle local alignment [12].

As described above, most of the genome alignment tools use the FASTA format as their input, while their output formats vary. The output format depends on the objective of each tool, such as phylogenetic tree analysis, protein function analysis, or profile analysis. In addition, these tools mainly focus on proposing novel memory-access methods to reduce the computational cost of parallel computing.

The main obstacle for recent genome alignment tools is computational cost. In the next few years, many researchers will focus on building effective and efficient sequence alignment algorithms. Furthermore, many algorithms will be proposed to find the significant fragments in a sequence. Restricting sequence alignment computation to significant genome fragments will reduce the computational cost drastically.

IV. GENOME ALIGNMENT FRAMEWORKS

In this section, some genome alignment frameworks are described. SEAL, Crossbow, CloudBurst, and CloudAligner are widely used frameworks. These frameworks work with Hadoop's MapReduce mechanism since they are built in the Hadoop environment [13].

TABLE II. MOST RECENT GENOME ALIGNMENT FRAMEWORKS

Attribute | SEAL | Crossbow | CloudBurst | CloudAligner
Interface | CLI | CLI | CLI | CLI
Restrictions to use | None | None | None | None
Operating system | Unix/Linux | Unix/Linux | Unix/Linux | Unix/Linux
Programming language | Java | Java | Java | Java
Parallelization | MapReduce | MapReduce | MapReduce | MapReduce
License | NA | NA | Artistic License version 2.0 | GNU General Public License version 2.0
Computer skills | Advanced | Advanced | Advanced | Advanced
Stability | Stable | Stable | Stable | Stable
a. Source: https://omictools.com/

Hadoop consists mainly of two components, i.e. HDFS (Hadoop Distributed File System) and MapReduce. The former is designed to store very large data in a distributed environment.


[Figure 2 diagram: preprocessed genome sequence data enter a cluster of nodes 1 to n; alignment using a genome alignment package runs on each node as MapReduce (Map, Sort & Shuffle, Reduce); the result is stored in HDFS.]

Figure 2. Common architecture of genome alignment process in Hadoop framework by utilizing MapReduce
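The Map, Sort & Shuffle, and Reduce stages named in Figure 2 can be imitated on a single machine to make the data flow concrete. The sketch below is plain Python, not the Hadoop API, and it does not perform alignment: it counts k-mers in a few short reads as a stand-in for the per-node work. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_job`) and the choice of k-mer counting are assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(read, k=3):
    """Map: emit a (key, value) tuple for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle_phase(tuples):
    """Sort & shuffle: sort tuples by key, then group values sharing a key."""
    ordered = sorted(tuples, key=itemgetter(0))
    return [(key, [value for _, value in group])
            for key, group in groupby(ordered, key=itemgetter(0))]


def reduce_phase(key, values):
    """Reduce: apply the demanded operation (here, counting) to one group."""
    return key, sum(values)


def run_job(reads, k=3):
    """Chain the three stages in the order Figure 2 arranges them."""
    mapped = [pair for read in reads for pair in map_phase(read, k)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped)


if __name__ == "__main__":
    for kmer, count in run_job(["ACGTAC", "GTACGT"]).items():
        print(kmer, count)
```

In a real Hadoop job the map and reduce calls would run on different cluster nodes and the grouped tuples would travel over the network; here everything runs in one process so that only the order of the stages is shown.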

The distributed environment of HDFS involves several computers acting as cluster nodes, while MapReduce executes the data processing.

MapReduce is, alongside HDFS, one of the main components of the Hadoop big data computing system. MapReduce runs the jobs in Hadoop. It is the framework in which the data parallelization paradigm takes place. Datasets undergo two major phases in MapReduce, i.e. Map and Reduce.

In the Map phase, a set of data records is collected from HDFS and partitioned into several tuples. A tuple consists of a record's value and its key, <k, v>. The keys are not necessarily unique. These keys are used to combine records with similar characteristics, i.e. records that have the same key. The combined tuples are the output of this first phase of MapReduce. The merged output tuples from the Map phase are stored in a buffer before being taken for shuffling.

The map output needs to undergo the shuffle process before entering the Reduce phase. In the shuffle process, the merged tuples are sorted by key. This groups together the data with the same key. The next step is transferring these grouped data to the Reduce phase. In the Reduce phase, since the data are already organized, the required operations such as counting and compression can be performed. Afterwards, the output of the Reduce phase is stored back in HDFS.

CloudBurst, CloudAligner, Crossbow, and SEAL are reviewed in this section. These frameworks utilize MapReduce in their common architecture, as shown in Figure 2. The general specifications of these frameworks are listed in Table II.

CloudBurst offers fast computation of millions of short reads using a larger remote cloud with 96 cores. Its running time scales linearly as the number of processors increases [14]. CloudAligner later came up with faster and more accurate results. In CloudAligner's architecture, the reduce phase is omitted. By doing so, CloudAligner shows a significant improvement [15]. Moreover, CloudAligner can handle long sequences, whereas CloudBurst is limited to short reads.

Crossbow is another framework that implements MapReduce based on Hadoop [16]. This framework has gained much popularity since it shows quite precise results, i.e. almost 100% on simulated SNPs of human chromosome 22. Crossbow uses Bowtie to align reads. Bowtie [17] is an alignment (and mapping) package for aligning short reads. Bowtie minimizes memory usage by indexing the genome with a Burrows-Wheeler (BW) index. BW Aligner (BWA) [18] is another tool that implements BW algorithms such as BWA-backtrack, BWA-SW and BWA-MEM [19]. The BWA-MEM algorithm provides more accurate long-sequence alignment compared to the other algorithms.

Another example of a short-read alignment tool is SEAL. SEAL can also remove duplicates during alignment. It gives results consistent with those of BWA and, when run without duplicate removal, takes less time than BWA [20].

V. CONCLUSION

This article provides information about packages and frameworks used in genome sequence alignment. ProgressiveMauve, Alpha, GECKO, ANItools, REVEAL, HS-BLASTN, and PaSWAS are a few of the numerous tools that offer both accurate and fast alignment performance. As these tools have emerged recently, they are as powerful as or more powerful than preceding tools such as BLAST, Bowtie, and BWA, which are implemented in the Hadoop-based frameworks described in Section IV. These frameworks that apply MapReduce are presented to show how alignment packages are implemented in the Hadoop environment and to show the possibility of implementing newly developed tools in Hadoop as well.

As a suggestion, when selecting the tool or framework most suitable for one's needs, the following criteria can be considered:

• Pairwise or multiple sequence alignment
• Short or long reads
• Data size (to decide whether Hadoop frameworks are needed or not)
• Implemented algorithm
• Expected output file format
• Programming language
• User (to determine whether to use a CLI or a GUI, a desktop or a web app)

VI. ACKNOWLEDGMENT

This study is part of a larger research project named "Pengembangan Sistem Penilaian Kualitas Embrio pada Bayi". We would like to express our gratitude to the Ministry of Research, Technology and Higher Education of the Republic of Indonesia for supporting our research by providing the Penelitian Unggulan Perguruan Tinggi (PUPT) Research Grant No: 0523/UN2.R12/HKP.05.00/2015.

REFERENCES

[1] M. I. Sani, M. E. Suryana, M. A. Akbar, A. Noviyanto, W. Jatmiko and A. M. Arymurthy, "Performance Analysis of ECG Signal Compression using SPIHT," International Journal on Smart Sensing and Intelligent Systems, vol. 6, no. 5, 2013.
[2] Scitable, "Scitable by Nature Education," 2014. [Online]. Available: http://www.nature.com/scitable/definition/single-nucleotide-polymorphism-snp-295. [Accessed 3 September 2016].
[3] V. Marx, "Biology: The big challenges of big data," Nature, vol. 498, no. 7453, pp. 255-260, 2013.
[4] A. E. Darling, B. Mau and N. T. Perna, "progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement," PloS one, vol. 5, no. 6, 2010.
[5] S. Berard, A. Chateau, N. Pompidor, P. Guertin, A. Bergeron and K. M. Swenson, "Aligning the unalignable: bacteriophage whole genome alignments," BMC bioinformatics, vol. 17, no. 1, 2016.
[6] W. Zhang, Y. Niu, H. Zou, L. Luo, Q. Liu and W. Wu, "Accurate prediction of immunogenic T-cell epitopes from epitope sequences using the genetic algorithm-based ensemble learning," PloS one, vol. 10, no. 5, p. e0128194, 2015.
[7] N. Han, Y. Qiang and W. Zhang, "ANItools web: a web tool for fast genome," Database, vol. 2016, 2016.
[8] J. Linthorst, M. Hulsman, H. Holstege and M. Reinders, "Scalable multi whole-genome alignment using recursive exact matching," bioRxiv, p. 011715, 2015.
[9] Y. Chen, W. Ye, Y. Zhang and Y. Xu, "High speed BLASTN: an accelerated MegaBLAST," Nucleic acids research, vol. 43, no. 16, pp. 7762-7768, 2015.
[10] A. Morgulis, G. Coulouris, Y. Raytselis, T. L. Madden, R. Agarwala and A. A. Schäffer, "Database indexing for production MegaBLAST searches," Bioinformatics, vol. 24, no. 16, pp. 1757-1764, 2008.
[11] S. A. Manavski and G. Valle, "CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment," BMC bioinformatics, vol. 9, no. 2, p. 1, 2008.
[12] S. Warris, F. Yalcin, K. J. L. Jackson and J. P. Nap, "Flexible, fast and accurate sequence alignment profiling on GPGPU with PaSWAS," PloS one, vol. 10, no. 4, p. e0122524, 2015.
[13] Q. Zou, X. B. Li, W. R. Jiang, Z. Y. Lin, G. L. Li and K. Chen, "Survey of MapReduce frame operation in bioinformatics," Briefings in bioinformatics, vol. 15, no. 4, p. bbs088, 2013.
[14] M. C. Schatz, "CloudBurst: highly sensitive read mapping with MapReduce," Bioinformatics, vol. 25, no. 11, pp. 1363-1369, 2009.
[15] T. Nguyen, W. Shi and D. Ruden, "CloudAligner: A fast and full-featured MapReduce based tool for sequence mapping," BMC research notes, vol. 4, no. 1, p. 1, 2011.
[16] B. Langmead, M. C. Schatz, J. Lin, M. Pop and S. L. Salzberg, "Searching for SNPs with cloud computing," Genome biology, vol. 10, no. 11, p. 1, 2009.
[17] N. A. Fonseca, J. Rung, A. Brazma and J. C. Marioni, "Tools for mapping high-throughput sequencing data," Bioinformatics, vol. 28, no. 24, pp. 3169-3177, 2012.
[18] H. Li and R. Durbin, "A Highly Parallel Next-Generation DNA Sequencing," in IEEE International Conference on Bioinformatics and Biomedicine (BTBM), 2015.
[19] H. Li, "Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM," arXiv preprint arXiv:1303.3997, 2013.
[20] L. Pireddu, S. Leo and G. Zanetti, "SEAL: a distributed short read mapping and duplicate removal tool," Bioinformatics, vol. 27, no. 15, pp. 2159-2160, 2011.
[21] H. Li and R. Durbin, "Fast and accurate short read alignment with Burrows-Wheeler transform," Bioinformatics, vol. 25, no. 14, pp. 1754-1760, 2009.
[22] O. Torreno and O. Trelles, "Breaking the computational barriers of pairwise genome comparison," BMC bioinformatics, vol. 16, no. 1, p. 1, 2015.
[23] W. Zhang, P. Du, H. Zheng, W. Yu, L. Wan and C. Chen, "Whole-genome sequence comparison as a method for improving bacterial species definition," Applied Microbiology, Molecular and Cellular Biosciences Research Foundation, vol. 60, no. 2, pp. 75-78, 2014.
