Fondazione Pisana per la Scienza, January 28, 2019.

Bioinformatics
Paolo Aretini
Senior Researcher
FPS Bioinformatic Lab
[email protected]
Wifi: FPS-corporate
Password: FPScorporate
Bioinformatics

Bioinformatics is an applied science that uses computer programs to access molecular biology databanks to make inferences about the information contained within the data archives.
Bioinformatics

• This lab introduces you to some of the main databanks, their applications, and programs.
• You will learn how to retrieve information from the databases, and analyze the information to obtain useful knowledge about a DNA sequence and its protein product.
Why it’s useful
• All of the information needed to build an organism is
contained in its DNA. If we could understand it, we
would know how life works.
– Preventing and curing diseases like cancer (which is
caused by mutations in DNA) and inherited diseases.
– Curing infectious diseases (everything from AIDS and
malaria to the common cold). If we understand how a
microorganism works, we can figure out how to block it.
– Understanding genetic and evolutionary relationships
between species
– Understanding genetic relationships between humans.
What are we looking for?
Data & databases
Biologists Collect Lots of Data
• Hundreds of thousands of species to explore; Millions of written articles
in scientific journals; Detailed genetic information:
• gene names
• phenotype of mutants
• location of genes/mutations on chromosomes
• linkage (distances between genes)

High Throughput lab technologies

• PCR
• Next Generation Sequencing (Illumina, Ion Torrent)
• Microarrays (Affymetrix)
• Genome-wide SNP chips / SNP arrays (Illumina)
Main databases by category
Literature
• PubMed: scientific & medical abstracts/citations
Health
• OMIM: Online Mendelian Inheritance in Man
Nucleotide Sequences
• Nucleotide: DNA and RNA sequences
Genomes
• Genome: genome sequencing projects by organism
• dbSNP: short genetic variations
• Ensembl: genome browser for vertebrate genomes
Genes & Proteins
• Protein: protein sequences
• UniProt: protein sequences and related information
• Several mutation databases (COSMIC; TP53 Database; BRCA1 & BRCA2 databases)
Pathways
• BioSystems: molecular pathways with links to genes, proteins
• KEGG Pathway: information on main biological pathways
DATABASES
Primary databases

REAL EXPERIMENTAL DATA (raw)

Biomolecular sequences or structures and associated annotation
information (organism, function, mutation linked to disease,
functional/structural patterns, bibliographic etc.)

Secondary databases

DERIVED INFORMATION (analyzed and annotated)

Fruits of analyses of primary data in the primary sources (patterns,
blocks, profiles etc. which represent the most conserved features
of multiple alignments)
GENBANK DATABASE

• Contains publicly available DNA sequences and their protein translations, as described in the scientific literature or collected in publicly funded research
• One can search by protein name to get DNA/mRNA sequences
• The search results can be filtered by species and other parameters
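
Besides the web search form, a record can be requested by accession through NCBI's E-utilities. The sketch below only builds the request URL and makes no network call. The helper name `genbank_fasta_url` is hypothetical, and the accession is the yeast ACT1 record shown as the FASTA example in these slides.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities EFetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_fasta_url(accession: str) -> str:
    """Build an EFetch URL that returns a nucleotide record in FASTA format.

    Hypothetical helper for illustration; it does not contact NCBI.
    """
    params = {
        "db": "nuccore",     # the NCBI Nucleotide database
        "id": accession,     # accession with version, e.g. JQ288018.1
        "rettype": "fasta",  # ask for FASTA rather than the full flat file
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"


url = genbank_fasta_url("JQ288018.1")
print(url)
```

Opening the printed URL in a browser, or with any HTTP client, should return the sequence as plain FASTA text.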
DATABASES
Fasta format to store sequences
• The FASTA format is now universal for all databases and software that handle DNA and protein sequences
• Specifications:
• One header line that starts with > and ends with [return]
• The sequence follows on the lines after the header
• Example: Saccharomyces cerevisiae strain YC81 actin (ACT1) gene
• GenBank: JQ288018.1
• >gi|380876362|gb|JQ288018.1| Saccharomyces cerevisiae strain YC81 actin (ACT1) gene, partial cds
TGGCATCATACCTTCTACAACGAATTGAGAGTTGCCCCAGAAGAACACCCTGTTCTTTTGACTGAAG
CTCCAATGAACCCTAAATCAAACAGAGAAAAGATGACTCAAATTATGTTTGAAACTTTCAACGTTCC
AGCCTTCTACGTTTCCATCCAAGCCGTTTTGTCCTTGTACTCTTCCGGTAGAACTACTGGTATTGTTT
TGGATTCCGGTGATGGTGTTACTCACGTCGTTCCAATTTACGCTGGTTTCTCTCTACCTCACGCCATT
TTGAGAATCGATTTGGCCGGTAGAGATTTGACTGACTACTTGATGAAGATCTTGAGTGAACGTGGTT
ACTCTTTCTCCACCACTGCTGAAAGAGAAATTGTCCGTGACATCAAGGAAAAACTATGTTACGTCG
CCTTGGACTTCGAGCAAGAAATGCAAACCGCTGCTCAATCTTCTTCAATTGAAAAATCCTACGAAC
TTCCAGATGGTCAAGTCATCACTATTGGTAAC
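
A minimal sketch of how a program might read this format, assuming plain text input in which every record starts with a ">" header line. The function name `parse_fasta` is hypothetical, and the example uses only the first 24 bases of the ACT1 sequence above, split over two lines for illustration.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = (
    ">gi|380876362|gb|JQ288018.1| Saccharomyces cerevisiae strain YC81 "
    "actin (ACT1) gene, partial cds\n"
    "TGGCATCATACC\n"
    "TTCTACAACGAA\n"
)

records = parse_fasta(example)
header, sequence = records[0]
accession = header.split("|")[3]  # fourth pipe-separated field of this header style
print(accession, len(sequence))
```

The pipe-separated header layout is specific to this example; other databases format the header line differently, so the accession lookup is not general.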
BLASTN
BLAST
NCBI Databases contain more than just
DNA & protein sequences

NCBI main portal: http://www.ncbi.nlm.nih.gov/

OMIM Database

• Online Mendelian Inheritance in Man (OMIM)
• "information on all known mendelian disorders linked to over 12,000 genes"
• Started in the 1960s by Dr. Victor A. McKusick as a catalog of mendelian traits and disorders
• Linked disease data
• Links disease phenotypes and causative genes
• Used by physicians and geneticists
PUBMED

• PubMed is one of the best-known databases in the scientific community
• It indexes most biology-related literature from all related fields
• It has a very powerful mechanism for constructing search queries
• Many search fields
• Logical operators (AND, OR)
• Provides electronic links to most journals
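
As a sketch of how search fields and logical operators combine, the snippet below builds a PubMed query through NCBI's E-utilities ESearch service. It only constructs the URL and makes no network call. The helper name `pubmed_search_url` and the chosen search terms are illustrative assumptions.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_search_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL for a PubMed query (hypothetical helper)."""
    params = {
        "db": "pubmed",
        "term": term,       # query text, including field tags and operators
        "retmax": retmax,   # maximum number of record IDs to return
        "retmode": "json",
    }
    return f"{ESEARCH}?{urlencode(params)}"


# Two terms restricted to the title/abstract field, joined with AND.
query = '"Leigh syndrome"[Title/Abstract] AND ECHS1[Title/Abstract]'
url = pubmed_search_url(query)
print(url)
```

Replacing AND with OR widens the search to records that match either term.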
UNIPROT

The mission of UniProt is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information.
MUTATION DATABASES

• Databases of mutations causing Mendelian disease or cancer play a crucial role in research, diagnostics and genetic health care, and can play a role in life-and-death decisions.
The Human Gene Mutation Database

• The Human Gene Mutation Database (HGMD®) represents an attempt to collate all known (published) gene lesions responsible for human inherited disease
VarSome

• VarSome's mission is to bring together the global life sciences community and facilitate the exchange of information that will lead to new discoveries.
Human Protein Atlas
The Human Protein Atlas is a Swedish-based program
initiated in 2003 with the aim to map all the human proteins
in cells, tissues and organs using integration of various omics
technologies, including antibody-based imaging, mass
spectrometry-based proteomics, transcriptomics and systems
biology. All the data in the knowledge resource is open
access to allow scientists both in academia and industry to
freely access the data for exploration of the human
proteome.
Revolution of NGS technologies
Sanger Sequencing
Comparison of Technologies
Technology   Max output per run   Run time
Sanger       57 Kb                1 h
NGS          1,800 Gb             3.5 days
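
A short worked calculation makes the gap concrete. It uses only the two figures in the comparison above and assumes decimal units (1 Kb = 1,000 bases, 1 Gb = 1,000,000,000 bases).

```python
# Figures from the comparison: bases per run and run duration in hours.
KB, GB = 1e3, 1e9  # decimal units (assumption)

sanger_bases, sanger_hours = 57 * KB, 1.0
ngs_bases, ngs_hours = 1800 * GB, 3.5 * 24  # 3.5 days = 84 hours

# How many times more bases a single NGS run yields.
per_run_ratio = ngs_bases / sanger_bases

# The same comparison normalized by run time (bases per hour).
per_hour_ratio = (ngs_bases / ngs_hours) / (sanger_bases / sanger_hours)

print(f"per run:  {per_run_ratio:.2e}")
print(f"per hour: {per_hour_ratio:.2e}")
```

On these figures one NGS run yields roughly 30 million times more bases than one Sanger run, and several hundred thousand times more per hour of instrument time.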
Genome Sequencing Cost per Mb (30x)
Relative throughput of HTT
Next Generation Sequencing is emerging with a data-production potential that will eventually displace conventional high-throughput (HT) technologies in the coming years.

NGS: too many sequences to be handled on standard hardware
NGS Technologies
NGS sequencers

• Roche: 454 FLX+, 454 Junior
• Illumina: GAIIx, MiSeq, NextSeq, HiSeq
• Life Tech: SOLiD 5500, Ion Torrent, Ion Proton
• Helicos: Heliscope
• Pacific Biosciences: RS, PacBio Sequel
• Oxford Nanopore: MinION, PromethION, GridION
• Complete Genomics: Revolocity
Illumina Sequencers

Instrument        Max output   Max read number   Max read length
MiSeq             15 Gb        25 M              2x300 bp
NextSeq 500/550   120 Gb       400 M             2x150 bp
Source: www.illumina.com
Ion S5 and Ion S5XL
Illumina Sequencers

Instrument               Max output   Max read number   Max read length
HiSeq 2500*/3000/4000    1,000* Gb    4,000* M          2x125* bp
HiSeq X Ten / X Five     1,800 Gb     6,000 M           2x150 bp
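
The Illumina figures in this section are related by simple arithmetic: maximum output is approximately read number times read length, counting both reads of a pair. The sketch below checks this and estimates mean coverage of a human genome. It assumes that "read number" counts read pairs and that the human genome is about 3.1 Gb; both are assumptions, not values from the slides. The function names are hypothetical.

```python
def output_gb(read_pairs_millions: float, read_len_bp: int) -> float:
    """Total yield in gigabases for a paired-end run (2 reads per pair)."""
    bases = read_pairs_millions * 1e6 * read_len_bp * 2
    return bases / 1e9


def mean_coverage(yield_gb: float, genome_gb: float = 3.1) -> float:
    """Average depth if the whole yield maps evenly onto the genome.

    genome_gb defaults to an approximate human genome size (assumption).
    """
    return yield_gb / genome_gb


# (read pairs in millions, read length in bp), taken from the tables.
platforms = {
    "MiSeq": (25, 300),
    "NextSeq 500/550": (400, 150),
    "HiSeq 2500": (4000, 125),
    "HiSeq X Ten": (6000, 150),
}

for name, (pairs, length) in platforms.items():
    gb = output_gb(pairs, length)
    print(f"{name}: {gb:.0f} Gb, about {mean_coverage(gb):.0f}x human genome")
```

Under these assumptions the computed yields match the listed maximum outputs (15, 120, 1,000 and 1,800 Gb). Real coverage is lower than this idealized estimate, because some reads are duplicates, low quality or unmapped.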
NGS in Genomics
DATA ANALYSIS ISSUES
Storing and analyzing the huge amounts of data generated by
sequencing and other high-throughput technologies require
infrastructure providing high-performance computing and large-
scale storage resources.
Local Framework

CHALLENGES

• Laboratory-hosted servers require investment in informatics support for configuring and using software;
• Servers are expensive to set up and maintain;
• The equipment needs enough space and suitable conditions (a "server room").

Local Framework

ADVANTAGES

• Many computational resources available;
• Customization and testing of pipelines with newly developed in-house software;
• No data transfer;
• No ethical issues.
FPS BIOINFORMATICS

The laboratory was created to analyze and manage Next Generation Sequencing (NGS) data.

• It provides IT support to the institution (backup and data storage, software and device installation, database management);
• Bioinformatic and statistical analysis;
• NGS data analysis.

FPS BIOINFORMATICS

NGS technologies are used for many applications:

• rare variant discovery by whole genome resequencing or targeted sequencing (exome analysis);
• transcriptome profiling of cells, tissues or organisms;
• many more applications (alternative splicing, identification of epigenetic markers, ChIP-Seq).

NGS technologies in our lab: GeneStudio S5 (Thermo Fisher) and NextSeq 500 (Illumina)
IT Technologies
The Bioinformatics section is equipped for intensive computation and for short- and long-term data storage, thanks to a collaboration with the IT Center of the University of Pisa, which hosts the computing infrastructure.

• 5 Dell PowerEdge C8000 and FC630 servers, each with 32 CPU cores, 128 GB of RAM and 16 TB of storage.
• 70 TB storage system based on Dell EqualLogic and PowerVault.
• 1 Torrent Server with 8 CPU cores, 128 GB of RAM, 27 TB of storage and 2 NVIDIA® Tesla® GPUs.
Open source software
• We mainly use open source software running on Bio-Linux (Ubuntu Linux);
• Command line software;
• Software for primary analysis (mapping; variant calling; gene expression and differential gene expression);
• Software for data visualization (mapping data; gene expression data; mutation data);
• Pathway and network analysis.

RNA-seq Analysis

Execution Time for 1 sample

• 1-2 hours with 28 threads (depending on data quality and size)

DNA-seq pipeline
Execution time for Illumina NextSeq 500 data

• Genome and exome analysis with the described pipeline, implemented in SeqMule (http://seqmule.openbioinformatics.org/en/latest/)
• About 40 hours to run the pipeline for genome analysis (28 threads, Dell PowerEdge C8000)
• 6-15 hours to run the pipeline for exome analysis (28 threads, Dell PowerEdge C8000)
Critical Step

Characterize biological meaning of data:

• Variant annotation and filtering;

• Pathway and Network Analysis;

Time consuming!
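
To give a feel for the filtering step, here is a minimal sketch that keeps rare, protein-altering variants from an annotated list. It is not the lab's pipeline: the record fields, the thresholds and the placeholder gene names are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """One annotated variant (hypothetical, simplified record)."""
    gene: str
    consequence: str      # e.g. "missense", "synonymous", "stop_gained"
    population_af: float  # allele frequency in a reference population
    genotype: str         # "hom" or "het"


def filter_candidates(variants, max_af=0.01,
                      consequences=("missense", "stop_gained", "frameshift")):
    """Keep variants that are rare and predicted to alter the protein."""
    return [
        v for v in variants
        if v.population_af <= max_af and v.consequence in consequences
    ]


# Placeholder records, not real data.
variants = [
    Variant("GENE_A", "missense", 0.0001, "hom"),
    Variant("GENE_B", "synonymous", 0.0002, "het"),
    Variant("GENE_C", "missense", 0.25, "het"),
    Variant("GENE_D", "stop_gained", 0.001, "het"),
]

candidates = filter_candidates(variants)
print([v.gene for v in candidates])
```

In practice each surviving variant still has to be checked against inheritance in the family and against the literature, which is a large part of why this step takes so long.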
Critical Step: variant annotation and filtering
Critical Step: Pathway and Network Analysis
Genetic characterization of Leigh’s Disease case.

Leigh syndrome (LS, OMIM 256000) is a rare, heterogeneous, progressive neurodegenerative disorder that usually presents in infancy or early childhood. LS inheritance is complex, since patients may carry mutations in mitochondrial DNA (mtDNA) or in nuclear genes, which predominantly encode proteins involved in respiratory chain structure and assembly or in coenzyme Q10 biogenesis.

The proband is a 19-year-old male born to non-consanguineous parents of Caucasian origin, after a normal pregnancy at 40 weeks of gestation and with normal birth measurements. Both parents and the 18-year-old brother are healthy.

Exome analysis was performed on the affected individual (proband) and his relatives (mother, father and brother).
Genetic characterization of Leigh’s Disease case.

Identification of a rare homozygous missense mutation in the ECHS1 (short-chain enoyl-CoA hydratase) gene in a 19-year-old individual with Leigh disease.
Genetic characterization of Leigh’s Disease case.

Using CeQer (http://www.ngsbicocca.org/html/ceqer.html), a tool that detects copy number variation from exome data, we detected a deletion spanning an extensive region of chromosome 10 (from 135120573 to 135187238) that involves ZNF511, CALY, PRAP1, FUOM and ECHS1. This deletion is present in the proband, his mother and his brother, but not in the father.
