Introduction

The document provides an overview of bioinformatics, including the basics of genetics, sequence data analysis, and advancements in sequencing methods such as Sanger and Next Generation Sequencing. It discusses the challenges and innovations in bioinformatics, including software development and the exponential increase in DNA and protein sequences. Additionally, it highlights various applications and databases related to DNA and protein sequences, emphasizing the importance of sequence comparison and identification of genetic variations.

Uploaded by

ssalah

Basics Bioinformatics

Sadaf Ambreen
Email: [email protected]
PhD Bioinformatics (Genetics)
Beijing Institute of Genomics, UCAS, Beijing, China
IPFP Fellow Bioinformatics, University of Okara
Visiting Researcher, University of Okara, Pakistan
Introduction
Levels of genetics.

A genome is an organism's complete set of genetic instructions.
Levels of biological organization.
Sequence Data Analysis
• It is the process of studying a DNA, RNA, or peptide sequence by analytical methods to understand its features:
    • Function
    • Structure
    • Evolution
• Sequencing and sequence assembly
• Alignment and database searching
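
As a small illustration of studying a sequence by analytical methods, the sketch below computes three basic features of a DNA sequence: base counts, GC content, and the reverse complement. The function names and the example sequence are made up for illustration and are not taken from the slides.

```python
from collections import Counter

# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def base_counts(seq: str) -> dict:
    """Count how often each symbol occurs in the sequence."""
    return dict(Counter(seq.upper()))


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse to read the opposite strand."""
    return seq.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    example = "ATGCGC"  # hypothetical sequence, for illustration only
    print(base_counts(example))
    print(gc_content(example))
    print(reverse_complement(example))
```

Features such as function, structure, and evolution require far richer analyses; simple statistics like these are usually only a first look at the data.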
Advancement of sequencing methods
• Sanger method
    • Optimized in the early 1990s
    • Used for sequencing and assembling most of the early genomes of model organisms, including human
• NGS (Next Generation Sequencing)
    • Developed from 2004-2005
    • Short reads
    • High throughput, low cost, high computational complexity
• Single-molecule sequencing
    • Developed more recently (2007-2018)
    • No PCR requirement, but not perfect
• Single-cell RNA sequencing
Bioinformatics innovations and challenges
• Software and algorithm development (BLAST, FASTA, machine learning, etc.)
• Advances in electronic hardware (PCs, expanded storage capacity, etc.)
• Internet (mail servers, FTP, WWW)
Challenges
• DNA and protein sequences: exponential increase
• Genome sequencing: need for annotation
• Role in drug discovery
Applications
• Comparison of sequences to find similarity (homology).
• Identification of intrinsic features of a sequence: post-translational modification sites, gene structures, and the distribution of introns and exons.
• Identification of sequence differences and variations such as point mutations and SNPs.
• Study of the evolution and genetic diversity of sequences and organisms.
• Prediction of molecular structure from sequence alone.
• Study of genetic diseases.
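
The first and third applications (sequence comparison and finding point differences) can be sketched in a few lines. This is a minimal example that assumes the two sequences are already aligned and of equal length; real tools such as BLAST perform the alignment step first. The function names and example sequences are hypothetical.

```python
def point_differences(ref: str, alt: str) -> list:
    """List (1-based position, ref base, alt base) wherever two aligned sequences differ."""
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(ref.upper(), alt.upper()))
        if a != b
    ]


def percent_identity(ref: str, alt: str) -> float:
    """Percentage of aligned positions that carry the same symbol."""
    if not ref:
        raise ValueError("sequences must not be empty")
    mismatches = len(point_differences(ref, alt))
    return 100.0 * (len(ref) - mismatches) / len(ref)


if __name__ == "__main__":
    reference = "ACGTACGT"  # made-up sequences, for illustration only
    sample = "ACGTTCGT"
    print(point_differences(reference, sample))  # [(5, 'A', 'T')]
    print(percent_identity(reference, sample))   # 87.5
```

A high percent identity is evidence of similarity; whether two sequences are truly homologous (share ancestry) is an inference drawn from that evidence, not something the score proves by itself.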

DNA (Nucleotide sequences) databases
These are large databases. GenBank, EMBL, and DDBJ exchange data routinely, so searching any one of them should produce similar results.

-GenBank (NCBI): http://www.ncbi.nlm.nih.gov
-DDBJ (DNA DataBase of Japan): http://www.ddbj.nig.ac.jp
-TIGR: http://tigr.org/tdb/tgi
-Yeast: http://yeastgenome.org
-E. coli: http://colibase.bham.ac.uk/blast/
-NGDC (National Genomics Data Center, Beijing, China): https://bigd.big.ac.cn/
-EBI: European Bioinformatics Institute
-EMBL: European Molecular Biology Laboratory
-HSSP: Homology-Derived Structures of Proteins
Protein (Amino acid) databases
These are large databases too:
-Swiss-Prot (very high level of annotation): http://au.expasy.org/
-PIR (Protein Information Resource), the world's most comprehensive catalog of information on proteins: http://www.pir.uniprot.org/

Translated databases:
-TrEMBL (translated EMBL): includes entries that have not yet been annotated into Swiss-Prot: http://www.ebi.ac.uk/trembl/access.html
-GenPept (translation of coding regions in GenBank)
-PDB (sequences derived from the 3D structures in the Brookhaven PDB): http://www.rcsb.org/pdb/
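
Translated databases such as TrEMBL and GenPept are built by translating nucleotide coding regions into protein sequences. The sketch below shows the idea with the standard genetic code. It is a simplified illustration, not the pipeline those databases actually use: it assumes the input starts in frame, reads a single forward frame, and stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence up to the first stop codon."""
    cds = cds.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    usable_length = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # X = unknown codon
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))  # MA
```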
Finding homologs
Five main BLAST algorithms
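
The slides name five main BLAST algorithms without listing them. The sketch below assumes they are the five classic NCBI BLAST programs (blastn, blastp, blastx, tblastn, tblastx) and records what each one compares. The helper `candidate_programs` is a hypothetical convenience function, not part of any BLAST distribution.

```python
# program name -> (query type, database type, what is compared)
BLAST_PROGRAMS = {
    "blastn": ("nucleotide", "nucleotide",
               "nucleotide query against nucleotide database"),
    "blastp": ("protein", "protein",
               "protein query against protein database"),
    "blastx": ("nucleotide", "protein",
               "translated nucleotide query against protein database"),
    "tblastn": ("protein", "nucleotide",
                "protein query against translated nucleotide database"),
    "tblastx": ("nucleotide", "nucleotide",
                "translated nucleotide query against translated nucleotide database"),
}


def candidate_programs(query_type: str, database_type: str) -> list:
    """Return the BLAST programs that accept this query/database combination."""
    return [
        name
        for name, (query, database, _) in BLAST_PROGRAMS.items()
        if query == query_type and database == database_type
    ]


if __name__ == "__main__":
    print(candidate_programs("nucleotide", "protein"))     # ['blastx']
    print(candidate_programs("nucleotide", "nucleotide"))  # ['blastn', 'tblastx']
```

When both the query and the database are nucleotide sequences, blastn compares them directly, while tblastx compares their translations, which can detect more distant protein-coding homologs at a higher computational cost.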