Bioinformatics: Nadiya Akmal Binti Baharum (PHD)

This document provides information about the BSM 4301 Bioinformatics course at UPM, including:
- The course outcomes: choosing suitable bioinformatics tools for DNA/protein analysis, applying skills to analyze sequences online, and describing bioinformatics applications.
- The course is 3 credit hours and is taught by two instructors.
- Assessments include tests, lab and project assessments evaluating different professional and learning outcomes.
- References include two textbooks on bioinformatics fundamentals and databases.

Uploaded by

Nur Razinah


BSM 4301

BIOINFORMATICS

Nadiya Akmal binti Baharum (PhD)

010-5754311
COURSE OUTCOMES (CO)
• To choose databases and bioinformatics software suitable for the analysis of DNA and protein sequences (C4)

• To apply appropriate bioinformatics skills/techniques for analyzing DNA and protein sequences online (P4, LS)

• To describe bioinformatics applications in daily life (A4, LL, TS)

COURSE INFORMATION
• Credit hour: 3 (2+1)
• Instructors:

1. Assoc. Prof. Dr. Adam Leow Thean Chor (Coordinator)

2. Dr. Nadiya Akmal Baharum
COURSE OUTLINE
ASSESSMENTS
PO1 (Knowledge): Test 1 (15%), Test 2 (15%), Final (30%)
PO2 (Practical skills/Psychomotor skills): Lab assessment (10%)
PO5 (Social skills and responsibilities): Reflective journal from interview with NSG-related industry/research institutes (10%) (TBC)
PO7 (Information management and lifelong learning skills): Leader’s lab report (10%)
PO9 (Leadership): Peer assessment (5%)
PO10 (Numerical skills): Group assignment (5%)

TOTAL: 100%
REFERENCES & TEXTBOOKS

• Pevsner, J. (2015). Bioinformatics and Functional Genomics, 3rd Edition. Wiley Blackwell Inc.

• Choudhuri, S. (2014). Bioinformatics for Beginners: Genes, Genomes, Molecular Evolution, Databases and Analytical Tools, 1st Edition. Oxford: Academic Press.

• Bioinformatics websites.
Introduction to Bioinformatics
Lecture 01-A
• What is bioinformatics?
• Bioinformatics: The BIG Picture
• Aims of bioinformatics
Learning Outcomes (LO):

By the end of this lecture, students should be able to:

1. Define bioinformatics as a field of science.

2. Summarise the three perspectives of bioinformatics.
3. Explain the final aims of bioinformatics to complement the study of biological sciences.
What is bioinformatics?
THREE WORDS TO DESCRIBE BIOINFORMATICS

www.menti.com
Code: 1163 7806
https://www.menti.com/al2ac9orq9fo
A. What is bioinformatics?
A field of study that uses computation to extract knowledge from biological data.

• It includes the collection, storage, retrieval, manipulation and modelling of data for analysis, visualization or prediction through the development of algorithms and software.
A. What is Bioinformatics?
• The management of information systems (databases) built from experimentally acquired molecular biology data, to support many practical applications.

[Diagram: computer databases (software) and experiments are integrated to fill the gaps and answer biological questions.]

http://www.youtube.com/watch?v=dJrpSvsFXFI
A. What is Bioinformatics?
Another definition, adopted by Luscombe et al.:

• A union of biology and informatics: bioinformatics involves the technology that uses computers for storage, retrieval, manipulation, and distribution of information related to biological macromolecules such as DNA, RNA, and proteins.
BIOINFORMATICS
http://www.youtube.com/watch?v=42DJPDb-hRU
Scopes of Bioinformatics
1. Development of computational tools and databases.
2. Application of these tools and databases in generating biological knowledge to better understand living systems.
Why use bioinformatics?
• Pre-bioinformatics: in vivo, in vitro; post-bioinformatics: in silico
• Systematic organization of huge amounts of data
• Collects and integrates data to make it accessible and usable
• Effective utilization of all data
• Faster analysis through prediction and simulation
• Shorter run times, since analyses can run simultaneously (automation)
• Drawbacks:
  • The quality of bioinformatics predictions depends on the quality of the data and the sophistication of the tools being used.
  • Bioinformatics and experimental biology are independent but complementary activities.
Current status of genomics data
1st genome: 1995 (H. influenzae, 1.83 million bases)
Human genome: >3 billion bases (Gb), ~20,500 coding genes
Rice genome: 400-430 million bases (Mb), ~38,000 coding genes
Chicken genome: ~1.21 billion bases, ~20,000 coding genes
Bovine genome: ~2.7 billion bases, ~21,880 coding genes

[Figures: Growth of reference sequences; Growth of GenBank]

C. Bioinformatics: The BIG Picture

The cell: DNA, RNA, protein. Central dogma of molecular biology.
The organism: Genome-wide analysis of RNA and protein. Changes in an organism across different developmental stages, and across different regions of the body (multicellular organisms).
The tree of life: Genome analysis. The 3 major branches: bacteria, archaea and eukaryotes.
• Central dogma of molecular biology: DNA → RNA → protein → cellular phenotype
• Bioinformatics: the complete collection and utilisation of DNA (genome), RNA (transcriptome), and protein (proteome) sequences to elucidate gene and protein function.
• Application of computer algorithms and computer databases to molecular and cell biology.

• Broadening the perspective from the cell level to the organism-level phenotype.

• Gene and protein expression changes throughout different developmental stages or in different regions of an organism, in response to intrinsic or environmental signals.
• Collections of gene/protein products (e.g. DNA microarrays) are used to explain changes through developmental time, changes across body regions, and changes in a variety of physiological or pathological states.

• DNA sequence data in bioinformatics databases have accumulated for over 150,000 different organisms.
• Complete genome sequences help categorize organisms into the three major branches of the Tree of Life: bacteria, archaea, and eukaryotes.
• Fundamental unity of life and comparative genomics: learn how chromosomes evolved through duplications, deletions, and rearrangements.
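
The central dogma mentioned above (DNA → RNA → protein) can be illustrated with a short Python sketch. It assumes the input is the coding (sense) strand, read 5'→3' starting at its first base, and it uses the standard genetic code. The function names `transcribe` and `translate` are illustrative only and do not come from the lecture or from any library.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].upper().replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna)             # AUGGCCUAA
print(translate(mrna))  # MA
```

Real analyses would normally rely on an established toolkit rather than a hand-written table; the sketch only makes the DNA → RNA → protein mapping concrete.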
Food for thought...
The cow genome comprises a sequence of 2.86 billion letters (2,860,000,000 bases), enough to fill millions of pages of a normal book.

How can you:
• detect anomalies (at the gene level) between cows, or between cows possessing defective phenotypes?
• make sense of the letters?
• tell what gene corresponds to a particular protein?
• tell what sequence corresponds to a specific gene?
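
One very simple way to start on the question of detecting anomalies is to compare two sequences position by position. The Python sketch below assumes the two sequences are already aligned and of equal length (a real comparison would first use an alignment tool such as BLAST). The sequences and the function name `point_differences` are hypothetical.

```python
def point_differences(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """List (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# Hypothetical fragments from a healthy and an affected animal.
healthy = "ATGGCCATTGTA"
affected = "ATGGCCGTTGTA"
print(point_differences(healthy, affected))  # [(7, 'A', 'G')]
```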
D. Aims of bioinformatics
1. To store biological data, organized in databases.
2. To develop tools and resources to aid in the analysis of data.
3. To analyze and interpret accumulated data in a biologically meaningful manner.
1. Store the biological data organized in a database
• A database is used to store and organize data.
• It allows easy access to existing data and submission of new entries.

• Data are annotated to assign their functional characteristics.

• Prevents redundancy and multiplicity of similar data.
• Identifies genes and proteins by sequence and structure similarity:
  • Orthologs: genes in different species that evolved from a common ancestral gene; they usually retain the same function.
  • Paralogs: genes arising from duplication within a genome; they evolve into distinct proteins, usually with related functions (though sometimes not).
  • Analogs: proteins with different sequences but similar structures.

• Databases must be able to correlate different hierarchies of information, e.g.:
  • GenBank: gene and protein sequences
  • Protein Data Bank: 3D macromolecular structures
2. Develop tools and resources to aid in analysis of data.

Homology searching (BLAST)

Sequence alignment (ClustalW)
Primer design (Primer3)
Phylogenetic tree (MEGA 5.0)
RNA structure modeling (mfold)
Protein structure modeling (PSIPRED, SWISS-MODEL)
Signal peptide prediction (SignalP)
Physicochemical properties (ProtParam)
Transmembrane prediction (TMHMM)
Promoter prediction (Neural Network Promoter Prediction)
Many more...
3. Analyze and interpret the accumulated data in a biologically meaningful manner

Structure analysis
• Nucleic acid structure prediction
• Protein structure prediction
• Protein structure classification
• Protein structure comparison

Sequence analysis
• Genome comparison
• Phylogeny
• Gene & promoter prediction
• Motif discovery
• Sequence database searching
• Sequence alignment

Function analysis
• Metabolic pathway modeling
• Gene expression profiling
• Protein interaction prediction
• Protein subcellular localization prediction
[Diagram: General flow of bioinformatic information: Store, Access, Manipulate, Analyze]
E. Application of bioinformatics

Rational drug design

Medical therapy
Forensic DNA analysis
Agricultural biotechnology
Rational drug design
How bioinformatics helps to develop drugs/inhibitors that bind preferentially to specific proteins.
Medical therapy
Genome sequences are used to detect potentially harmful mutations for early diagnosis and effective treatment.

The right medicine can be tailored to the right patient based on biomarker-based diagnosis.
Forensic DNA analysis
DNA sequencing for legal and investigative purposes.
Molecular phylogenetic analysis as evidence in criminal courts.

Yang et al. (2014). Genomics Proteomics Bioinformatics, 12:190-197.

Agricultural biotechnology
Development of new crop varieties with higher productivity

The deployment of genomic selection breeding will help in achieving higher genetic gains in less time.
Pandey et al. (2016). Front. Plant Sci. 7:455.
Access to Sequence Data and Literature Information
Lecture 01-B
• Publicly accessible databases
• Database operators
• Access to information
• Access to biomedical literature
Learning Outcomes (LO)

• To identify the types of data stored in biological databases.

• To analyze the main features of NCBI and other biological databases.

• To access sequence data information in biological databases, and biomedical literature.
[Diagram: General flow of bioinformatic information: Store, Access, Manipulate, Analyze]
WEBSITES FOR BIOINFORMATIC-BASED REPOSITORIES

• GenBank - National Center for Biotechnology Information (NCBI)

• The European Molecular Biology Laboratory database (EMBL) - European Bioinformatics Institute (EBI)

• DNA Database of Japan (DDBJ) - National Institute of Genetics (NIG)

∴ Collectively known as the International Nucleotide Sequence Database (INSD)
Types of available data
• DNA/RNA sequences
• Protein sequence and structure
• Protein function database
• Organism-specific databases
• Molecular pathway
• Scientific literature
• Genomes
The most sequenced organisms in GenBank (August 2018)

Note: Bacteria, archaea and viruses are absent from the list because of their relatively small genomes.
Types of data in GenBank
• A part of a large fragment of DNA:
  - bacterial artificial chromosomes (BACs)
  - yeast artificial chromosomes (YACs)
• A gene:
  - Prokaryote: non-coding and coding regions
  - Eukaryote: regulatory regions, protein-coding exons and introns
• cDNA databases
  - RNA converted to more stable cDNA.
• Expressed Sequence Tags (ESTs)
  - Partial DNA sequence of a cDNA clone.
  - Assume ~22,000 human genes ➠ ~300 ESTs to each gene.

[Diagram: DNA → RNA → Protein, with labels: genomic DNA databases, cDNA, ESTs, UniGene, *non-redundant (NR)]
National Center for Biotechnology Information
(NCBI)

https://www.ncbi.nlm.nih.gov/
NCBI key features: ① PubMed
• National Library of Medicine’s search service.
• >20 million citations in MEDLINE (Medical Literature Analysis and Retrieval System Online).
• Links to participating online journals.

https://pubmed.ncbi.nlm.nih.gov
NCBI key features: ② Entrez
Also known as the Entrez Global Query Cross-Database Search System.

A search and retrieval system that integrates:

• Scientific literature;
• DNA and protein sequence databases;
• 3D protein structure data;
• Population study datasets; and
• Assemblies of complete genomes.
NCBI key features: ③ BLAST

• Basic Local Alignment Search Tool.

• NCBI’s sequence similarity search tool, for the analysis of DNA and protein databases.
• Handles approximately 200,000-300,000 searches daily.
• Comprises several programs: nucleotide BLAST (blastn), protein BLAST (blastp), blastx, tblastx and tblastn.
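
The BLAST programs listed above differ in the type of query and database they compare. A minimal Python sketch of that choice is shown below. The lookup table reflects the standard behaviour of the five programs, while the helper name `choose_blast_program` and its parameters are hypothetical.

```python
# (query type, database type) -> BLAST program
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",   # query is translated in six frames
    ("protein", "nucleotide"): "tblastn",  # database is translated in six frames
}


def choose_blast_program(query_type: str, database_type: str,
                         translate_both: bool = False) -> str:
    """Pick a BLAST program from the query and database sequence types."""
    if translate_both:
        # tblastx translates both a nucleotide query and a nucleotide database.
        if (query_type, database_type) != ("nucleotide", "nucleotide"):
            raise ValueError("tblastx needs a nucleotide query and database")
        return "tblastx"
    return BLAST_PROGRAMS[(query_type, database_type)]


print(choose_blast_program("nucleotide", "protein"))  # blastx
print(choose_blast_program("protein", "nucleotide"))  # tblastn
```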
NCBI key features: ④ OMIM

• Online Mendelian Inheritance in Man.

• A catalog of human genes and genetic disorders, covering known diseases linked to human genes.
• Comprehensive characterization of entries: mode of inheritance (autosomal dominant, autosomal recessive, X-linked), phenotype, etc.
How do you start looking?
You begin with a query search:
• Name of a specific sequence, or;
• Information from literature (accession number):
  DNA  X02775      GenBank genomic DNA sequence
       NT_030059   Genomic contig
       rs7079946   dbSNP (single nucleotide polymorphism)
  RNA  N91759.1    An expressed sequence tag (1 of 170)

• DNA and protein sequences are tagged with accession numbers.

• Example from the literature: AY260764.3 - T1 lipase, 3rd version.

• Accession number: a string of 4-12 digits and/or letters associated with a molecular sequence record. Like a barcode!

• It can tell you whether an entry contains nucleotide or protein data.

• A single molecule can be associated with many accession numbers, e.g. ESTs and DNA fragments matching that particular molecule.

• Accession numbers have different formats in different databases.
• NCBI also assigns GI numbers: unique sequence identification numbers for a sequence within a record.
  E.g. NM_000518.4 = human β-globin DNA sequence, GI:28302128. The suffix [4] refers to the version number; NM_000518.3 has a different GI: 13788565.

• Many sequence entries may contain errors or discrepancies, e.g. those found when comparing mRNA and genomic data.

• So how do you assess the quality of a sequence or entry?
[Figure: a sequence file in GenBank/GenPept format, divided into three labeled sections: HEADER, FEATURE, SEQUENCE]
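
The three-part layout of a GenBank flat file can be sketched in Python. The example assumes the usual flat-file keywords: the header runs from `LOCUS` up to `FEATURES`, the feature table runs up to `ORIGIN`, and the sequence ends at the `//` marker. The record below is a made-up, heavily shortened example, and the function names are hypothetical.

```python
# A made-up, minimal record used only for illustration.
RECORD = "\n".join([
    "LOCUS       EXAMPLE1                  12 bp    DNA     linear   UNA 01-JAN-2000",
    "DEFINITION  Hypothetical example record.",
    "ACCESSION   EXAMPLE1",
    "FEATURES             Location/Qualifiers",
    "     source          1..12",
    "ORIGIN",
    "        1 atggccattg ta",
    "//",
])


def split_genbank_record(text: str) -> dict[str, list[str]]:
    """Split one flat-file record into header, feature and sequence lines."""
    sections = {"header": [], "features": [], "sequence": []}
    current = "header"
    for line in text.splitlines():
        if line.startswith("FEATURES"):
            current = "features"
        elif line.startswith("ORIGIN"):
            current = "sequence"
        elif line.startswith("//"):  # end-of-record marker
            break
        sections[current].append(line)
    return sections


def sequence_from_lines(lines: list[str]) -> str:
    """Join the lines after ORIGIN, dropping position numbers and spaces."""
    return "".join(ch for line in lines[1:] for ch in line if ch.isalpha())


parts = split_genbank_record(RECORD)
print(len(parts["header"]), len(parts["features"]))  # 3 2
print(sequence_from_lines(parts["sequence"]))        # atggccattgta
```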
The Reference Sequence (RefSeq) Project
(Accessible through the NCBI main page)

• Goal: to provide the best representative sequence for each normal (non-mutated) transcript produced by a gene, and for each normal protein product.
• There may be one RefSeq entry per gene or gene product, or several RefSeq entries (for splice variants or distinct loci).
• RefSeq best representative sequences: provide an expertly curated accession number that corresponds to the most stable, agreed upon “reference” version of a sequence.

• Formats:
  Complete genome      NC_######
  Complete chromosome  NC_######
  Genomic contig       NT_######
  mRNA (DNA format)    NM_######
  Protein              NP_######
NCBI’s RefSeq project: many accession number formats for genomic, mRNA and protein sequences

Accession        Molecule  Method           Note
AC_123456        Genomic   Mixed            Alternate complete genomic
AP_123456        Protein   Mixed            Protein products; alternate
NC_123456        Genomic   Mixed            Complete genomic molecules
NG_123456        Genomic   Mixed            Incomplete genomic regions
NM_123456        mRNA      Mixed            Transcript products; mRNA
NM_123456789     mRNA      Mixed            Transcript products; 9-digit
NP_123456        Protein   Mixed            Protein products
NP_123456789     Protein   Curation         Protein products; 9-digit
NR_123456        RNA       Mixed            Non-coding transcripts
NT_123456        Genomic   Automated        Genomic assemblies
NW_123456        Genomic   Automated        Genomic assemblies
NZ_ABCD12345678  Genomic   Automated        Whole genome shotgun data
XM_123456        mRNA      Automated        Transcript products
XP_123456        Protein   Automated        Protein products
XR_123456        RNA       Automated        Transcript products
YP_123456        Protein   Auto. & Curated  Protein products
ZP_12345678      Protein   Automated        Protein products
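
The accession conventions above (a versioned identifier such as NM_000518.4, and RefSeq prefixes such as NM_ or NP_) can be sketched in Python. The prefix table simply restates the Molecule column of the RefSeq table; the function names `parse_accession` and `refseq_molecule_type` are hypothetical.

```python
from typing import Optional

# RefSeq prefix -> molecule type, restating the table above.
REFSEQ_PREFIXES = {
    "AC": "Genomic", "AP": "Protein", "NC": "Genomic", "NG": "Genomic",
    "NM": "mRNA", "NP": "Protein", "NR": "RNA", "NT": "Genomic",
    "NW": "Genomic", "NZ": "Genomic", "XM": "mRNA", "XP": "Protein",
    "XR": "RNA", "YP": "Protein", "ZP": "Protein",
}


def parse_accession(identifier: str) -> tuple[str, Optional[int]]:
    """Split 'ACCESSION.VERSION' into (accession, version or None)."""
    accession, dot, version = identifier.partition(".")
    return accession, (int(version) if dot else None)


def refseq_molecule_type(identifier: str) -> Optional[str]:
    """Return the molecule type for a RefSeq accession, else None."""
    accession, _ = parse_accession(identifier)
    prefix, underscore, _ = accession.partition("_")
    if not underscore:  # no underscore, so not a RefSeq-style accession
        return None
    return REFSEQ_PREFIXES.get(prefix.upper())


print(parse_accession("NM_000518.4"))       # ('NM_000518', 4)
print(refseq_molecule_type("NM_000518.4"))  # mRNA
print(refseq_molecule_type("AY260764.3"))   # None
```

A GenBank accession such as AY260764.3 has no underscore, so the sketch reports it as not being a RefSeq entry.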
UniGene: an NCBI organized resource to describe where genes are expressed (i.e. from which library) and how abundantly

[Diagram: DNA → RNA → protein; RNA is represented as complementary DNA (cDNA); in UniGene, a cluster of sequences corresponds to one gene]
HomoloGene: an excellent NCBI resource that groups homologous eukaryotic genes

https://www.ncbi.nlm.nih.gov/homologene/?term=68066
Access to Biomedical Literature

• PubMed - NCBI gateway to MEDLINE.

• MEDLINE contains bibliographic citations and author abstracts from over 4,600 journals published in the US and 70 other countries.
• Has >20 million records dating back to the 1950s.
PubMed search strategies
• Use Boolean queries (capitalize AND, OR, NOT), e.g. lipocalin AND disease

• Try using limits (see Advanced search)

• There are links to find Entrez entries and external resources

Operation   Query                    Results
1 AND 2     lipocalin AND disease    504
1 OR 2      lipocalin OR disease     2,500,000
1 NOT 2     lipocalin NOT disease    2,370
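
The three Boolean operators behave like set operations on the matching records. The Python sketch below shows this with toy record IDs (hypothetical, not real PubMed IDs) and then builds a search URL. The URL targets NCBI's E-utilities `esearch` endpoint with its `db` and `term` parameters, which is an assumption of this sketch rather than something covered in the slides. The sketch only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumed endpoint: NCBI E-utilities ESearch (not part of the original slides).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def boolean_query(term1: str, operator: str, term2: str) -> str:
    """Combine two terms with a capitalized Boolean operator."""
    operator = operator.upper()
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError("operator must be AND, OR or NOT")
    return f"{term1} {operator} {term2}"


def esearch_url(query: str, db: str = "pubmed") -> str:
    """Build (but do not send) an ESearch URL for the query."""
    return f"{ESEARCH}?{urlencode({'db': db, 'term': query})}"


# Toy record IDs (hypothetical) matching each search term.
lipocalin = {1, 2, 3}
disease = {3, 4, 5, 6}
print(sorted(lipocalin & disease))  # AND -> [3]
print(sorted(lipocalin | disease))  # OR  -> [1, 2, 3, 4, 5, 6]
print(sorted(lipocalin - disease))  # NOT -> [1, 2]

print(esearch_url(boolean_query("lipocalin", "and", "disease")))
# https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=lipocalin+AND+disease
```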

